Supplementary Figure 1 - Nature

[Figure panels (a) and (b). y-axis: Required coverage (10E02); x-axis: Cancer cell fraction (%) in (a), Subclonal cell fraction (%) in (b). Curves: 1*/8, 1*/4, 1*/3, 1*/2, 1*/1. Vertical sample markers: M3, M2, M1, P3, P1.]

Supplementary Figure 1: Power calculations for samples of patient 8/82. (a) Required depth of sequencing coverage as a function of CCF (%) and CNA in the worst-case scenario of one mutated copy (1∗) to reach a statistical power of 95%, given a sequencing error rate e = 1.59E-02 and an FPR = 5E-07, to index a fully clonal SNV. The black vertical lines indicate the CCF of the samples as inferred from the respective SNP arrays. (b) Same as in (a) as a function of subclonal cell fraction (%).
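The kind of power calculation described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: read counts are modelled as binomial, the null hypothesis is that variant reads arise only from sequencing error at rate e, normal cells are diploid, and CCF is taken as the fraction of cancer cells in the sample. The function names, the depth step of 10 and the search ceiling are hypothetical choices.

```python
import math

def log_binom_pmf(k, n, p):
    """Log of the Binomial(n, p) probability mass at k."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed directly so small tails stay accurate."""
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    return min(1.0, sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1)))

def expected_vaf(ccf, total_cn, mutated_copies=1, normal_cn=2):
    """Expected VAF of a clonal SNV (assumption: diploid normal cells)."""
    return mutated_copies * ccf / (ccf * total_cn + (1.0 - ccf) * normal_cn)

def calling_threshold(depth, error_rate, fpr):
    """Smallest variant-read count whose probability under pure error is <= fpr."""
    tail = 0.0
    for k in range(depth, -1, -1):
        tail += math.exp(log_binom_pmf(k, depth, error_rate))
        if tail > fpr:
            return k + 1
    return 0

def required_depth(ccf, total_cn, error_rate=1.59e-2, fpr=5e-7,
                   power=0.95, step=10, max_depth=5000):
    """First depth (scanned in increments of `step`) reaching the requested power."""
    vaf = min(expected_vaf(ccf, total_cn), 1.0 - 1e-12)  # keep log1p(-p) defined
    for depth in range(step, max_depth + 1, step):
        k = calling_threshold(depth, error_rate, fpr)
        if binom_upper_tail(k, depth, vaf) >= power:
            return depth
    return None  # power not reached within max_depth

# One mutated copy out of two, at 100% and at 20% cancer cell fraction.
print(required_depth(1.0, 2), required_depth(0.2, 2))
```

Under this model the required depth grows as the CCF falls or as the total copy number rises, because both lower the expected VAF towards the error rate.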


TDRKH C2orf71 EPHB1 SORCS2 PROM1 ADAM19 NFE2L3 TSPAN12 IDI2 IDI2 SCYL1 ADCY7 TP53 HEATR6 RREB1 F2RL3 UGT1A1 KDM3B HKDC1 C20orf151 ATP2C2 AGFG2 BAZ2B LTF BTLA KLHL3 HIVEP2 TAX1BP1 RP1 ASS1 C11orf35 PRR13 GPR65 GEMIN4 ZNF256 PIGT LIMK2 ID3 NAV1 RETSAT LCT SLC1A3 ABCB5 SRI KCNH2 SCYL1 PRB4 ITGAE TRIM47 IZUMO4 ADAMTS5 HAO2 ZNF697 FCRL3 GORAB SLC9A11 SERTAD4 VWA3B FAM123C KCNH7 PAX3 KIF15 PRKCI CTNNA1 LGI3 KANK1 NFX1 CACNA1B PSMC3 BBS1 VSIG10 PDX1 IGDCC4 ZNF646 EXOC3L1 MYBBP1A NOS2 TOP2A METTL4 ALDH16A1 SLC25A14 BEGAIN BCAN CEBPZ ANKRD31 SMOC2 RECQL TBC1D21 ZW10 SMARCAD1 GRK6 HIST1H3B FAM75D1 LRRTM3

[Figure panels. (a) Heat map of the SNVs in the genes listed above across Normal, P1, P2 and M1 to M4, with tree branches labelled 1, 2, 3, 4, 5, 6/7, 8 and 9. Key: Truncal clonal; Shared/private clonal; Subclonal; Reversion; Absent. (b) Chromosome 19 profiles with F2RL3 marked (x-axis: Position (Mb)) for Normal breast, P1 & 2 (primary at surgery), M1 (peritoneum), M2 (adrenal gland), M3 (liver) and M4 (skin), placed on the phylogenetic tree rooted at Common ancestor, Primary BC ER+/PgR-/HER2-.]

Supplementary Figure 2: Reversions in SNVs are explained by underlying CNAs. (a) Ancestral state reconstruction of tier-3 SNVs for patient 9/68, where ‘early’ mutations are coloured blue and ‘late’ mutations are coloured orange. Back mutations are shown in grey. (b) CNA profiles of chromosome 19 for the corresponding metastases; the tracks displayed are, from top to bottom, the BAF, the CN estimate and the Log2 ratio. The genomic coordinate of F2RL3 is indicated by the vertical grey line. The evolution of the character states, i.e. SNV and CN, is tracked across the inferred phylogenetic tree on the right.

[Scatter plot. x-axis: Sequencing run No. 1; y-axis: Sequencing run No. 2 (VAF, %). ρ = 0.985 (P = 1.91E−52).]

Supplementary Figure 3: Confirmation of the accuracy of VAFs for SNVs. Comparison of the VAFs (%) of SNVs indexed in the metastasis to the adrenal gland of patient 2/57 obtained in two sequencing runs.
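A run-to-run comparison of this kind can be sketched as a rank correlation between paired VAFs. The figure does not state which correlation coefficient ρ denotes; Spearman's rank correlation is assumed here purely for illustration, and the VAF values in the usage line are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical VAFs (%) of the same SNVs in two sequencing runs.
run1 = [1.2, 5.0, 10.3, 40.1]
run2 = [1.0, 6.1, 9.8, 45.0]
print(spearman(run1, run2))
```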

FUBP1 RANBP2 GLB1L TRPM8 ALS2CL CD96 PIGZ DCUN1D4 RAPGEF6 FGFR4 F13A1 NUP205 SPTAN1 OR5B12 LRRC23 C12orf70 NCKAP1L NYNRIN BCMO1 KRT38 PRKD2 HRH3 CCNF USP34 HSPB3 CD248 NLRC3 LRRC50 KIF4A ABCA3 TUBGCP5 DST TM7SF2 NPAS4 DMXL2 TMEM132E TESK2 DYSF IQCG HTT PCDH10 FOXI1 FOXI1 GRM3 ZWINT F2 SPTB ZNF469 DHX34 EIF2S2 RIPK4 C1orf114 YTHDC1 LRWD1 FAM71F1 PODXL ANGPT1 C9orf103 STK33 SLC1A2 ALX4 SPI1 MS4A6A FMNL3 DGKA GSX1 RABEP2 TEX19 ZNF418 ARX BAI2 GEN1 GTPBP8 DDX60L ANKRD33B WWC1 FAM110B FAM75C2 MMP17 KRTAP3−1 SLC16A6 REXO1 DOPEY2 KCNAB1 BMP2K

[Figure panels. (a) Heat map of the SNVs in the genes listed above across Normal, P2 and M1 to M3, with reconstructed ancestral states (Common ancestor, Metastatic precursor). (b) Anatomic locations: P2 (primary at surgery), M1 (liver), M2 (liver), M3 (bone). (c) Phylogenetic tree from Normal breast through the Common ancestor (primary BC ER+/PgR+/HER2+) and the Metastatic precursor, with branches labelled 1 to 6; annotated events: BRCA2 mt, 17q12 amp (ERBB2), 13q12.3 del (BRCA2wt), 8q amp (MYC), 11q13 amp (CCND1). (d) Pairwise clonal-frequency plots (both axes 0.0 to 1.0) for P2, M1, M2 and M3. Key: Truncal clonal; Shared/private clonal; Subclonal; Reversion; Absent.]

Supplementary Figure 4: Phylogenetic reconstruction of breast cancer progression in patient 5/87. (a) Ancestral state reconstruction of tier-3 SNVs from the primary tumour and three distant lesions of patient 5/87; their anatomic locations are shown in (b). (c) Combined phylogenetic tree obtained from CNAs and SNVs and (d) pairwise comparisons of clonal frequencies. The branches of the phylogenetic tree are labelled 1 through 6 in (c) and the location of these mutations in pairwise comparisons is indicated by the corresponding label in (d).


TDRKH C2orf71 EPHB1 SORCS2 PROM1 ADAM19 NFE2L3 TSPAN12 IDI2 IDI2 SCYL1 ADCY7 TP53 HEATR6 RREB1 F2RL3 UGT1A1 KDM3B HKDC1 C20orf151 ATP2C2 AGFG2 BAZ2B LTF BTLA KLHL3 HIVEP2 TAX1BP1 RP1 ASS1 C11orf35 PRR13 GPR65 GEMIN4 ZNF256 PIGT LIMK2 ID3 NAV1 RETSAT LCT SLC1A3 ABCB5 SRI KCNH2 SCYL1 PRB4 ITGAE TRIM47 IZUMO4 ADAMTS5 HAO2 ZNF697 FCRL3 GORAB SLC9A11 SERTAD4 VWA3B FAM123C KCNH7 PAX3 KIF15 PRKCI CTNNA1 LGI3 KANK1 NFX1 CACNA1B PSMC3 BBS1 VSIG10 PDX1 IGDCC4 ZNF646 EXOC3L1 MYBBP1A NOS2 TOP2A METTL4 ALDH16A1 SLC25A14 BEGAIN BCAN CEBPZ ANKRD31 SMOC2 RECQL TBC1D21 ZW10 SMARCAD1 GRK6 HIST1H3B FAM75D1 LRRTM3

[Figure panels. (a) Heat map of the SNVs in the genes listed above across Normal, P1, P2 and M1 to M4, with reconstructed ancestral states (Common ancestor, Metastatic precursor). (b) Anatomic locations: P1 & 2 (primary at surgery), M1 (peritoneum), M2 (adrenal gland), M3 (liver), M4 (skin). (c) Phylogenetic tree from Normal breast through the Common ancestor (Primary BC ER+/PgR-/HER2-) and the Metastatic precursor, with branches labelled 1 to 9. (d) Pairwise clonal-frequency plots (both axes 0.0 to 1.0) for M1, M2, M3 and M4, with branch labels (including 1* and 5*) and the SNVs in ASS1, GPR65, GEMIN4, TBC1D21 and ANKRD31 marked. Key: Truncal clonal; Shared/private clonal; Subclonal; Reversion; Absent.]

Supplementary Figure 5: Phylogenetic reconstruction of breast cancer progression in patient 9/68. (a) Ancestral state reconstruction of tier-3 SNVs from the primary tumour and four distant lesions of patient 9/68; their anatomic locations are shown in (b). (c) Combined phylogenetic tree obtained from CNAs and SNVs and (d) pairwise comparisons of clonal frequencies. The branches of the phylogenetic tree are labelled 1 through 9 in (c) and the location of these mutations in pairwise comparisons is indicated by the corresponding label in (d). 1∗ and 5∗ in (d) mark the location of reversions in M1, M3 and M4 that can be explained by CNAs. The clonal structures of three SNVs (ASS1 p.M147L, GPR65 p.N213fs and GEMIN4 p.C683W) were inconsistent with the phylogenetic reconstruction, being subclonal in M1, M3 and M4 respectively. That ASS1 p.M147L and GPR65 p.N213fs were fully clonal in the earlier-branching M4 implies that these mutations occurred along branch 1 or 3. Since they were absent from P1 and P2, branch 3 is preferred over branch 1. Nonetheless, these represent only 3.2% of tier-4 SNVs and their exclusion does not influence the global topology of the phylogenetic tree. TBC1D21 p.Q14K and ANKRD31 p.Y1796C are uniquely subclonal in M1 and consistent with the inferred phylogeny.

DTNBP1 USP49 HECW1 PCLO LAMB1 TAS2R4 TRPV5 SMARCD3 CSMD1 C8orf86 PAPPA GATA3 UPF2 NRP1 PCDH15 KIF20B IDE TRAF6 OR4D10 PZP CPNE8 LONP2 RPL23A KRT19 MYH14 CWF19L2 ABCG1 SLC4A7 ARID1A ABCA4 RNASEL C1orf31 ADCY3 SEL1L3 GRID2 NPY2R FHAD1 TCHH HEATR5B FAM198A CDH9 PDGFRB LILRB1 APOL2 SLC26A7 MLL CCDC123 RABEP1 MKS1 MFN2 VPS72 CLK2 LOC646627 ALB C4orf49 LCA5 PLAG1 CHCHD7 PKHD1L1 CH25H PNLIP PPFIA2 ZDHHC20 OR10G3 SRSF5 CTU2 KIF18B CLEC4M ZNF329 RAD21L1 ZNF75D TMPRSS15

[Figure panels. (a) Heat map of the SNVs in the genes listed above across Normal, P, M1, M3, M4 and M5, with reconstructed ancestral states (Common ancestor/Metastatic precursor). (b) Anatomic locations: P (primary), M1 (mediastinal soft tissue), M3 (aorta), M4 (mediastinal lymph node), M5 (pleura). (c) Phylogenetic tree from Normal breast through the Common ancestor/Metastatic precursor (Primary BC ER+/PgR+/HER2+) to P (primary at autopsy), M1, M3, M4 and M5, with branches labelled 1 to 3. (d) Pairwise clonal-frequency plots (both axes 0.0 to 1.0) for P, M1, M3, M4 and M5, with branch labels 1, 1* and 2 marked. Key: Truncal clonal; Shared/private clonal; Subclonal; Absent.]

Supplementary Figure 6: Phylogenetic reconstruction of breast cancer progression in patient 1/69. (a) Ancestral state reconstruction of tier-3 SNVs from the primary tumour and four distant lesions of patient 1/69; their anatomic locations are shown in (b). (c) Combined phylogenetic tree obtained from CNAs and SNVs and (d) pairwise comparisons of clonal frequencies. The branches of the phylogenetic tree are labelled 1 through 3 in (c) and the location of these mutations in pairwise comparisons is indicated by the corresponding label in (d).

[Figure panels. Left: anatomic distribution of tumour deposits: Brain n = 1 (1.6%); Skin n = 1 (1.6%); Mediastinal lymph node n = 2 (3.1%); Hilar lymph node n = 1 (1.6%); Primary n = 28 (43.8%); Local recurrence n = 1 (1.6%); Lung n = 1 (1.6%); Pleura n = 2 (3.1%); Kidney n = 1 (1.6%); Mediastinal soft tissue n = 1 (1.6%); Contralateral breast n = 2 (3.1%); Pylorus n = 1 (1.6%); Adrenal gland n = 2 (3.1%); Pancreas n = 1 (1.6%); Spleen n = 1 (1.6%); Liver n = 11 (17.2%); Peritoneum n = 1 (1.6%); Aorta n = 1 (1.6%); Bone n = 2 (3.1%); Uterus n = 1 (1.6%); Ovarium n = 2 (3.1%). Right: phylogenetic trees rooted at Normal, with primary (P) and metastatic (M) samples as tips, for patients 1/69, 2/57, 4/71, 5/87, 6/91, 7/67, 8/82, 9/68 and 10/80.]

Supplementary Figure 7: Phylogenetic trees obtained from the intermediate level of tier-3 SNVs using Dollo parsimony. The leftmost panel provides an anatomic representation of the distribution of tumour deposits, whilst the phylogenetic trees inferred from samples with >1500X coverage and >3% VAF are shown on the right. The scale bars at the bottom represent 10 SNVs and give an indication of the total length of the trees.

[Figure panels: phylogenetic trees rooted at Normal, with primary (P) and metastatic (M) samples as tips, for patients 1/69, 2/57, 4/71, 5/87, 6/91, 7/67, 8/82, 9/68 and 10/80.]

Supplementary Figure 8: Phylogenetic trees obtained from the top level of tier-4 SNVs using Dollo parsimony. The phylogenetic trees were inferred from samples with >30% CCF, >1500X coverage and >3% VAF. For two patients (4/71 and 6/91), the primary samples did not pass these QC criteria and were not considered further. The scale bars at the bottom represent 10 SNVs and give an indication of the total length of the trees.

[Figure panels: phylogenetic trees rooted at Diploid, with primary (P) and metastatic (M) samples as tips and percentage support values beside the nodes, for patients 1/69, 2/57, 3/92, 4/71, 5/87, 6/91, 7/67, 8/82, 9/68 and 10/80.]

Supplementary Figure 9: Phylogenetic trees obtained from CNAs using MEDICC. The phylogenetic trees were inferred from samples with >30% CCF. For two patients (4/71 and 6/91), the primary samples did not pass this QC criterion and were not considered further. The scale bars at the bottom represent 10 CNAs and provide an indication of the total length of the trees, whilst the number beside each node represents the percentage of trees supporting this split in 100 resamplings of the distance matrix with added Gaussian noise.
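The node-support idea in this caption (perturb the distance matrix with Gaussian noise, rebuild the tree, count how often each split recurs) can be sketched as below. This is not MEDICC and does not reproduce its tree-building step: average-linkage (UPGMA) clustering stands in purely to keep the example short, and the noise scale `sigma`, the function names and the toy distance matrix are hypothetical.

```python
import random

def upgma_clades(dist, names):
    """Non-trivial clades (frozensets of leaf names) from average-linkage clustering."""
    n = len(names)
    clusters = {i: frozenset([names[i]]) for i in range(n)}
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    clades, next_id = set(), n
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of clusters
        size_a, size_b = len(clusters[a]), len(clusters[b])
        new_d = {}
        for c in clusters:
            if c not in (a, b):
                d_ac = d[(min(a, c), max(a, c))]
                d_bc = d[(min(b, c), max(b, c))]
                # size-weighted average distance to the merged cluster
                new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        d.update(new_d)
        merged = clusters.pop(a) | clusters.pop(b)
        clusters[next_id] = merged
        if len(clusters) > 1:  # skip the root, which contains every leaf
            clades.add(merged)
        next_id += 1
    return clades

def add_gaussian_noise(dist, sigma, rng):
    """Symmetric copy of dist with N(0, sigma) noise per pair, clamped at zero."""
    n = len(dist)
    noisy = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            noisy[i][j] = noisy[j][i] = max(0.0, dist[i][j] + rng.gauss(0.0, sigma))
    return noisy

def split_support(dist, names, sigma, replicates=100, seed=0):
    """Percentage of noisy replicates that recover each clade of the reference tree."""
    rng = random.Random(seed)
    reference = upgma_clades(dist, names)
    counts = {clade: 0 for clade in reference}
    for _ in range(replicates):
        resampled = upgma_clades(add_gaussian_noise(dist, sigma, rng), names)
        for clade in reference & resampled:
            counts[clade] += 1
    return {clade: 100.0 * c / replicates for clade, c in counts.items()}

# Toy distances: two tight pairs that are far from each other.
names = ["P", "M1", "M2", "M3"]
dist = [[0, 1, 10, 10], [1, 0, 10, 10], [10, 10, 0, 1], [10, 10, 1, 0]]
print(split_support(dist, names, sigma=0.1))
```

A split whose support stays near 100% is robust to the injected noise; low values flag branches that depend on small differences in the distance matrix.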

[Figure panels (a) to (h). Axes and legends legible in the extraction: Cancer cell fraction (%) and Ploidy from ASCAT / ABSOLUTE against GAP, with y = x reference lines; κ by Samples for Total copy numbers and Major allele; κ against Absolute difference in ploidy; Log likelihood against Cancer cell fraction (%); Frequency (%) against Cancer cell fraction (%); Variant allele fraction (%) against Cancer cell fraction (%) with lines for 3 copies, 2 copies and 1 copy; Targeted sequencing against SNP array (GAP) for Putative driver tier-4 SNVs and All tier-4 SNVs. Samples: P (primary), M1 (liver), M2 (adrenal gland), M3 (ovarium). Correlation annotations (panel assignment not recoverable): ρ = 0.858 (P = 1.56E−12); ρ = 0.816 (P = 7.87E−11); ρ = 0.991 (P = 4.71E−35); ρ = 0.56 (P = 0.00014); ρ = 0.884; ρ = 0.912; ρ = −0.898; P = 1.83E−15; P = 3.49E−11; P = 9.87E−10.]

Supplementary Figure 10: Comparison of CCF and ploidy estimates from ASCAT, ABSOLUTE and GAP. (a) Correlation of CCF obtained from ASCAT and ABSOLUTE against the estimate obtained using GAP, (b) same as in (a) for ploidy estimates. In all cases, a high correlation was obtained. However, this high correlation is misleading as there is a large discrepancy in the estimation of total copies and major allele at the level of individual samples. (c) Distribution of κ coefficients obtained by computing a contingency table of either total copies or major alleles for each sample using ASCAT and GAP. In both algorithms, the choices of CCF and ploidy are tightly linked. However, the major source of discrepancy stems from wrong ploidy estimation. This is shown in (d) as the inverse correlation of the κ coefficient with the absolute difference in ploidy estimate between the two algorithms. In extreme cases, we observed differences in ploidy across the matched samples from the same patient. Because GAP allowed for review of ploidy estimates, we opted for this particular algorithm for downstream analyses. Provided the underlying copy number and major allele count are known, the CCF can also be estimated from the VAF of SNVs. (e) Maximization of the log-likelihood over all tier-4 SNVs of samples from patient 2/57. At the global value of CCF that maximizes this objective function, the CCFs of individual tier-4 SNVs can be represented as a histogram, which provides a rough assessment of clonality. This is shown in (f) and (h) for the same four samples. (g) Correlation of the CCF estimated through maximization of the log-likelihood, over all tier-4 SNVs or over particular tier-4 SNVs, with the estimate obtained from the SNP arrays using GAP.
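Estimating the CCF from SNV read counts by maximizing a log-likelihood, as mentioned in this caption, can be sketched as follows. The likelihood the authors used is not given in this text, so this is an illustration under assumptions: read counts are binomial, every SNV is clonal, normal cells are diploid, the total copy number and the number of mutated copies of each SNV are known, and CCF denotes the fraction of cancer cells in the sample. The grid search and all names are hypothetical.

```python
import math

def expected_vaf(ccf, total_cn, mutated_copies):
    """Expected VAF of a clonal SNV (assumption: diploid normal cells)."""
    return mutated_copies * ccf / (ccf * total_cn + 2.0 * (1.0 - ccf))

def log_likelihood(ccf, snvs):
    """Binomial log-likelihood, up to a constant, summed over SNVs.

    Each SNV is a tuple (alt_reads, depth, total_cn, mutated_copies).
    """
    total = 0.0
    for alt, depth, total_cn, mutated_copies in snvs:
        p = expected_vaf(ccf, total_cn, mutated_copies)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep both logarithms finite
        total += alt * math.log(p) + (depth - alt) * math.log1p(-p)
    return total

def estimate_ccf(snvs, grid=1000):
    """CCF on a regular grid in (0, 1] that maximizes the log-likelihood."""
    candidates = [i / grid for i in range(1, grid + 1)]
    return max(candidates, key=lambda ccf: log_likelihood(ccf, snvs))

# Hypothetical read counts, constructed to be consistent with a CCF of 0.6.
snvs = [(300, 1000, 2, 1), (300, 1300, 3, 1), (300, 800, 4, 2)]
print(estimate_ccf(snvs))
```

A grid search is used instead of a numerical optimizer so that the whole likelihood curve, as plotted against CCF, is available from the same function.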

[Figure panels (a) to (m). Axes and legends legible in the extraction: Ploidy and Cancer cell fraction (%) from ASCAT against GAP; κ against Absolute difference in ploidy and against Absolute difference in cancer cell fraction (%); κ by Samples for Total copy numbers and Major allele; scatter plots of Log2 Ratio against BAF with genotype clusters labelled (A, AA, AB, AAA, AAB, AAAA, AAAB, AABB and higher); heat maps of Cancer cell fraction (%) against Ploidy. Correlation annotations (panel assignment not recoverable): ρ = 0.885 (P = 6.91E−52); ρ = 0.907 (P = 1.64E−58); ρ = −0.733 (P = 2.07E−26); ρ = −0.119 (P = 0.147).]

Supplementary Figure 11: Comparison of CCF and ploidy estimates between ASCAT and GAP. Globally, the CCF and ploidy estimates are usually well correlated between ASCAT and GAP. This is shown in (a) and (b). However, this agreement is only apparent and there is often a large variation between the two algorithms at the individual sample level. (e) shows the discrepancy between the two algorithms using the κ coefficient across a series of 125 samples profiled using a similar Affymetrix OncoScan FFPE Express array and subjected to the same QC criteria. (c) and (d) show that this stems mostly from a difference in ploidy estimate rather than CCF, as the coefficients were more strongly anticorrelated with the absolute difference in ploidy than with the absolute difference in CCF. (f) to (m) show the matched scatter plots of Log2 Ratio versus BAF (top) and heat maps of CCF versus ploidy (bottom) across a series of samples with increasing genomic mass as an illustration of this problem. + shows the estimate of ASCAT, ● that of GAP and ∎ indicates the value obtained from ABSOLUTE.
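The κ coefficient used to compare the two algorithms' calls can be illustrated with an unweighted Cohen's κ over paired categorical calls (for example, the total copy number assigned to each segment by each algorithm). Whether the authors used a weighted or unweighted κ is not stated here, so the unweighted form is an assumption, and the calls in the usage lines are hypothetical.

```python
from collections import Counter

def cohens_kappa(calls_a, calls_b):
    """Unweighted Cohen's kappa for two equally long lists of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of positions where both calls match.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement from the marginal frequencies of each caller.
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:  # both callers always emit the same single category
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Hypothetical total copy-number calls per segment from two algorithms.
print(cohens_kappa([2, 2, 3, 4], [2, 2, 3, 4]))
print(cohens_kappa([2, 2, 3, 3], [2, 3, 2, 3]))
```

A ploidy shift in one algorithm changes nearly every absolute copy-number call at once, which is why κ can collapse even when the sample-level CCF and ploidy estimates remain correlated across the cohort.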

Supplementary Note 1 Patient 1/69 case report: The patient detected a mass in her breast months before being hospitalized due to anaemia at the age of 78 years. During physical examination, a stage T4 right-sided breast tumour was detected. The patient declined any kind of treatment. Breast cancer was histologically confirmed at autopsy. Autopsy findings: The primary breast tumour infiltrated the chest wall, the right side of the neck, mediastinal soft tissue, aorta and the outer surface of the left atrium. Metastases were found in the hilar lymph node, mediastinal lymph node and on the pleura. Samples were collected from the primary tumour, aorta wall, mediastinal soft tissue, distant lymph nodes and pleura. Cause of death: advanced stage breast cancer.

Supplementary Note 2 Patient 2/57 case report: The patient was diagnosed with a 1 cm IDC (nuclear grade II, TN) in the left breast at the age of 38. Staging CT scan showed metastases in the lung, liver, and bones. Due to very poor performance status, the patient was not a candidate for CTX or XRT. She died one month after initial diagnosis. This patient was a BRCA1 germline mutation carrier. Autopsy findings: The primary tumour was present in the left breast. Metastases were found in the left axillary lymph nodes, hilar lymph nodes, pleura, lung, liver, left adrenal gland, ovarium and bones. Samples were collected from the breast, liver, ovarium and adrenal gland. Cause of death: fulminant liver failure due to liver metastases.

Supplementary Note 3 Patient 3/92 case report: The patient’s paternal grandmother and mother had breast cancer. At the age of 39, during her first pregnancy, she discovered a lump in the right breast; an initial biopsy was negative. Three months later, a biopsy was repeated and revealed an IDC (grade III, TN, Ki67 90%) of 6.5 cm. PET CT scan showed one metastasis in the liver. The patient received six cycles of preoperative TXT-CBP which, based on imaging, resulted in complete response in the liver and axilla, and partial response in the primary tumour. Mastectomy and axillary dissection were performed and revealed an IDC, grade III, TN, Ki67 1%, 40 mm tumour with 1/13 LNs involved (ypT2ypN1). During XRT, the patient had epigastric symptoms and ultrasound showed multiple liver metastases. Due to her performance status, she was not a candidate for chemotherapy. The patient died 11 months after initial diagnosis. Autopsy findings: Metastases were found in the liver, distal lymph nodes and on the pleura (samples from the liver and distal lymph nodes). Cause of death: advanced stage breast cancer.

Supplementary Note 4 Patient 4/71 case report: The patient underwent a lumpectomy and axillary dissection for a left side breast tumour at the age of 54. Histological assessment revealed a 2 cm IDC (grade II, TN, Ki67 5%) with one out of seven lymph nodes involved (pT2pN1). The patient received adjuvant systemic treatment of four cycles of FEC. Six months later, the patient underwent mastectomy for a local recurrence. Histological assessment confirmed a recurrence of grade II, TN IDC with muscle involvement and two positive lymph nodes. The patient received five cycles of gemcitabine + docetaxel. One year after initial diagnosis, PET CT scan showed multiple skeletal and liver metastases. Ibandronic acid was started and RFA of a liver metastasis was performed. The patient died 14 months following initial diagnosis. Autopsy findings: A local recurrence in the left breast and several metastases in the liver and bone (spine) were found. Samples from the liver and the local recurrence were collected. Cause of death: fulminant liver failure due to liver metastases.

Supplementary Note 5 Patient 5/87 case report: The patient’s mother had breast cancer associated with a BRCA2 germline mutation. This mutation was also identified in the patient. At the age of 29, the patient discovered a lump in the right breast. A core biopsy revealed an IDC (grade III, ER+, PgR+, Ki67 20%) associated with high grade DCIS. At diagnosis, HER2/neu status was negative using IHC. In our study, we evaluated centrally the HER2/neu status using FISH. Staging CT scan detected metastasis in the liver. The patient received a first line of chemotherapy consisting of three cycles of FAC, with good tumour response in the breast. Mastectomy and axillary dissection revealed an IDC (grade III, ER+, PgR+) with four positive lymph nodes out of 11 (ypT2pN2a). GnRH analogue and letrozole were initiated. PET CT scan showed disease progression in the liver. A second line of chemotherapy with capecitabine and bevacizumab was administered for four months. Eleven months after the initial diagnosis, disease progression was documented in the liver, including the detection of new infradiaphragmatic lymph nodes and bone lesions. A third line of chemotherapy (eight cycles of paclitaxel + carboplatin) was administered, and the PET CT scan showed a total metabolic remission in the liver and the bones. At 20 months after diagnosis, a PET CT scan showed disease progression with bone metastases. Zoledronic acid was switched to ibandronate. The patient declined additional chemotherapy. Endocrine therapy and bisphosphonates were given. The patient died 26 months after initial diagnosis. Autopsy findings: Metastases were found in the liver, lung, pleura, skin, distant lymph nodes, brain and bone. Samples were taken from the liver, lung, brain and bone. Cause of death: hepatorenal syndrome due to disease progression.

Supplementary Note 6 Patient 6/91 case report: The patient’s father had bone sarcoma and died at age 28, grandfathers had lung and gastric cancer, aunt had lung cancer. At the age of 38, the patient detected a tumour in the right breast. A core biopsy revealed an IDC (TN, Ki67 10%) with positive axillary lymph node. The PET CT scan did not show any distant metastases. After two cycles of neoadjuvant FEC, the patient discontinued her treatment. At month 12, she returned to the clinic and had a mastectomy and axillary dissection for an ulcerated breast tumour. The histology demonstrated a 60 mm IDC (grade III, TN) with 16 positive lymph nodes out of 16 (ypT4ypN2a). During surgery, a tumour was detected in the contralateral breast leading to a lumpectomy. Histology revealed the presence of a 13 mm IDC associated with ILC (grade III, ER Q-score 3, PgR Q-score 5, Ki67 25%), with one positive lymph node out of two. A PET CT scan detected bone metastases. Bisphosphonates were started. The patient received seven cycles of taxotere-carboplatin as first line chemotherapy which was switched to capecitabine (six cycles) because of allergic reaction to carboplatin. After new skin and liver lesions were detected, the patient died 26 months after initial diagnosis. Autopsy findings: Metastases were found on the chest wall, skin, in the liver, brain and bones. Samples were obtained from the brain and liver. Cause of death: brain oedema and herniation due to brain metastases.

Supplementary Note 7 Patient 7/67 case report: The patient’s mother had breast cancer and father had lung cancer. She underwent lumpectomy and axillary dissection for a tumour of the right breast at the age of 54. Histological assessment revealed a 1.8 cm IDC (grade III, TN, Ki67 4%) without lymph node involvement (0/15); stage pT1cpN0. The patient received adjuvant chemotherapy with four cycles of doxorubicin plus cyclophosphamide followed by radiotherapy of the right breast. Eighteen months after the initial diagnosis, the patient had an epileptiform seizure. MRI showed a brain metastasis. After surgery and whole brain radiation, five cycles of paclitaxel and carboplatin were given. Twelve months later, new lesions were detected in the brain. The patient declined chemotherapy and died 44 months after initial diagnosis. Autopsy findings: Metastases were found in the brain, lung, kidney, ovarium and spleen. Samples were collected from the lung, spleen, ovarium and kidney. Cause of death: brain oedema and herniation due to brain metastases.

Supplementary Note 8 Patient 8/82 case report: The patient underwent mastectomy and axillary dissection for a right breast tumour at the age of 62. Histological assessment revealed a pleomorphic ILC+LCIS (grade III, ER+ (Q-score 7), PgR-, HER2-, Ki67 6%) with five positive lymph nodes out of seven (pT2 pN1a). The patient received adjuvant chemotherapy with six cycles of FEC followed by XRT and endocrine therapy (exemestane for 16 months followed by letrozole for 19 months). Forty-three months after the initial diagnosis, liver metastases were detected. Radiofrequency ablation was performed and endocrine treatment was switched back to exemestane. Disease progression in the liver was documented 16 months later. The patient started fulvestrant for three months. At 61 months after diagnosis, paclitaxel and carboplatin were started due to disease progression. After four cycles, CT scan showed progression in the liver and the appearance of new bone metastases. She received 14 cycles of capecitabine. The patient died 76 months after initial diagnosis. Autopsy findings: Metastases were found in the mediastinal lymph nodes, pleura, lumbar vertebrae, right adrenal gland, liver, rectum, uterus, pylorus, peritoneum and retroperitoneum. Samples were collected from the pleura, pylorus, liver and uterus. Cause of death: advanced stage breast cancer.

Supplementary Note 9 Patient 9/68 case report: The patient underwent mastectomy and axillary dissection for a right-side breast tumour at the age of 68. Histological assessment revealed a 3.8 cm IDC (grade III, ER Q-score 3 and PgR Q-score 2, HER2-, Ki67 5%) without lymph node involvement (0 of 20); stage pT2pN0. The patient received adjuvant endocrine therapy with letrozole for five years. Eighty months after initial diagnosis, a metastasis was found in the sternum. Zoledronic acid was started. Six months later, therapy was switched to ibandronic acid and fulvestrant was initiated as the bone scan detected a new lesion in the femur. Palliative RT was performed for the bone lesions. Seven months later, a CT scan showed multiple metastases in the liver and lung. The patient declined CTX and died 109 months after initial diagnosis. Autopsy findings: Metastases were found in the liver, lung, adrenal glands, bones, skin and on the peritoneum. Samples were collected from the skin, adrenal gland, liver and peritoneum. Cause of death: cardiorespiratory insufficiency (left ventricle dysfunction, two-sided lung oedema and hydrothorax) due to disease progression.

Supplementary Note 10 Patient 10/80 case report: The patient underwent a lumpectomy and axillary dissection for a right-side breast tumour at age 46. Histological assessment revealed a 2.4 cm IDC, grade III, ER Q-score 4, PgR Q-score 7, Ki67 1% with one positive lymph node out of five (pT2pN1). At diagnosis, the HER2/neu status was negative using IHC. In our study, we repeated the HER2/neu staining using IHC and FISH which demonstrated HER2/neu amplification. The patient received adjuvant chemotherapy, six cycles of FAC followed by tamoxifen. Sixty-four months after initial diagnosis, a local recurrence was diagnosed. The patient declined surgery and tamoxifen was switched to letrozole. XRT was given. Two years later, metastases were found in the liver. Capecitabine was started (eleven cycles with stable disease). At 111 months, progression was detected in the liver and the breast with new mediastinal lymph node metastases. Three cycles of gemcitabine + docetaxel were given, which resulted in regression in the breast, but progression was detected in the liver and new bone metastases appeared. Mastectomy was performed 115 months after diagnosis (histology: low differentiated IDC). Ibandronic acid was started. The patient received four cycles of gemcitabine + docetaxel. At 119 months, a local recurrence was detected on the right chest wall and RT was given. After finding a new lesion in the contralateral breast, the patient received one cycle of carboplatin. The patient died 123 months after initial diagnosis. Autopsy findings: Metastases were found in the liver, vertebrae and mediastinal LNs, together with a local recurrence on the right chest wall and a contralateral breast tumour. Samples were collected from the contralateral breast tumour, mediastinal lymph node, liver and bone. Cause of death: liver insufficiency.

Abbreviations: CT, Computed Tomography; CTX, Chemotherapy; DCIS, Ductal Carcinoma in situ; FAC, Fluorouracil + Doxorubicin + Cyclophosphamide; FEC, Fluorouracil + Epirubicin + Cyclophosphamide; FISH, Fluorescence in situ hybridization; IDC, Invasive Ductal Carcinoma; IHC, Immunohistochemistry; ILC, Invasive Lobular Carcinoma; LCIS, Lobular Carcinoma in situ; PET, Positron Emission Tomography; RFA, Radiofrequency Ablation; TN, Triple Negative; XRT, Radiation Therapy.