Human Reproduction, Vol.25, No.10 pp. 2622–2628, 2010 Advanced Access publication on July 30, 2010 doi:10.1093/humrep/deq196

ORIGINAL ARTICLE Reproductive genetics

Diagnostic accuracy: theoretical models for preimplantation genetic testing of a single nucleus using the fluorescence in situ hybridization technique

P.N. Scriven 1,2,* and P.M.M. Bossuyt 3

1 Cytogenetics Department, GSTS-Pathology, Guy’s Hospital, 5th Floor Tower Wing, Great Maze Pond, London SE1 9RT, UK
2 Centre for Preimplantation Genetic Diagnosis, Guy’s and St Thomas’ NHS Foundation Trust, London, UK
3 Department of Clinical Epidemiology and Biostatistics, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands

* Correspondence address. E-mail: [email protected]

Submitted on April 21, 2010; resubmitted on June 20, 2010; accepted on July 5, 2010

© The Author 2010. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved. For Permissions, please email: [email protected]

Downloaded from http://humrep.oxfordjournals.org/ by guest on December 31, 2015

background: The aim of this study was to develop and use theoretical models to investigate the accuracy of the fluorescence in situ hybridization (FISH) technique in testing a single nucleus from a preimplantation embryo without the complicating effect of mosaicism.

methods: Mathematical models were constructed for three different applications of FISH in preimplantation genetic testing (sex determination for sex-linked diseases, two-way reciprocal translocations and sporadic chromosome aneuploidy). The input values were the degree of aneuploidy (initially set at 3% per chromosome for sporadic aneuploidy) and the accuracy per probe (initially set at 95%), defined as the proportion of normal diploid nuclei with a normal signal pattern. The primary statistic was the predictive value of the test result.

results: Testing two chromosome pairs to determine sex chromosome status or detect unbalanced translocation products had high predictive value: at least 99.5% for a normal test result (95% CI: 99–100%), and 90% for an abnormal test result (95% CI: 88–92%). However, the predictive value of an abnormal test result testing five chromosomes for sporadic chromosome aneuploidy was 41% (95% CI: 36–46%); 90% would be achieved with an aneuploidy rate per chromosome of 20.3% (equivalent to 99.5% prevalence for 23 chromosomes) rather than 3%, or with an accuracy per probe of 99.6% rather than 95%, or when testing 23 chromosome pairs, rather than 5 pairs, with either 8.3% aneuploidy (86.4% prevalence) or 99.5% accuracy.

conclusions: Testing a single cell using the FISH technique has the potential to achieve acceptable analytical performance for sex determination and two-way reciprocal translocations, but is unlikely to achieve adequate performance testing for sporadic chromosome aneuploidy. New techniques for detecting the copy number of every chromosome are emerging, but it remains to be seen if the high accuracy required will be achieved.

Key words: FISH / PGD / PGS / aneuploidy / translocation

Introduction

Preimplantation genetic testing (PGT) of cleavage-stage embryos using the fluorescence in situ hybridization (FISH) technique is typically used: (i) to determine the sex chromosome complement of embryos at risk of a sex-linked disease or for social reasons (gender balancing); (ii) to detect chromosome imbalance associated with meiotic segregation products of parental chromosome rearrangements and (iii) to detect sporadic chromosome aneuploidy, triploidy and complex (chaotic) mosaicism associated with a reduction in reproductive efficiency (Goossens et al., 2009). Testing single cells using the FISH technique has inherent technical difficulties associated with spreading and fixing a single nucleus, and is complicated by the stage of the mitotic cell cycle at the time of spreading, the variable binding efficiency of the probes and target DNA sequences, and the arbitrary nature of scoring of FISH signals in interphase nuclei. Testing is also confounded by the nature of the early embryo, where the one or few cells available may not be genetically representative of the whole embryo due to errors associated with fertilization, cell mitosis and nucleus packaging (Munne and Cohen, 1998). Comprehensive modelling and discussion of the mechanisms causing mosaicism in early embryos has been covered in detail previously (Los et al., 2004).

The diagnostic accuracy concepts of sensitivity, specificity and the predictive value of abnormal (positive) and normal (negative) test results have been used for over 60 years to evaluate medical tests (Grimes and Shulz, 2002; Bossuyt, 2008). Sensitivity and specificity give a measure of the quality of the test; positive and negative predictive values give the post-test probability of being affected after a positive test or unaffected after a negative test, respectively.

The aim of the study presented here was to develop and use theoretical models to investigate the accuracy of the FISH technique when applied to different applications of testing a single nucleus from a preimplantation embryo uncomplicated by mosaicism.
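The relationship between test quality (sensitivity, specificity) and post-test probability (predictive values) can be sketched in a few lines of Python. This is only an illustration: the function name and the input values (95% sensitivity, 80% specificity, 10% and 50% prevalence) are hypothetical and are not taken from the models in this paper.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = prevalence * sensitivity
    false_neg = prevalence * (1 - sensitivity)
    true_neg = (1 - prevalence) * specificity
    false_pos = (1 - prevalence) * (1 - specificity)
    ppv = true_pos / (true_pos + false_pos)  # abnormal results that are correct
    npv = true_neg / (true_neg + false_neg)  # normal results that are correct
    return ppv, npv


# Hypothetical values: the same test quality at two different prevalences.
low_prevalence = predictive_values(0.95, 0.80, 0.10)
high_prevalence = predictive_values(0.95, 0.80, 0.50)
print(low_prevalence, high_prevalence)
```

With these illustrative inputs the positive predictive value is far lower at the lower prevalence even though sensitivity and specificity are unchanged.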

Materials and Methods

Box 1. Model calculations

Input values
n, number of chromosome pairs tested; E, accuracy per probe; A, aneuploidy rate per chromosome pair tested.

Sex determination (PGD–SD)
The proportion of embryos with a male or aneuploid chromosome complement:

$$\mathrm{exp1} = 1 - \frac{(1-A)^2}{2}$$

The proportion of normal female embryos with a male or aneuploid (positive) test result for the chromosomes tested:

$$\mathrm{exp2} = 1 - E^2$$

The proportion of male or aneuploid embryos that have a normal female (negative) test result for the chromosomes tested (see Note 2):

$$\mathrm{exp3} = \frac{E\,\frac{(1-A)^2}{2}\left(\frac{1-E}{2}\right)^2 + \frac{5AE(1-E)(1-A)}{8} + 3A^2\left(\frac{1-E}{2}\right)^2 + \frac{3A(1-A)}{4}\left(\frac{1-E}{2}\right)^3 + \frac{A^2}{4}\left(\frac{1-E}{2}\right)^4}{1 - \frac{(1-A)^2}{2}}$$

Two-way reciprocal translocations (PGD–RT)
Proportion of embryos consistent with: alternate segregation = J; adjacent-1 segregation = K; adjacent-2 segregation = L; 3:1 segregation = M; 4:0 segregation = N.

The proportion of embryos with an aneuploid chromosome complement:

$$\mathrm{exp1} = 1 - J$$

The proportion of normal/balanced embryos with an abnormal (positive) test result for the chromosomes tested:

$$\mathrm{exp2} = 1 - E^3$$

The proportion of aneuploid embryos that have a normal/balanced (negative) test result for the chromosomes tested (see Note 3):

$$\begin{aligned}\mathrm{exp3} ={}& 2E\left(\frac{K/2}{1-J}\cdot\frac{1-E}{2}\right)^2 + 2E^2\,\frac{L/6}{1-J}\cdot\frac{1-E}{2} + 4\left(\frac{L/6}{1-J}\cdot\frac{1-E}{2}\right)^3 \\ &+ 8E\left(\frac{M/20}{1-J}\cdot\frac{1-E}{2}\right)^2 + 6E^2\,\frac{M/20}{1-J}\cdot\frac{1-E}{2} + 4E\left(\frac{M/20}{1-J}\cdot\frac{1-E}{2}\right)^3 \\ &+ 2\left(\frac{M/20}{1-J}\cdot\frac{1-E}{2}\right)^4 + 2\left(\frac{N/2}{1-J}\cdot\frac{1-E}{2}\right)^3\end{aligned}$$
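The Box 1 expressions for sex determination and reciprocal translocations can be evaluated with a short Python sketch. The function names are hypothetical, and the default arguments are the initial input values reported in this paper (95% accuracy per probe, 1% aneuploidy for sex determination, and the pooled segregation proportions J, K, L, M, N). One assumption should be stated plainly: in the translocation expression each power is applied to the whole bracketed product, which is how the printed parentheses read.

```python
def pgd_sd(E=0.95, A=0.01):
    """Box 1, sex determination: return (exp1, exp2, exp3)."""
    q = (1 - E) / 2                      # the factor (1 - E)/2 that recurs in Box 1
    normal_female = (1 - A) ** 2 / 2
    exp1 = 1 - normal_female             # male or aneuploid embryos
    exp2 = 1 - E ** 2                    # normal females with an abnormal result
    missed = (E * normal_female * q ** 2
              + 5 * A * E * (1 - E) * (1 - A) / 8
              + 3 * A ** 2 * q ** 2
              + (3 * A * (1 - A) / 4) * q ** 3
              + (A ** 2 / 4) * q ** 4)
    return exp1, exp2, missed / exp1


def pgd_rt(E=0.95, J=0.437, K=0.307, L=0.109, M=0.142, N=0.005):
    """Box 1, reciprocal translocation: return (exp1, exp2, exp3)."""
    q = (1 - E) / 2
    unbalanced = 1 - J
    # Per-product share of each segregation mode among unbalanced embryos.
    k = (K / 2) / unbalanced
    l = (L / 6) / unbalanced
    m = (M / 20) / unbalanced
    n4 = (N / 2) / unbalanced
    # Assumption: powers apply to the whole product, as printed in Box 1.
    exp3 = (2 * E * (k * q) ** 2 + 2 * E ** 2 * l * q + 4 * (l * q) ** 3
            + 8 * E * (m * q) ** 2 + 6 * E ** 2 * m * q
            + 4 * E * (m * q) ** 3 + 2 * (m * q) ** 4 + 2 * (n4 * q) ** 3)
    return unbalanced, 1 - E ** 3, exp3


sd = pgd_sd()
rt = pgd_rt()
print("PGD-SD sensitivity:", round(1 - sd[2], 3))
print("PGD-RT sensitivity:", round(1 - rt[2], 3))
```

Under these assumptions the sketch gives sensitivities of about 0.999 and 0.997, in line with the values reported for the two models.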


For sex determination, it is recommended that the FISH test contain at least one probe specific for each centromere region of the X and Y chromosomes and one autosome to determine the ploidy of the nucleus, and diagnosis using only one biopsied cell with a clearly visible single nucleus is acceptable using FISH (Thornhill et al., 2005). The general principle and recommended practice for all chromosome rearrangements are that the test should include sufficient probes to detect all the unbalanced segregation products of the rearrangement and, if testing only one cell, to have two probes that are diagnostic for the chromosome imbalance associated with segregation products that are likely to be frequent or have the potential to be viable (Thornhill et al., 2005). Two scoring errors are therefore required to misdiagnose such unbalanced products as having a normal or balanced chromosome complement, and typically six signals have to be scored accurately to diagnose a normal/balanced translocation chromosome complement. The recommended practice for preimplantation genetic screening (PGS) is to test at least five chromosomes from those commonly associated with spontaneous miscarriage or with the potential to be viable (Thornhill et al., 2005). Typically, PGS tests use one probe for each chromosome; therefore at least 10 signals have to be scored accurately to diagnose a normal chromosome complement.

Mathematical models were constructed for three different applications of PGT: (i) testing embryos for sporadic chromosome aneuploidy to improve reproductive efficiency (PGS) for any number of chromosome pairs greater than two (see Note 1); (ii) preimplantation genetic diagnosis (PGD) for sex-linked diseases using sex determination (see Note 2) and transfer of female embryos (PGD–SD), testing two pairs of chromosomes (e.g. the sex chromosomes and an autosome) and (iii) PGD of chromosome imbalance associated with two-way reciprocal translocations (PGD–RT) testing two pairs of chromosomes (a reciprocal exchange of terminal segments between chromosome A and chromosome B; see Note 3). Random sampling of a single cell from a genetically uniform embryo (no mosaicism) is assumed. The input values were the degree of aneuploidy and the accuracy per probe, defined as the proportion of normal diploid nuclei with a normal signal pattern. It was assumed that each probe had the same binding and scoring efficiency and, for the PGS and PGD–SD models, that each chromosome pair tested had the same degree of aneuploidy.

The primary statistics used to compare the different applications of PGT were the positive predictive value (the probability that an abnormal test result is correct) and the negative predictive value (the probability that a normal test result is correct). Other statistics calculated were: proportion of true positives, true negatives, false positives, false negatives (all using the test perspective and calculated as the proportion of the total outcomes), overall accuracy (the proportion of all test results that are correct), sensitivity (the probability that a nontransferable genotype has an abnormal test result), specificity (the probability that a transferable genotype has a normal test result), likelihood ratio of a positive result (how much more likely it is to get an abnormal test result in the male or aneuploid group than in the transferable genotype group), negative likelihood ratio (how much more likely it is to get a normal test result in the male or aneuploid group than in the transferable genotype group) and diagnostic odds ratio (an overall measure of the power of the test to discriminate between non-transferable and transferable genotypes). Details of the calculations are given in Boxes 1 and 2 (see Supplementary data for flexible versions of the models and plots).


Box 2. Diagnostic accuracy calculations

| Test | Outcome: Abnormal | Outcome: Normal | Total |
|---|---|---|---|
| Abnormal | a (exp1 − (exp1.exp3)) | b (exp2(1 − exp1)) | c |
| Normal | d (exp1.exp3) | e ((1 − exp1) − exp2(1 − exp1)) | f |
| Total | g (exp1) | h (1 − exp1) | i |

| Measure | Calculation |
|---|---|
| True positive | a/i |
| True negative | e/i |
| False positive | b/i (see Note 4) |
| False negative | d/i (see Note 4) |
| Accuracy | (a + e)/i |
| Sensitivity | a/g |
| Specificity | e/h |
| Positive likelihood | (a/g)/(b/h) |
| Negative likelihood | (d/g)/(e/h) |
| Diagnostic odds ratio | (a e)/(d b) |
| Positive predictive value | a/c |
| Negative predictive value | e/f |
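The Box 2 calculations translate directly into code. The sketch below is a minimal Python version with a hypothetical function name; it assumes that c, f, g, h and i are the row, column and grand totals of the 2 × 2 table. The example inputs are the rounded PGS figures quoted in the Results (14.1% prevalence, 22.6% of normal embryos with an abnormal result, 4.7% of aneuploid embryos with a normal result).

```python
def diagnostic_measures(exp1, exp2, exp3):
    """Box 2 measures from the three model proportions."""
    a = exp1 - exp1 * exp3               # abnormal outcome, abnormal test
    b = exp2 * (1 - exp1)                # normal outcome, abnormal test
    d = exp1 * exp3                      # abnormal outcome, normal test
    e = (1 - exp1) - exp2 * (1 - exp1)   # normal outcome, normal test
    # Assumption: c, f, g, h and i are the marginal and grand totals.
    c, f, g, h = a + b, d + e, a + d, b + e
    i = g + h
    return {
        "true_positive": a / i,
        "true_negative": e / i,
        "false_positive": b / i,
        "false_negative": d / i,
        "accuracy": (a + e) / i,
        "sensitivity": a / g,
        "specificity": e / h,
        "positive_likelihood": (a / g) / (b / h),
        "negative_likelihood": (d / g) / (e / h),
        "diagnostic_odds_ratio": (a * e) / (d * b),
        "ppv": a / c,
        "npv": e / f,
    }


# Rounded PGS inputs from the Results: prevalence, false-positive rate
# among normal embryos, false-negative rate among aneuploid embryos.
measures = diagnostic_measures(0.141, 0.226, 0.047)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Because the inputs are rounded, the outputs agree with the published PGS figures only approximately (for example, a positive predictive value near 0.41).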

An initial input value of 95% accuracy per probe was used for all the models. For the PGS model, initial input values of five chromosome pairs tested and 3% aneuploidy per chromosome were used, based on the recommended practice of testing at least five chromosomes (Thornhill et al., 2005), and the aneuploidy frequencies in male and female gametes over 37 years of age (Shi and Martin, 2000; Pellestor et al., 2003). For the sex-determination model, a 1:1 gamete sex ratio and 1% aneuploidy per chromosome were assumed. For the reciprocal translocation model, 32 segregation products were considered (Scriven et al., 1998): 43.7% alternate segregation (2 products), 30.7% adjacent-1 segregation (2 products), 10.9% adjacent-2 segregation (6 products), 14.2% 3:1 segregation (20 products) and 0.5% 4:0 segregation (2 products); the proportions were based on pooled male and female reciprocal translocation segregation data from cleavage-stage embryos (Mackie Ogilvie and Scriven, 2002 and unpublished data).

In a sensitivity analysis for PGS, we varied the number of chromosomes tested, the accuracy per probe and the chromosome aneuploidy rate. The resulting diagnostic accuracy was compared with that of the PGD–SD and PGD–RT models.
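As a quick check on these inputs, the following Python sketch converts a per-chromosome aneuploidy rate into a prevalence using the Box 1 expression 1 − (1 − A)^n, and confirms that the stated segregation proportions and product counts add up. The variable and function names are hypothetical.

```python
def prevalence(A, n):
    """Proportion of embryos aneuploid for at least one of n tested pairs,
    assuming each pair segregates independently (Box 1, exp1 for PGS)."""
    return 1 - (1 - A) ** n


# Segregation mode -> (proportion of embryos, number of products).
segregation = {
    "alternate": (0.437, 2),
    "adjacent-1": (0.307, 2),
    "adjacent-2": (0.109, 6),
    "3:1": (0.142, 20),
    "4:0": (0.005, 2),
}

total_share = sum(share for share, _ in segregation.values())
total_products = sum(count for _, count in segregation.values())

print(round(prevalence(0.03, 5), 3))    # five pairs at 3% per chromosome
print(round(prevalence(0.03, 23), 3))   # 23 pairs at 3% per chromosome
print(round(total_share, 3), total_products)
```

The two prevalences come out at about 14.1% and 50.4%, and the segregation inputs sum to 100% across 32 products.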

Results

Table I presents the model analytical performance statistics for the different applications of PGT.

The PGD–SD model predicted that per 1000 embryos tested, fewer than one male or aneuploid embryo would be incorrectly diagnosed to have a normal female chromosome complement (0.1% of male or aneuploid embryos) and 48 embryos with a normal female complement would be incorrectly diagnosed to have a male or aneuploid chromosome complement (9.8% of normal female embryos). The test had 95.2% overall accuracy (95% CI: 93.8–96.5%), 99.9% sensitivity (95% CI: 99.6–100%), 90.3% specificity (95% CI: 87.6–92.9%), 91.4% positive predictive value (95% CI: 89.1–93.7%) and 99.9% negative predictive value (95% CI: 99.6–100%). The positive and negative likelihood ratios were 10.2 (95% CI: 9.4–11.2) and 0.001 (95% CI: 0.001–0.003), respectively, and the diagnostic odds ratio was 8057 (95% CI: 1481–14 639).

The PGD–RT model predicted that per 1000 embryos tested, fewer than two unbalanced embryos would be incorrectly diagnosed to have a normal or balanced chromosome complement (0.3% of unbalanced embryos) and 62 embryos with a normal/balanced complement would be diagnosed incorrectly to have an unbalanced chromosome complement (14.3% of normal/balanced embryos). The test had 93.6% overall accuracy (95% CI: 92.1–95.1%), 99.7% sensitivity (95% CI: 99.2–100%), 85.7% specificity (95% CI: 82.5–89%), 90% positive predictive value (95% CI: 87.7–92.4%) and 99.5% negative predictive value (95% CI: 99–100%). The positive and negative likelihood ratios were 7.0 (95% CI: 6.5–7.5) and 0.004 (95% CI: 0.002–0.004), respectively, and the diagnostic odds ratio was 1842 (95% CI: 983–2702).
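The text does not state how the 95% confidence intervals were derived. Several of them appear consistent with a normal-approximation (Wald) interval for a proportion on a notional sample of 1000 embryos; the Python sketch below illustrates that reading for two PGD–SD figures. Both the method and the denominators used here (about 490 normal female embryos and about 557 abnormal test results per 1000) are assumptions, not the authors' stated procedure.

```python
import math


def wald_interval(p, n, z=1.96):
    """Normal-approximation interval for a proportion p observed in n cases."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Specificity: 1 - exp2 = 0.9025 among ~490 normal female embryos per 1000.
spec_lo, spec_hi = wald_interval(0.9025, 490)

# Positive predictive value: 0.914 among ~557 abnormal results per 1000.
ppv_lo, ppv_hi = wald_interval(0.914, 557)

print(f"specificity: {spec_lo:.3f} to {spec_hi:.3f}")
print(f"positive predictive value: {ppv_lo:.3f} to {ppv_hi:.3f}")
```

Under these assumptions the intervals come out close to the reported 87.6–92.9% and 89.1–93.7%.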
The PGS model (testing five chromosome pairs, 3% aneuploidy per chromosome, equivalent to 14.1% prevalence) predicted that per 1000 embryos tested, fewer than seven aneuploid embryos would be incorrectly diagnosed to have a normal chromosome complement (4.7% of aneuploid embryos) and 194 embryos with a normal chromosome complement would be diagnosed incorrectly to have an aneuploid chromosome complement (22.6% of normal embryos). The test had 79.9% accuracy (95% CI: 77.4–82.4%), 95.3% sensitivity (95% CI: 91.8–98.8%), 77.4% specificity (95% CI: 74.6–80.2%), 99.0% negative predictive value (95% CI: 98.0–100%) and 40.9% positive predictive accuracy (95% CI: 35.6–46.2%). The positive and negative likelihood ratios were 4.2 (95% CI: 4–4.4) and 0.06 (95% CI: 0.05–0.08), respectively, and the diagnostic odds ratio was 69 (95% CI: 52–87). However, a 90% positive predictive value would be achieved under similar conditions but with an aneuploidy rate per chromosome of 20.3% (99.5% prevalence for 23 chromosome pairs) instead of 3%, or an accuracy per probe of 99.6% instead of 95%.

Assuming 3% aneuploidy per chromosome and changing the number of chromosomes tested to 23 pairs (50.4% prevalence), the sensitivity and specificity were 96.5% (95% CI: 94.9–98.1%) and 30.7% (95% CI: 26.7–34.8%), and the positive and negative predictive values were 58.6% (95% CI: 55.2–61.9%) and 89.6% (95% CI: 87.5–91.7%). Changing the aneuploidy rate per chromosome to 8.3% (86.4% prevalence) or the accuracy per probe to 99.5% increased the positive predictive accuracy to 90%.

Figures 1 and 2 show the different PGT applications and the effect of probe efficiency on the predictive values; the plots show that the performance of the PGD–SD and PGD–RT applications is more robust than that of PGS because the predictive value is less affected by reduced probe efficiency.

Box 1. Model calculations (continued)

Sporadic chromosome aneuploidy (PGS)
The proportion of embryos with aneuploidy for at least one of the chromosome pairs tested:

$$\mathrm{exp1} = 1 - (1-A)^n$$

The proportion of normal embryos that have an abnormal (positive) test result for the chromosomes tested:

$$\mathrm{exp2} = 1 - E^n$$

The proportion of embryos with an abnormal chromosome complement that have a normal (negative) test result for the chromosomes tested (see Note 1):

$$\mathrm{exp3} = \frac{A\,n\,(1-E)(1-A)^{n-1} + (1-E)^2\left[\frac{A^2\,n!\,(1-A)^{n-2}}{2\,(n-2)!}\right] + (1-E)^3\left[\frac{A^3\,n!\,(1-A)^{n-3}}{6\,(n-3)!}\right]}{1-(1-A)^n}$$

Output statistics for the chromosomes tested
Specificity: the proportion of embryos with a normal/balanced/female chromosome complement that have a correct normal test result:

$$1 - \mathrm{exp2}$$

Sensitivity: the proportion of embryos with an abnormal/male chromosome complement that have a correct abnormal test result:

$$1 - \mathrm{exp3}$$

Positive predictive accuracy: the proportion of abnormal/male (positive) test results that are correct:

$$\frac{\mathrm{exp1} - \mathrm{exp1}\cdot\mathrm{exp3}}{(\mathrm{exp1} - \mathrm{exp1}\cdot\mathrm{exp3}) + \mathrm{exp2}(1-\mathrm{exp1})}$$

Negative predictive accuracy: the proportion of normal/balanced/female (negative) test results that are correct:

$$\frac{(1-\mathrm{exp1}) - \mathrm{exp2}(1-\mathrm{exp1})}{(1-\mathrm{exp1}) - \mathrm{exp2}(1-\mathrm{exp1}) + \mathrm{exp1}\cdot\mathrm{exp3}}$$

Figure 1 Plot showing the effect of the accuracy per probe on the negative predictive value of the test.

| Measure | PGD–SD: n = 2, A = 1%, E = 95% | PGD–RT: n = 2, E = 95% | PGS: n = 5, A = 3%, E = 95% | PGS: n = 5, A = 20.3%, E = 95% | PGS: n = 5, A = 3%, E = 99.635% | PGS: n = 23, A = 3%, E = 95% | PGS: n = 23, A = 8.3%, E = 95% | PGS: n = 23, A = 3%, E = 99.485% |
|---|---|---|---|---|---|---|---|---|
| True positives | 0.509 | 0.561 | 0.135 | 0.658 | 0.141 | 0.486 | 0.849 | 0.502 |
| False positives | 0.048 | 0.062 | 0.194 | 0.073 | 0.016 | 0.344 | 0.094 | 0.056 |
| True negatives | 0.442 | 0.375 | 0.664 | 0.249 | 0.843 | 0.153 | 0.042 | 0.441 |
| False negatives | 0.001 | 0.002 | 0.007 | 0.020 | 0.000 | 0.018 | 0.014 | 0.002 |
| Overall accuracy | 0.952 | 0.936 | 0.799 | 0.907 | 0.984 | 0.639 | 0.891 | 0.943 |
| Sensitivity | 0.999 | 0.997 | 0.953 | 0.970 | 0.997 | 0.965 | 0.984 | 0.996 |
| Specificity | 0.903 | 0.857 | 0.774 | 0.774 | 0.982 | 0.307 | 0.307 | 0.888 |
| Positive likelihood | 10.245 | 6.989 | 4.213 | 4.287 | 55.007 | 1.393 | 1.420 | 8.898 |
| Negative likelihood | 0.001 | 0.004 | 0.061 | 0.039 | 0.003 | 0.114 | 0.054 | 0.004 |
| Diagnostic odds ratio | 8057 | 1842 | 69 | 110 | 15 741 | 12 | 27 | 2189 |
| Prevalence | 0.510 | 0.563 | 0.141 | 0.678 | 0.141 | 0.504 | 0.864 | 0.504 |
| Positive predictive value | 0.914 | 0.900 | 0.409 | 0.900 | 0.900 | 0.586 | 0.900 | 0.900 |
| Negative predictive value | 0.999 | 0.995 | 0.990 | 0.924 | 0.999 | 0.896 | 0.747 | 0.996 |

SD, sex determination; RT, reciprocal translocation; n, number of chromosome pairs tested; A, aneuploidy rate per chromosome pair tested; E, accuracy per probe.

Table I Diagnostic measures for the different applications of PGT.

Discussion

The objective of this study was to model and compare the accuracy of the FISH technique in testing a single nucleus from a preimplantation embryo without the complicating effect of mosaicism. Mathematical models were constructed for three different applications of FISH in PGT (sex determination for sex-linked diseases, two-way reciprocal translocations and sporadic chromosome aneuploidy). The input values were the degree of aneuploidy (initially set at 3% per chromosome for sporadic aneuploidy) and the accuracy per probe (initially set at 95%), defined as the proportion of normal diploid nuclei with a normal signal pattern. The primary statistic was the predictive value of the test result. Testing two chromosome pairs to determine sex chromosome status or detect unbalanced translocation products had high predictive value: at least 99.5% for a normal test result (95% CI: 99–100%), and 90% for an abnormal test result (95% CI: 88–92%). However, the predictive value of an abnormal test result testing five chromosomes for sporadic chromosome aneuploidy was 41% (95% CI: 36–46%); 90% would be achieved with an aneuploidy rate per chromosome of 20.3% (equivalent to 99.5% prevalence for 23 chromosomes) rather than 3%, or with an accuracy per probe of 99.6% rather than 95%, or when testing 23 chromosome pairs, rather than 5 pairs, with either 8.3% aneuploidy (86.4% prevalence) instead of 3% or 99.5% accuracy per probe instead of 95%.

Measuring the diagnostic accuracy of PGT in practice is complicated by the nature of the early human embryo and, in a conventional accuracy study comparing test results with a clinical reference standard, is confounded by incomplete results (most embryos transferred fail to implant). In practice, the most appropriate measure of diagnostic accuracy is likely to be the positive predictive value because all the abnormal test results have the possibility of being available for confirmation of diagnosis studies. The models presented in this study investigate the effect of changing parameters that are accessible in practice (i.e. the accuracy of scoring FISH signals in diploid cells, and the amount of chromosome aneuploidy).

The FISH technique has limitations, particularly when only one cell is available for testing. However, targeted testing of two chromosome pairs for sex determination means that only four FISH signals have to be scored accurately to diagnose a normal female chromosome complement, and in order to mis-score a normal male chromosome complement as a normal female chromosome complement two errors are required: failure to detect the Y chromosome signal and inaccurately scoring one X signal as two signals. Applying this strategy to testing a single nucleus means that for 98% of non-transferable products (male and/or aneuploid) and 50% of the total products, at least two scoring errors are required for them to be misdiagnosed as having a normal female chromosome complement. Similarly, for two-way reciprocal translocations involving two pairs of chromosomes, six signals have to be scored accurately to diagnose a normal/balanced chromosome complement, and for 86% of unbalanced products and 48% of the total products, at least two scoring errors are required for them to be misdiagnosed as having a normal/balanced chromosome complement. As a consequence, for both the PGD–SD and PGD–RT models, the predictive value of a normal test result was very high (>99%) and the predictive value of an abnormal test result was acceptable (>80%) even when the probe accuracy was as low as 90%.

The PGS test typically uses one probe for each chromosome; when testing for five chromosomes, 10 signals have to be scored accurately to diagnose a normal chromosome complement. Assuming 3% aneuploidy per chromosome, 14% of embryos are expected to have aneuploidy for at least one of the chromosomes tested (a relatively low prevalence compared with PGD–SD and PGD–RT) and only 6% of those have aneuploidy for more than one chromosome and therefore require at least two scoring errors to be misdiagnosed as having a normal complement. As a consequence, poor probe accuracy had a more adverse effect on the diagnostic performance of PGS than on that of PGD–SD or PGD–RT. The combination of the limitations of the FISH technique and the relatively low prevalence of chromosome aneuploidy per chromosome pair tested resulted in unacceptably low accuracy, and low positive predictive values (<50%, lower than flipping a coin).

Our PGS model also allowed us to change the number of chromosome pairs tested. Given the limited number of different fluorochromes available and the diminishing efficiency of re-hybridizing the same nucleus multiple times, it is not practicable to test all 23 chromosome pairs using the conventional FISH technique or to achieve 99.5% accuracy per probe. Emerging techniques like comparative genomic hybridization are able to test for the copy number of every chromosome (Wilton et al., 2001; Wells et al., 2008), but it remains uncertain if the accuracy required will be achieved.
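The PGS figures discussed above can be explored with a small Python sketch of the Box 1 PGS expressions. It is a sketch under stated assumptions: the function names are hypothetical, the second and third terms of exp3 are read as binomial terms (exactly two or three aneuploid pairs, all mis-scored), and the probe accuracy needed for a given positive predictive value is found by simple bisection rather than by any method described in the paper. Results may differ from the published figures in the third decimal place.

```python
from math import comb


def pgs(n=5, A=0.03, E=0.95):
    """Box 1 PGS expressions (exp1, exp2, exp3), truncated at three pairs."""
    exp1 = 1 - (1 - A) ** n
    exp2 = 1 - E ** n
    # Assumption: term k is P(exactly k aneuploid pairs) * (1 - E)**k.
    missed = sum((1 - E) ** k * comb(n, k) * A ** k * (1 - A) ** (n - k)
                 for k in (1, 2, 3))
    return exp1, exp2, missed / exp1


def ppv(n, A, E):
    """Positive predictive value of the PGS model."""
    exp1, exp2, exp3 = pgs(n, A, E)
    true_pos = exp1 * (1 - exp3)
    false_pos = exp2 * (1 - exp1)
    return true_pos / (true_pos + false_pos)


def share_multiple(n=5, A=0.03):
    """Among embryos aneuploid for a tested pair, the share with more than one."""
    exp1 = 1 - (1 - A) ** n
    exactly_one = n * A * (1 - A) ** (n - 1)
    return (exp1 - exactly_one) / exp1


def accuracy_for_ppv(target, n, A, lo=0.5, hi=1.0, steps=60):
    """Bisection on E; relies on the PPV rising with E between lo and hi."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if ppv(n, A, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(pgs()[0], 2))                 # prevalence for five pairs at 3%
print(round(share_multiple(), 2))         # share needing two or more errors
print(round(ppv(5, 0.03, 0.95), 3))       # positive predictive value
needed = accuracy_for_ppv(0.90, 5, 0.03)  # probe accuracy for 90% PPV
print(round(needed, 4))
```

Under these assumptions the sketch gives a prevalence of about 14%, about 6% of affected embryos with more than one aneuploid pair, a positive predictive value near 0.41, and a required probe accuracy of roughly 99.6%.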
The use of single nucleotide polymorphisms arrays and quantitative analysis (Wells et al., 2008) or ‘karyomapping’ genotyping (Handyside et al., 2009) are promising alternative techniques to detect chromosome aneuploidy and are claimed to have the potential for high accuracy. The challenge for PGT is to achieve excellent test performance for routine clinical use where only one or two cells are available for testing. Perfect diagnostic accuracy will not be possible and there must be a balance of priorities between minimizing the risk of misdiagnosis and providing the couple with a good chance of a successful pregnancy. For couples with sex-linked diseases, who are typically fertile and do not need assisted conception as such, the priority must be to minimize the risk of an affected viable pregnancy following testing (near-perfect test sensitivity and negative predictive accuracy), with some wastage of unaffected embryos a secondary concern, but the test should still have acceptable specificity and positive predictive value to produce a realistic chance of success. PGS couples typically need assisted conception and achieving a pregnancy should be the priority, which means correctly identifying normal embryos as well as excluding abnormal embryos; the treatment will be unsuccessful if there are none or too few embryos available for transfer. The priorities for chromosome rearrangements will be determined by a combination of the viability risk, obstetric history and the fertility status of the couple, and the range between translocations with a high risk of viable unbalanced products and a history of termination of pregnancy and live born children with chromosome imbalance, and those with a negligible viability risk and a history of infertility requiring ART. The FISH technique can be used to produce an effective diagnostic test for chromosome aneuploidy and routine clinical use. Relevant here is the multi-centre, blinded, controlled comparative prenatal

Downloaded from http://humrep.oxfordjournals.org/ by guest on December 31, 2015

Figure 2 Plot showing the effect of the accuracy per probe on the positive predictive value of the test.

Scriven and Bossuyt

2627

Diagnostic accuracy of single cell FISH

theoretical study provides insight into the fundamental relationship of prevalence to the accuracy of testing a single cell using the FISH technique. Given the inherent limitations of the method, PGS is a poor test because the prevalence of sporadic aneuploidy is too low. Reports of retrospective studies of diagnostic performance have been published by some groups (e.g. Munne et al., 1998; Silber et al., 2003; Michiels et al., 2006; DeUgarte et al., 2008). Typically these studies involve the re-analysis of embryos that were not transferred or cryopreserved by analysing the nuclei released following lysis of the whole embryo. The results obtained by this approach are therefore expected to be complicated by the inherent technical limitations of the FISH technique and cells that have multiple nuclei or nuclear fragments, polyploidy and chromosomal mosaicism, of which the significance to the clinical outcome of the embryo may be uncertain. It is also necessary to have a criterion about the proportion of nuclei with the same genotype (50% or greater is typical) to assign the outcome of each embryo; however, this is arbitrary and the cut-off affects the measures used to assess the effectiveness of the test. For example, in one study changing the cut-off from 50 to 80% changed the predictive value of an abnormal result from 83% (95% CI: 77 – 88%) to 88% (95% CI: 83 –92%) and a normal result from 81% (95% CI: 67– 92%) to 56% (95% CI: 40 –71%) (DeUgarte et al., 2008). In conclusion, despite the clear limitations of the FISH technique applied to testing a single cell, testing two chromosome pairs to determine the sex chromosome status or detect unbalanced products of translocations has the potential to acheive high accuracy, with the predictive value of a normal test result approaching 100% and of an abnormal result significantly .50%, and to achieve acceptable clinical test performance. 
However, using FISH to detect sporadic chromosome aneuploidy in preimplantation embryos will in most people’s hands result in a test with unacceptably low positive predictive value (,50%).

Notes Note 1 (for PGS) It is assumed that each chromosome pair segregates independently. The probability of aneuploidy for more than one chromosome is dependent on the aneuploidy rate; however, the test need only detect aneuploidy for at least one chromosome to be a true abnormal result.

Note 2 (for PGD – SD) It is assumed that each chromosome pair segregates independently. Eighteen permutations of sex chromosome and autosome are considered where, in addition to accurately scoring an XX complement where applicable, one (five products, e.g. XX,18), two (seven products, e.g. XXX,18), three (three products, e.g. XY,18) or four scoring errors (two products, e.g. XYY,18) could result in scoring an aneuploid female or male chromosome complement as a normal female chromosome complement.

Note 3 (for PGD – RT) It is assumed that a probe is used for each of the two translocated segments and one of the centric segments of chromosome A and B. In addition to accurately scoring a normal copy number where applicable, 30 unbalanced segregation products with different permutations

Downloaded from http://humrep.oxfordjournals.org/ by guest on December 31, 2015

diagnosis study of interphase FISH analysis and G-band karyotyping carried out using 1364 amniotic fluid samples from 31 laboratories tested with AneuVysion EC (Vysis, Inc., Downers Grove, IL, USA, 1997). Although the proportions of diploid nuclei from normal samples scored to have two signals for individual chromosomes tested ranged only from 94 to 97%, examining 50 interphase nuclei for each sample and using cut-off points of 60% of nuclei with an abnormal signal pattern for a true abnormal test result and 90% of nuclei with a normal signal pattern for a true normal test result resulted in perfect performance; the test sensitivity was 100% (95% CI: 99.4– 100%) and the specificity was 100% (95% CI: 99.1–100%). A retrospective case review of a further 5197 pregnancies tested with the AneuVysion assay, where 10.9% pregnancies had a detectable chromosome abnormality, found the sensitivity and specificity of the test to be 99.7% (95% CI: 99.2–100%) and 99.98% (95% CI: 99.9–100%), and the positive and negative predictive values to be 99.8% (95% CI: 99.5–100%) and 99.96% (95% CI: 99.8–100%), respectively, (Tepperberg et al., 2001). The priorities for prenatal diagnosis are clear, nearperfect sensitivity and specificity: it is essential to minimize the risk of a false normal test result (which would result in failure to detect an abnormal pregnancy and negate the purpose of the test), and also to minimize the risk of a false abnormal test result (which could lead to the termination of an unaffected pregnancy). Chromosomal mosaicism is rarely encountered at prenatal diagnosis (1–2%) compared with cleavage-stage embryos (30–56% for 4–9 chromosome pairs, and 67% for 23 chromosome pairs, Los et al., 2004). 
The models we have presented represent the PGD principle, which is that a cell sampled from an embryo is representative of the embryo; however, it is logical to conclude that in practice the diagnostic accuracy of PGT will be adversely affected by mosaicism, with a greater negative effect predicted when testing 5 or more chromosome pairs (PGS) compared with 2 chromosome pairs (PGD–SD and PGD–RT). After more than a decade of clinical practice and tens of thousands of cycles, there has been no standardization or validation of PGT applications using FISH. In theory, excluding aneuploid embryos from transfer should increase the implantation rate and reduce the miscarriage rate; however, the practice of testing for sporadic chromosome aneuploidy has proved to be highly controversial. The Practice Committee of the Society for Assisted Reproductive Technology and the Practice Committee of the American Society for Reproductive Medicine (2008) concluded that the available evidence does not support the use of PGS as currently practised to improve live birth rates in patients with advanced maternal age, recurrent implantation failure or recurrent pregnancy loss, or to reduce the miscarriage rate in patients with recurrent aneuploid pregnancy loss. Multiple randomized controlled trials have emerged and each showed a negative effect of PGS; meta-analysis of trial data for the indication of advanced maternal age (Mastenbroek et al., 2008) showed a significant reduction of ongoing pregnancies (odds ratio 0.56, 95% CI: 0.42–0.76).
Criticism has highlighted the diversity of approaches to PGS in practice, including patient groups tested, criteria for the number of oocytes and embryos required, the stage of testing, biopsy and cell-spreading techniques, the number of cells to use, how many and which chromosomes to test, the use of secondary probes for ‘no result rescue’ or to confirm trisomy results (also subject to error), and how many embryos to transfer (Cohen et al., 2007; Colls et al., 2007; Munné et al., 2007; Cohen and Grifo, 2007; Simpson, 2008; Mir et al., 2010). We believe our

of centric and translocated segments are considered, where one (8 products, e.g. A,A,der(A),B consistent with adjacent-2 segregation), two (10 products, e.g. A,der(A),B,B consistent with adjacent-1 segregation), three (10 products, e.g. A,der(A),der(A),B consistent with adjacent-2 segregation following a crossover in the interstitial segment of chromosome A) or four scoring errors (2 products, e.g. A,der(A),B,B,B consistent with 3:1 segregation following a crossover in the interstitial segment of chromosome B) could result in scoring an unbalanced product as normal/balanced.

Note 4 The test perspective is used and calculated as the proportion of the total outcomes and not as a proportion of the normal or abnormal outcomes.

Supplementary data
Supplementary data are available at http://humrep.oxfordjournals.org/.

References
Bossuyt PM. Interpreting diagnostic test accuracy studies. Semin Hematol 2008;45:189–195.
Cohen J, Grifo JA. Multicentre trial of preimplantation genetic screening reported in the New England Journal of Medicine: an in-depth look at the findings. Reprod Biomed Online 2007;15:365–366.
Cohen J, Wells D, Munné S. Removal of 2 cells from cleavage stage embryos is likely to reduce the efficacy of chromosomal tests that are used to enhance implantation rates. Fertil Steril 2007;87:496–503.
Colls P, Escudero T, Cekleniak N, Sadowy S, Cohen J, Munné S. Increased efficiency of preimplantation genetic diagnosis for infertility using “no result rescue”. Fertil Steril 2007;88:53–61.
DeUgarte CM, Li M, Surrey M, Danzer H, Hill D, DeCherney AH. Accuracy of FISH analysis in predicting chromosomal status in patients undergoing preimplantation genetic diagnosis. Fertil Steril 2008;90:1049–1054.
Goossens V, Harton G, Moutou C, Traeger-Synodinos J, Van Rij M, Harper JC. ESHRE PGD Consortium data collection IX: cycles from January to December 2006 with pregnancy follow-up to October 2007. Hum Reprod 2009;24:1786–1810.
Grimes DA, Schulz KF. Uses and abuses of screening tests. Lancet 2002;359:881–884.
Handyside AH, Harton GL, Mariani B, Thornhill AR, Affara NA, Shaw MA, Griffin DK. Karyomapping: a universal method for genome wide analysis of genetic disease based on mapping crossovers between parental haplotypes. J Med Genet 2009; (October 25, epub ahead of print).
Los FJ, Van Opstal D, van den Berg C. The development of cytogenetically normal, abnormal and mosaic embryos: a theoretical model. Hum Reprod Update 2004;10:79–94.
Mackie Ogilvie C, Scriven PN. Meiotic outcomes in reciprocal translocation carriers ascertained in 3-day human embryos. Eur J Hum Genet 2002;10:801–806.
Mastenbroek S, Scriven P, Twisk M, Viville S, Van der Veen F, Repping S. What next for preimplantation genetic screening? More randomized controlled trials needed? Hum Reprod 2008;23:2626–2628.
Michiels A, Van Assche E, Liebaers I, Van Steirteghem A, Staessen C. The analysis of one or two blastomeres for PGD using fluorescence in-situ hybridization. Hum Reprod 2006;21:2396–2402.
Mir P, Rodrigo L, Mateu E, Peinado V, Milán M, Mercader A, Buendía P, Delgado A, Pellicer A, Remohí J et al. Improving FISH diagnosis for preimplantation genetic aneuploidy screening. Hum Reprod 2010;24:1812–1817.
Munné S, Cohen J. Chromosome abnormalities in human embryos. Hum Reprod Update 1998;4:842–855.
Munné S, Magli C, Bahçe M, Fung J, Legator M, Morrison L, Cohert J, Gianaroli L. Preimplantation diagnosis of the aneuploidies most commonly found in spontaneous abortions and live births: XY, 13, 14, 15, 16, 18, 21, 22. Prenat Diagn 1998;18:1459–1466.
Munné S, Gianaroli L, Tur-Kaspa I, Magli C, Sandalinas M, Grifo J, Cram D, Kahraman S, Verlinsky Y, Simpson JL. Substandard application of preimplantation genetic screening may interfere with its clinical success. Fertil Steril 2007;88:781–784.
Pellestor F, Andréo B, Arnal F, Humeau C, Demaille J. Maternal aging and chromosomal abnormalities: new data drawn from in vitro unfertilized human oocytes. Hum Genet 2003;112:195–203.
Practice Committee of Society for Assisted Reproductive Technology; Practice Committee of American Society for Reproductive Medicine. Preimplantation genetic testing: a Practice Committee opinion. Fertil Steril 2008;90:S136–S143.
Scriven PN, Handyside AH, Ogilvie CM. Chromosome translocations: segregation modes and strategies for preimplantation genetic diagnosis. Prenat Diagn 1998;18:1437–1449.
Shi Q, Martin RH. Aneuploidy in human sperm: a review of the frequency and distribution of aneuploidy, effects of donor age and lifestyle factors. Cytogenet Cell Genet 2000;90:219–226.
Silber S, Escudero T, Lenahan K, Abdelhadi I, Kilani Z, Munné S. Chromosomal abnormalities in embryos derived from testicular sperm extraction. Fertil Steril 2003;79:30–38.
Simpson JL. Randomized clinical trial in assessing PGS: necessary but not sufficient. Hum Reprod 2008;23:2179–2181.
Tepperberg J, Pettenati MJ, Rao PN, Lese CM, Rita D, Wyandt H, Gersen S, White B, Schoonmaker MM. Prenatal diagnosis using interphase fluorescence in situ hybridization (FISH): 2-year multi-center retrospective study and review of the literature. Prenat Diagn 2001;21:293–301.
Thornhill AR, deDie-Smulders CE, Geraedts JP, Harper JC, Harton GL, Lavery SA, Moutou C, Robinson MD, Schmutzler AG, Scriven PN et al. ESHRE PGD consortium ‘Best practice guidelines for clinical preimplantation genetic diagnosis (PGD) and preimplantation genetic screening (PGS)’. Hum Reprod 2005;20:35–48.
Wells D, Alfarawati S, Fragouli E. Use of comprehensive chromosomal screening for embryo assessment: microarrays and CGH. Mol Hum Reprod 2008;14:703–710.
Wilton L, Williamson R, McBain J, Edgar D, Voullaire L. Birth of a healthy infant after preimplantation confirmation of euploidy by comparative genomic hybridization. N Engl J Med 2001;345:1537–1541.

Scriven and Bossuyt