Deoxynucleotides can replace dideoxynucleotides in ... - BioTechniques

RESEARCH REPORT

Deoxynucleotides can replace dideoxynucleotides in minisequencing by arrayed primer extension

Scott J. Tebbutt, Gareth D. Mercer, Ron Do, Ben W. Tripp, Alyson W.M. Wong, and Jian Ruan

BioTechniques 40:331-338 (March 2006) doi 10.2144/000112111

Scientific literature describing arrayed primer extension and other array-based minisequencing technologies consistently cites the requirement for four fluorescent dideoxynucleotides (with concomitant absence/inactivation of deoxynucleotides) to ensure single-base extension and thus sequence-specific intensity data that can be interpreted as a base call or genotype. We present compelling evidence that fluorescent deoxynucleotides can reliably be used in microarray minisequencing experiments, generating fluorescent sequence extension intensity profiles that are homologous to the single-base extensions obtained with terminator dideoxynucleotides. Many commercially available fluorescent dideoxynucleotides cost almost 10-fold more than fluorescent deoxynucleotides and offer a more limited choice of fluorophores, and they are subject to other potentially constraining intellectual property and licensing issues. This hitherto dismissed microarray chemistry therefore prompts an important reevaluation in the field of array-based genotyping and related enzymology.

INTRODUCTION

The completion of the Human Genome Project was an important initial step in the exploration of human diversity. However, it is the present worldwide search for individual variation in the human genome that promises to elucidate how genetic variation interacts with the environment to confer individual resistance or susceptibility to disease, responsiveness to medical interventions, and drug toxicity (1). The most common form of genetic variation between individuals is the single nucleotide polymorphism (SNP), a single-base change at a specific DNA site in the genome, occurring at a frequency of approximately one SNP every 200 to 1000 bp. Different combinations of SNPs in single or multiple genes interact with environmental factors to determine risk for disease as well as variability in how individuals respond to illness and medical therapy and whether they develop adverse drug responses. Research directed toward discovering gene-to-gene and gene-to-environment interaction in disease causation and clinical outcome is increasing at an exponential rate, and pharmacogenomics is often quoted as being poised for application to health care as “personalized medicine” (1,2).

Of the many methods that have been developed for genotyping, those based on the use of microarrays offer the greatest potential for economic, patient-specific application due to their ability to simultaneously interrogate multiple genetic markers (SNPs) using genetic material (template) amplified from an individual using PCR. Genotyping microarrays are devices displaying specific oligonucleotide probes (small lengths of synthetic DNA molecules), precisely located on a small-format solid support such as a glass slide. Although a number of different microarray genotyping chemistries exist (3-5), here we are concerned with arrayed primer extension (APEX) (6). APEX is a minisequencing (7) method based on a two-dimensional array of oligonucleotides, immobilized via their 5′ ends on a glass surface. The classical APEX probe oligonucleotides (from 15- to 25-mers) are designed so that they are complementary to the gene up to, but not including, the base where the SNP exists, although allele-specific (AS)-APEX oligonucleotide probes (where the 3′ base is the complement of the allelic site) can also be used (8).

The Sanger-based sequencing chemistry of APEX allows for genotyping of hundreds to a few thousand SNPs, with the array chemistry taking just minutes. APEX achieves this clinically relevant speed because it uses the catalytic ability of a DNA polymerase enzyme to carry out a single nucleotide base extension at the 3′ end of the arrayed oligonucleotide probes, specific to the SNP sites of interest in template DNA that is temporarily hybridized to these probes. Dideoxynucleotide “terminator” bases are prelabeled with fluorophores specific to each one of the four bases of DNA (A, C, G, T). Thus, the fluorescent “color” (wavelength of emitted light) at each of the probe sites (array spots) will give SNP-specific genotypic information. Scientific literature describing APEX-based and other array-based

University of British Columbia, Vancouver, BC, Canada Vol. 40, No. 3 (2006)

BioTechniques 331

Table 1. Comparative Analysis of Genotyping Call Rate and Accuracy Using ddNTPs or dNTPs

                                 87 SNPs (incl. Nonvalidated SNPs)   78 SNPs (Validated)
                                 ddNTPs (%)     dNTPs (%)            ddNTPs (%)     dNTPs (%)
Call Rate(a,b)                   99.9           99.8                 99.9           99.9
Accuracy Rate(c)                 N.A.           N.A.                 99.8           99.0
Heterozygote Accuracy Rate(d)    N.A.           N.A.                 >99.9          98.3

SNP, single nucleotide polymorphism; N.A., not applicable.
(a) The call rate was determined by calculating the percentage of actual genotype calls made relative to the total number of calls possible [87 SNPs × 13 samples or 78 SNPs × 13 samples (i.e., some calls would be “NN” for non-calls)].
(b) The call rate for the validated genotype data set was 97.6% [990/1014 (24 × NNs)].
(c) The accuracy rate was determined by the number of genotypes called that were exactly the same as those in the validated data set [total of 990 validated genotypes].
(d) The heterozygote accuracy rate was determined by the number of called heterozygote genotypes that were exactly the same as the heterozygote genotypes in the validated data set [total of 345 validated heterozygotes].

minisequencing technologies (7) consistently cites the requirement for four fluorescent dideoxynucleotides (ddNTPs), with concomitant absence/inactivation of deoxynucleotides (dNTPs), to ensure single-base extension and thus sequence-specific intensity data that can be interpreted as a base call or genotype (3,6). Based on this established dogma, we initially reasoned that if we used all four fluorescent dNTPs instead of ddNTPs, they would be incorporated as multiple extensions from each of the oligonucleotide probes, because dNTPs would not terminate the DNA polymerization after the initial base extension. For the classical APEX probes, we envisaged that color/base specificity would be lost due to the multiple extensions from the 3′ end of the probe. Thus, these types of probes would no longer be useful for genotyping. However, we predicted that specificity of extension would still be obtained from the allele-specific APEX probes (if an allele was present, the appropriate probe would undergo multiple extensions, rather than just a single extension, and would actually give us greater sensitivity/higher signal intensity). Overall, this design would allow for a reduction in the costs of genotyping, given the 10-fold higher prices of some ddNTP-conjugated fluorophores compared with dNTP-conjugated fluorophores. Instead, we have discovered that it matters little whether one uses four fluorescent dNTPs or four fluorescent ddNTPs: effective single-base extension is still obtained with either nucleotide reagent type.

MATERIALS AND METHODS

Coriell DNA Samples

We carried out microarray genotyping experiments for two sets of Coriell DNA samples (coriell.umdnj.edu) using a minisequencing APEX chip recently developed and validated in our laboratory (8). We initially genotyped a set of 12 African-American Coriell samples, generating preliminary data that showed unexpected results. We then genotyped a set of 13 Caucasian Coriell samples, using freshly printed and higher quality arrays for further testing.

PCR Amplification and Fragmentation

Multiplex PCR amplification was performed on Coriell genomic DNA samples as previously described (8). Briefly, each PCR was performed in a total volume of 25 μL containing 2.5 μL 10× PCR buffer [Tris-Cl, (NH4)2SO4, 15 mM MgCl2, pH 8.7], 3 mM MgCl2, 200 μM dNTPs without dTTP, 160 μM dTTP, 40 μM dUTP, 1.25 U HotStar Taq DNA polymerase (5 U/μL; Qiagen, Valencia, CA, USA), 1 μL 7.7 mM primer mixtures (8), and 10 ng genomic DNA. Incorporation of the dUTP allows for the amplified DNA to be enzymatically sheared by uracil N-glycosylase (UNG, InterScience, Troy, NY, USA) to produce a DNA size of about 100 bases, optimal for hybridization to the oligonucleotides on the microarray. PCRs were initiated by a 15-min polymerase activation step at 95°C and completed by a final 10-min extension step at 72°C. The PCR cycles were as follows: 25 cycles of 30 s denaturation at 95°C, 30 s annealing at 60°C, and 50 s extension at 72°C; followed by another 10 cycles of 30 s at 95°C, 30 s at 58°C, and 50 s at 72°C.

The multiplex PCR products were pooled for each individual Coriell sample and precipitated by adding 2.5 volumes of ice-cold 100% ethanol and 0.25 volumes of 10 M ammonium acetate solution. The mixture was centrifuged at 20,800× g at 4°C for 20 min. The supernatant was carefully removed, and the DNA pellet was washed with 400 μL of ice-cold 70% ethanol. The DNA pellet was then dissolved in 15 μL double-distilled water. The precipitated DNA was then fragmented, and unincorporated dNTPs were inactivated by digestion with 1 U UNG and 1 U shrimp alkaline phosphatase (SAP; Amersham Biosciences, Piscataway, NJ, USA) for 1 h at 37°C, followed by enzyme inactivation for 10 min at 95°C in a 20-μL reaction mixture containing 2 μL 10× digestion buffer [0.5 M Tris-HCl, 0.2 M (NH4)2SO4, pH 9.0].

Microarray Printing

Arrays were generously printed for us at the Prostate Centre Microarray Facility (University of British Columbia, Vancouver, BC, Canada). Briefly, the APEX and AS-APEX probe oligonucleotides (50 pmol/μL in 150 mM sodium phosphate print buffer, pH 8.5) were printed to specific grid positions on CodeLink™ Activated Microarray Slides (Amersham Biosciences) following the manufacturer’s recommended protocols. The 5′ end of each probe oligonucleotide is amino-modified, allowing its covalent attachment to the slide’s preapplied surface chemistry. Each grid consisted of duplicate spots of each of the six probes per SNP as well as multiple buffer-only spots and positive control spots. The latter comprised two types of positive controls: multiple combinations of self-extending control oligonucleotides designed to extend to one or more of the four DNA bases, A, C, G, and T [SeqN probes (6)]; and an oligonucleotide probe based on a plant-specific gene sequence [Npg1 (8)] that will extend by a single N base due to the presence of an exogenous complementary template oligonucleotide in the APEX reaction mixture. Following the printing of the arrays, the slides were incubated overnight at room temperature at 75% relative humidity to drive the covalent coupling reaction between the probes’ 5′ amino group and the CodeLink slide chemistry to completion. The arrays were blocked in 50 mM ethanolamine, 0.1 M Tris, pH 9.0, 0.1% SDS, at 50°C for 15 min, according to the manufacturer’s protocol.

Microarray-Based Genotyping: Arrayed Primer Extension

The APEX reaction was performed in a total volume of 50 μL by the addition of 17 μL fragmented DNA template, 1 μL of 2 pmol/μL Npg1-positive control template oligonucleotide, 1 μM of each fluorescently labeled dideoxynucleotide triphosphate (Texas Red-ddATP, Cy™3-ddCTP, Cy5-ddGTP, R110-ddUTP; Perkin Elmer Life Sciences, Boston, MA, USA), 5 U Thermo Sequenase™ DNA polymerase (Amersham Biosciences) diluted in its dilution buffer, to 2× Thermo Sequenase reaction buffer (10×, 260 mM Tris-HCl, 65 mM MgCl2, pH 9.5). The reaction mixture was applied to the array of APEX and AS-APEX probes previously printed on the CodeLink slide that had been washed two times in 95°C double-distilled water and placed on a Thermo Hybaid HyPro100 incubation plate (Thermo Electron, Waltham, MA, USA) set at 58°C. The reaction mixture was covered with a small piece of Parafilm™, and the APEX reaction was allowed to proceed at 58°C with agitation (setting 1) for 20 min.
Following the incubation period, slides were washed with 95°C double-distilled water to remove the template DNA, enzyme, and excess ddNTPs. Further washing in 0.3% Alconox (Alconox, Inc., White Plains, NY, USA) and 95°C double-distilled water ensured low background on the array images.

For the duplicate APEX reactions using dNTPs instead of ddNTPs, we used 1 μM of each fluorescently labeled deoxynucleotide triphosphate (Texas Red-dATP, Cy3-dUTP, Cy5-dCTP, fluorescein-dGTP). For the enzymology comparison experiments, we used 5 U Taq DNA polymerase, in its supplied reaction buffer (Qiagen, Venlo, The Netherlands). In preliminary experiments in which we diluted either the labeled ddNTPs or labeled dNTPs with unlabeled dNTPs, we used 0.5 μM of each fluorescently labeled nucleotide plus 0.5 μM of each unlabeled dNTP.

Microarray Imaging and Spot Intensity Calculation

Slide microarrays were imaged using an arrayWoRxe® Auto Biochip Reader (Applied Precision, LLC, Issaquah, WA, USA), fitted with the following filter sets: (i) A488, excitation 480/15×, emission 530/40 (R110 and fluorescein dyes); (ii) Cy3 (narrowband), excitation 546/11, emission HQ570/10m (Cy3); (iii) Texas Red, excitation 602/13, emission 631/23 (Texas Red); (iv) Cy5, excitation 635/20, emission 685/40 (Cy5). Exposure times for each dye were set up to give approximately 60%-70% pixel saturation for selected Npg1-positive control probes. Resolution of the imager was set to 10 μm. Four 16-bit TIFF files for each array were obtained (one from each channel), and these were analyzed using softWoRx® Tracker software V.2.23.02 (MolecularWare, Cambridge, MA, USA). Spot intensity values (“Cell” background method) and probe name/grid coordinates were exported to Microsoft® Excel®. The intensity values were normalized by setting up to 82 Npg1-positive control spots, widely distributed across each array, to an average value of 20,000 U per channel; the exported normalized intensity value was calculated as the scale factor × (median signal minus median background).
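The normalization described above can be illustrated with a minimal Python sketch. It is not the softWoRx Tracker implementation: the dictionary keys `median_signal` and `median_background`, the function names, and the choice to derive the scale factor from the mean of the background-subtracted control spots are all assumptions made for illustration. Only the 20,000 U target and the "scale factor × (median signal minus median background)" formula come from the text.

```python
from statistics import mean

# Target average for the Npg1-positive control spots, per channel (from the text).
TARGET_CONTROL_INTENSITY = 20_000


def background_subtracted(spot):
    """Median signal minus median background for one spot in one channel."""
    return spot["median_signal"] - spot["median_background"]


def scale_factor(control_spots, target=TARGET_CONTROL_INTENSITY):
    """Factor that brings the control-spot average to the target value.

    Assumption: the average is taken over background-subtracted intensities.
    """
    control_mean = mean(background_subtracted(s) for s in control_spots)
    if control_mean <= 0:
        raise ValueError("control spots have no signal above background")
    return target / control_mean


def normalize_channel(spots, control_spots):
    """Normalized intensity = scale factor x (median signal - median background)."""
    factor = scale_factor(control_spots)
    return [factor * background_subtracted(s) for s in spots]


# Invented example values for a single channel.
controls = [
    {"median_signal": 5100, "median_background": 100},
    {"median_signal": 15200, "median_background": 200},
]
probes = [{"median_signal": 1100, "median_background": 100}]
print(normalize_channel(probes, controls))
```

In this sketch each channel is normalized independently, so the function would be called once per channel (A, C, G, and T) for each array.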

Data Management and Genotyping

We have previously developed a novel tool, the “SNP Chart” application (9), which is a data management and visualization tool for array-based genotyping by primer extension from multiple probes (www.snpchart.ca). This software generates visual patterns of spot intensity values from multiple channels across a multiple probe set specific for a given SNP, allowing easy calling of the genotype. Excel files containing the softWoRx normalized spot intensity values and probe name/grid coordinates for each Coriell sample were imported into SNP Chart, and the data for each of the SNPs across each Coriell sample were visualized. Direct comparisons were made between array data obtained using ddNTPs and dNTPs. Manual genotype calling was carried out for 87 SNPs, of which 78 SNPs had validated genotypes available to us from D. Reich, M. Freedman, and colleagues (Broad Institute, Cambridge, MA, USA), as well as from the SNP Consortium and the National Center for Biotechnology Information (NCBI) web sites (snp.cshl.org and www.ncbi.nih.gov/SNP, respectively). Supplementary Table S1 lists all genotyping data generated from our ddNTP and dNTP experiments, with yellow color-coding showing equivalent genotypes to the validated data, and red coding showing discrepant (and “NN”) calls. Noise-to-signal ratio calculations and all statistical analyses were performed using Microsoft Excel.

RESULTS

In this report, we present three figures and three tables. Additional figures and a table of genotype calls are included in the supplementary material that is available online at www.BioTechniques.com.

Figure 1 shows SNP Chart data for specific control probe oligonucleotides that are represented as replica spot features on our arrays. These “SeqA,” “SeqC,” “SeqG,” and “SeqT” probes are based on positive control APEX sequences previously designed by Kurg et al. (6)

Figure 1. Fluorescent signal profiles from SeqA, C, G, T control probes using either ddNTPs or dNTPs for arrayed primer extension. SNP Chart experimental data from two microarrays are shown. The Coriell sample used for both arrays was NA12251. Each chart shows four-channel data from duplicate spots (x-axis), with four charts shown for each microarray (for SeqA probes, SeqC, SeqG, and SeqT). The left-hand column shows results obtained using fluorescent ddNTPs, while the right-hand column results are from fluorescent dNTPs. Relative fluorescent intensities [normalized within each microarray across the four channels using Npg1-positive control spots (8)] are represented on the y-axis. The SeqG-ddNTP Spot 2 data are low due to localized experimental failure on this array. SNP, single nucleotide polymorphism.

Figure 2. Fluorescent signal profiles from rs1006130 GT SNP probes using either ddNTPs or dNTPs for arrayed primer extension. Selected SNP Chart data for the rs1006130 SNP from six microarrays are shown. Template DNA from three Coriell samples was used in duplicate arrays: one with ddNTPs in the APEX reaction (left-hand column) and one with dNTPs in the reaction (right-hand column). Each chart shows four-channel fluorescent intensity data (A, C, G, and T) from 12 rs1006130-specific array spots (duplicate spots for six different probes). Spots 1 and 2 (“LEFT G/T”) refer to the left-hand APEX probe (8) that will give either a single G (red) signal (for homozygous GG genotypes) or a T (blue) signal (for homozygous TT genotypes) or a mixture of G and T (heterozygous GT). Spots 3 and 4 (“RIGHT C/A”) refer to the right-hand APEX probe that interrogates the complementary DNA strand nucleotide to that of the left-hand APEX probe, thus giving a single C (green) signal (for GG), a single A (yellow) signal (for TT), or a mixed C and A signal (for GT). Spots 5 to 12, inclusive, represent allele-specific APEX probes in which a base-specific fluorescence signifies the presence of the allele. “_1” probes correspond to the first allele (G in the case of rs1006130) and “_2” probes correspond to the second allele (T). The redundancy and consistency of the data across different probes give high confidence in the assigned genotypes (8) and clearly illustrate the homologous nature of fluorescent dNTPs with respect to fluorescent ddNTPs in APEX. SNP, single nucleotide polymorphism; APEX, arrayed primer extension.

Table 2. Summary Analysis of Average Noise-to-Signal Ratio (%) for SeqA, C, G, T Control Probes(a)

               Average Noise(c), ddNTP data       Average Noise(c), dNTP data
Signal(b)      A       C       G       T          A       C       G       T
A channel      N.A.    8%      2%      4%         N.A.    1%      11%     17%
C channel      2%      N.A.    0%      2%         9%      N.A.    9%      1%
G channel      7%      1%      N.A.    2%         4%      1%      N.A.    2%
T channel      0%      16%     0%      N.A.       6%      0%      5%      N.A.

N.A., not applicable.
(a) SeqA, C, G, T control probe spots were selected that theoretically would give only one signal: A, C, G, or T.
(b) Signal was set at 100% for the intensity in the correct channel.
(c) Average noise (per channel) is defined by the % ratio of incorrect channel intensity (noise) to signal intensity averaged across 26 replicate spots, duplicate spots for each of 13 arrays.
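The noise-to-signal definition in the footnotes to Table 2 can be sketched in Python as follows. The authors performed these calculations in Microsoft Excel, so this is only an illustrative restatement: the spot representation (a dictionary of normalized intensities keyed by channel), the function names, and the example intensities are hypothetical. The optional `min_signal` argument mirrors the 1000 U signal-channel cut-off described in the footnotes to Table 3.

```python
from statistics import mean

CHANNELS = ("A", "C", "G", "T")


def noise_to_signal(spot, signal_channel):
    """Percent ratio of each incorrect channel (noise) to the correct channel (signal)."""
    signal = spot[signal_channel]
    return {
        ch: 100.0 * spot[ch] / signal
        for ch in CHANNELS
        if ch != signal_channel
    }


def average_noise(spots, signal_channel, min_signal=0.0):
    """Average the per-spot noise-to-signal ratios for one signal channel.

    Spots whose signal is not positive, or is below min_signal, are skipped.
    """
    kept = [
        s for s in spots
        if s[signal_channel] > 0 and s[signal_channel] >= min_signal
    ]
    if not kept:
        raise ValueError("no spots pass the signal cut-off")
    ratios = [noise_to_signal(s, signal_channel) for s in kept]
    return {
        ch: mean(r[ch] for r in ratios)
        for ch in CHANNELS
        if ch != signal_channel
    }


# Invented intensities for three replicate spots that should extend with A only.
seq_a_spots = [
    {"A": 10000, "C": 800, "G": 200, "T": 400},
    {"A": 20000, "C": 1600, "G": 400, "T": 800},
    {"A": 500, "C": 400, "G": 300, "T": 200},  # weak spot, removed by the cut-off
]
print(average_noise(seq_a_spots, "A", min_signal=1000))
```

Averaging per-spot ratios, rather than dividing summed noise by summed signal, is one reading of the footnote wording; the two approaches give different numbers when spot intensities vary widely.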

to allow directed nucleotide extension based on secondary structure within the DNA sequence of the probe itself. For single-base extension using ddNTPs, for example, the SeqA probe will always extend by a single ddATP molecule, while the SeqG probe will always extend by a single ddGTP. Figure 1 shows the actual fluorescent intensity output from these probe spots. The left-hand column of Figure 1 clearly shows base-specific intensities that relate to the original design of the probes; for example, SeqA spots show predominantly yellow-colored bars, corresponding to incorporation of ddATP. The right-hand column of Figure 1 shows equivalent experimental data obtained by replacing the ddNTPs with fluorescent dNTPs. Although the sequence design of these probes would be expected to allow for multiple base extensions across different nucleotides if a standard DNA polymerization reaction were taking place, the data clearly demonstrate the unexpected occurrence of effective single-base extension using dNTPs.

Figure 2 shows SNP Chart data outputs easily interpreted as specific genotypes at the rs1006130 SNP for homozygous major allele, heterozygous, and homozygous minor allele Coriell individuals. Rs1006130 is a G/T SNP (based on the “LEFT” DNA strand). Coriell sample NA07037 is GG; NA06993 is GT; NA12251 is TT. As in Figure 1, the data clearly show homologous signal specificity regardless of whether ddNTPs or dNTPs are used in the APEX reaction. Further examples for different SNPs are provided in the supplementary material.

Table 1 shows a breakdown of genotyping analysis, based on manual calling of 87 SNPs represented on the APEX chip (8), across 13 Caucasian Coriell DNA samples. Call rates are comparable (>99%), although the use of fluorescent dNTPs results in a slight loss of accuracy compared with using fluorescent ddNTPs [99.0% versus 99.8% (98.3% versus >99.9% for heterozygote calling accuracy)].
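The call rate and accuracy definitions given in the footnotes to Table 1 can be restated as a short Python sketch. Genotype calling in this study was manual, so the code only illustrates the arithmetic. The data layout (genotype strings such as "GT", with "NN" for a non-call, keyed by hypothetical sample/SNP pairs) is assumed, as is the choice to compute accuracy only over positions where both a call and a validated genotype exist; the footnotes do not state the denominator explicitly.

```python
def normalise(genotype):
    """Treat 'GT' and 'TG' as the same unordered genotype."""
    return "".join(sorted(genotype))


def call_rate(calls):
    """Percentage of possible calls that are actual genotypes (not 'NN')."""
    made = sum(1 for g in calls if g != "NN")
    return 100.0 * made / len(calls)


def accuracy_rate(calls, validated, heterozygotes_only=False):
    """Percentage of called genotypes that match the validated genotype.

    Assumption: positions with an 'NN' on either side are left out of the
    denominator.
    """
    compared = 0
    matched = 0
    for key, truth in validated.items():
        call = calls.get(key, "NN")
        if truth == "NN" or call == "NN":
            continue
        truth = normalise(truth)
        if heterozygotes_only and truth[0] == truth[1]:
            continue
        compared += 1
        matched += normalise(call) == truth
    if compared == 0:
        raise ValueError("no comparable genotypes")
    return 100.0 * matched / compared


# Worked example from footnote b of Table 1: 990 genotypes out of 1014 possible.
validated_set = ["GG"] * 990 + ["NN"] * 24
print(round(call_rate(validated_set), 1))  # 97.6

# Invented calls for two samples and two hypothetical SNP identifiers.
calls = {("s1", "rsA"): "GT", ("s1", "rsB"): "GG",
         ("s2", "rsA"): "TT", ("s2", "rsB"): "NN"}
validated = {("s1", "rsA"): "TG", ("s1", "rsB"): "GG",
             ("s2", "rsA"): "GT", ("s2", "rsB"): "GG"}
print(accuracy_rate(calls, validated, heterozygotes_only=True))  # 50.0
```

Counting non-calls as errors instead would lower the accuracy figures slightly; which convention the authors used cannot be determined from the text alone.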

However, it should be noted that the R110 dye used in the ddNTP experiments gives much lower background than the nonoptimized fluorescein dye used in the dNTP experiments. The use of more appropriate fluorophores should improve these respective accuracies. The entire genotype data set for 87 SNPs across the 13 Coriell samples for both ddNTP and dNTP experiments is shown in the supplementary material.

Average noise-to-signal intensity ratios across the channels are shown for the SeqA, SeqC, SeqG, and SeqT probes (Table 2) and also for the combined APEX and AS-APEX probes (Table 3). Although the average noise-to-signal ratios are generally higher for the dNTP reagents compared with the ddNTP reagents, they are not consistently higher for all channels and do not appear to be seriously detrimental to the genotyping call rate and accuracy. It could well be that some individual probe molecules do in fact undergo multiple base extensions with fluorescent dNTPs, but this may represent a minor fraction of the probe set for any one oligonucleotide. Our experimental design, incorporating

Figure 3. Preliminary enzymology data comparing the effect of using Thermo Sequenase or Taq DNA polymerase in arrayed primer extension assays with either ddNTPs or dNTPs. Equivalent sub-grids from four microarrays are shown. Pooled template DNA from Coriell sample NA17101 was used in all arrays. Identical reaction conditions were set up for two different thermostable DNA polymerases, each enzyme with its own recommended buffer [with either fluorescent ddNTPs in the APEX reaction (left-hand column) or fluorescent dNTPs in the reaction (right-hand column)]. APEX, arrayed primer extension.

Table 3. Summary Analysis of Average Noise-to-Signal Ratio (%) Between ddNTP and dNTP Groups(a)

Values are given as ddNTP/[dNTP] for each noise channel(c).

Signal      Noise channel A            Noise channel C            Noise channel G            Noise channel T
Channel(b)  Average  Median   SD       Average  Median   SD       Average  Median   SD       Average  Median   SD
A           N.A.     N.A.     N.A.     12/[5]   8/[1]    12/[31]  6/[30]   3/[20]   8/[34]   9/[31]   5/[21]   10/[38]
C           3/[13]   1/[12]   4/[7]    N.A.     N.A.     N.A.     3/[34]   0/[21]   10/[89]  9/[18]   4/[7]    33/[38]
G           10/[6]   9/[4]    7/[6]    14/[5]   4/[1]    28/[10]  N.A.     N.A.     N.A.     12/[29]  6/[11]   17/[51]
T           5/[8]    1/[1]    43/[38]  17/[3]   15/[0]   11/[13]  3/[20]   0/[7]    12/[39]  N.A.     N.A.     N.A.

N.A., not applicable.
(a) For the 87 SNPs genotyped, arrayed primer extension (APEX) and allele-specific APEX probe spots were selected that would give only one signal, A, C, G, or T, depending on the Coriell DNA sample analyzed (i.e., no APEX probe spots were included that were associated with heterozygote Coriell genotypes and no allele-specific APEX probe spots were included that were associated with homozygous Coriell genotypes at the other allelic site). A lower intensity cut-off of 1000 U in the signal channel was used to remove low intensity spots.
(b) Signal was set at 100% for the intensity in the correct channel.
(c) Average noise (per channel) is defined by the % ratio of incorrect channel intensity (noise) to signal intensity, averaged across multiple spots regardless of probe sequence context, and 13 arrays. Total numbers of spots averaged for the ddNTP data analysis were 868 (A signal channel), 944 (C), 921 (G), and 1252 (T). Total numbers of spots averaged for the dNTP data analysis were 831 (A signal channel), 734 (C), 1182 (G), and 1259 (T).

multiple probes for each SNP under investigation (classical APEX probes and AS-APEX probes for both DNA strands), gives good redundancy and robustness to our data, and this may also be helping our genotype calling despite the greater noise-to-signal intensity ratios sometimes found when fluorescent dNTPs are used.

Figure 3 shows four-channel sub-grid images from preliminary experiments comparing the use of different DNA polymerases in minisequencing by APEX. While Taq DNA polymerase was not able to incorporate fluorescent ddNTPs, it is clear that this alternative (and cheaper) enzyme is able to incorporate fluorescent dNTPs to some extent. In fact, Taq DNA polymerase appears to allow similar base specificity of incorporation to that of Thermo Sequenase (qualitative color spot data are shown).

DISCUSSION

Given the tremendous research efforts directed toward developing better “terminator-type” fluorescent nucleotides as well as more effective DNA polymerases that can incorporate these modified nucleotides into nucleic acid chains (10-14), our demonstration of effective single-base extension by APEX simply using fluorescent dNTPs is timely. The additional current expense of using ddNTPs, coupled with continued difficulties encountered by supply companies in trying to license some of the new Alexa-type fluorophore dyes conjugated to ddNTPs, has meant that technologies such as APEX have not been able to fully utilize the many improvements and innovations in the fields of nucleic acid and fluorophore chemistries and enzymology. Utilizing dNTPs may also allow the use of alternative and cheaper polymerase enzymes than is currently possible with ddNTPs (e.g., Taq DNA polymerase rather than Thermo Sequenase). Other methodologies that require single-base extension might also benefit from using all four fluorescent dNTPs as reagents, such as single-base extension-tag array on glass slides (SBE-TAGS) (3), genotyping by fluorescence polarization detection (11), and polymerase colony genotyping and haplotyping (15).

An obvious question is why the dNTP-based extensions effectively terminate when there is still a free 3′-OH group on the deoxyribose sugar moiety. We speculate that steric hindrance by the nearby bulky fluorophore is one possible answer, as has been suggested previously by researchers looking at reversible terminators (12) rather than simple dNTPs. Preliminary results from experiments in which we dilute the fluorescent ddNTPs or the fluorescent dNTPs with unlabeled dNTPs show an accumulation of multiple, nonspecific intensities from the probes, consistent with multiple base extensions, and thus evidence that steric hindrance could be the cause of the apparent single-base extension seen in our data when labeled dNTPs alone are used (see supplementary material). Both the choice of polymerase and the length of the extension reaction are likely to be important factors in determining the extent to which multiple extension signal will confuse the data, because different polymerases may be sterically hindered to different extents, and more time might increase multiple extension events.

We hope that the results shown here stimulate a reevaluation of the chemistries, enzymologies, and applications of the field of minisequencing on microarrays.

ACKNOWLEDGMENTS

We would like to thank Dean English, Bruce Dangerfield, Igor Opushnyev, and Mohua Podder for technical assistance, Andrew Sandford and Colleen Nelson for helpful comments, and Peter Paré for continued support. We are grateful to Chroma Technology (Rockingham, VT, USA) for assistance with new dye filters for the array imaging. This research was supported by the National Sanitarium Association (Canada), the British Columbia Lung Association, AllerGen NCE, the Canadian Institutes of Health Research, and the Michael Smith Foundation for Health Research.

COMPETING INTERESTS STATEMENT

The authors declare no competing interests.

REFERENCES

1. Lord, P.G. and T. Papoian. 2004. Genomics and drug toxicity. Science 306:575.
2. Hood, L., J.R. Heath, M.E. Phelps, and B. Lin. 2004. Systems biology and new technologies enable predictive and preventative medicine. Science 306:640-643.
3. Hirschhorn, J.N., P. Sklar, K. Lindblad-Toh, Y.M. Lim, M. Ruiz-Gutierrez, S. Bolk, B. Langhorst, S. Schaffner, et al. 2000. SBE-TAGS: an array-based method for efficient single-nucleotide polymorphism genotyping. Proc. Natl. Acad. Sci. USA 97:12164-12169.
4. Oliphant, A., D.L. Barker, J.R. Stuelpnagel, and M.S. Chee. 2002. BeadArray technology: enabling an accurate, cost-effective approach to high-throughput genotyping. BioTechniques (Suppl.):56-61.
5. Kennedy, G.C., H. Matsuzaki, S. Dong, W.M. Liu, J. Huang, G. Liu, X. Su, M. Cao, et al. 2003. Large-scale genotyping of complex DNA. Nat. Biotechnol. 21:1233-1237.
6. Kurg, A., N. Tonisson, I. Georgiou, J. Shumaker, J. Tollett, and A. Metspalu. 2000. Arrayed primer extension: solid-phase four-color DNA resequencing and mutation detection technology. Genet. Test. 4:1-7.
7. Syvanen, A.C. 1999. From gels to chips: “minisequencing” primer extension for analysis of point mutations and single nucleotide polymorphisms. Hum. Mutat. 13:1-10.
8. Tebbutt, S.J., J.Q. He, K.M. Burkett, J. Ruan, I.V. Opushnyev, B.W. Tripp, J.A. Zeznik, C.O. Abara, et al. 2004. Microarray genotyping resource to determine population stratification in genetic association studies of complex disease. BioTechniques 37:977-985.
9. Tebbutt, S.J., I.V. Opushnyev, B.W. Tripp, A.M. Kassamali, W.L. Alexander, and M.I. Andersen. 2005. SNP Chart: an integrated platform for visualization and interpretation of microarray genotyping data. Bioinformatics 21:124-127.
10. Tabor, S. and C.C. Richardson. 1995. A single residue in DNA polymerases of the Escherichia coli DNA polymerase I family is critical for distinguishing between deoxy- and dideoxyribonucleotides. Proc. Natl. Acad. Sci. USA 92:6339-6343.
11. Xiao, M., A. Phong, K.L. Lum, R.A. Greene, P.R. Buzby, and P.Y. Kwok. 2004. Role of excess inorganic pyrophosphate in primer-extension genotyping assays. Genome Res. 14:1749-1755.
12. Shendure, J., R.D. Mitra, C. Varma, and G.M. Church. 2004. Advanced sequencing technologies: methods and goals. Nat. Rev. Genet. 5:335-344.
13. Ghadessy, F.J., N. Ramsay, F. Boudsocq, D. Loakes, A. Brown, S. Iwai, A. Vaisman, R. Woodgate, and P. Holliger. 2004. Generic expansion of the substrate spectrum of a DNA polymerase by directed evolution. Nat. Biotechnol. 22:755-759.
14. Gardner, A.F., C.M. Joyce, and W.E. Jack. 2004. Comparative kinetics of nucleotide analog incorporation by Vent DNA polymerase. J. Biol. Chem. 279:11834-11842.
15. Mitra, R.D., V.L. Butty, J. Shendure, B.R. Williams, D.E. Housman, and G.M. Church. 2003. Digital genotyping and haplotyping with polymerase colonies. Proc. Natl. Acad. Sci. USA 100:5926-5931.

Received 4 October 2005; accepted 21 November 2005.

Address correspondence to:
Scott J. Tebbutt
James Hogg iCAPTURE Centre for Cardiovascular and Pulmonary Research
St. Paul’s Hospital, University of British Columbia
Vancouver, BC, V6Z 1Y6, Canada
e-mail: [email protected]
