
FISHERIES SCIENCE

2005; 71: 1341–1346

Microsatellite markers derived from bay scallop Argopecten irradians expressed sequence tags

Ai-Bin ZHAN, Zhen Min BAO, Xiao Long WANG AND Jing Jie HU*

Laboratory of Marine Genetics and Breeding, Division of Life Science and Biotechnology, Ocean University of China, Qingdao 266003, China

ABSTRACT: When data mining was performed on the National Center for Biotechnology Information database, a total of 2038 bay scallop sequences from five different expressed sequence tag libraries were registered. Eighty sequences (3.9%) were found to contain 91 microsatellites. Clustering analysis indicated that 23 of these expressed sequence tags fell into five clusters and that the remaining 57 sequences were independent. The di- and tri-nucleotide repeat motifs accounted for approximately 62.1% of the total microsatellites. The most abundant dinucleotide microsatellite was TA, followed by GA and CA, and the trinucleotide microsatellites GAT and GGT showed a high abundance. Nineteen sequences representing di-, tri-, tetra- and penta-nucleotide motifs were chosen for the design of polymerase chain reaction (PCR) primers. Of the 19 primer pairs, 16 successfully amplified scorable PCR products and 11 revealed polymorphism, with an average polymorphic information content value of 0.5082 and 3.1 alleles per locus. A transferability analysis on three other related scallop species, Chlamys farreri, Chlamys nobilis and Patinopecten yessoensis, showed that only 1 of the 16 primer pairs amplified PCR products of the expected size, and only in Chlamys nobilis.

KEY WORDS: bay scallop, expressed sequence tags, markers, microsatellite, transferability.

*Corresponding author: Tel: 86-532-82031970. Fax: 86-532-82031960. Email: [email protected] Received 21 January 2005. Accepted 21 June 2005.

INTRODUCTION

The development of molecular markers that facilitate the analysis of genetic traits is important for crop and animal improvement.1 Microsatellites, also called simple sequence repeats (SSR), are small tandem repeated sequences (1–6 bp) that are widely dispersed in eukaryotic genomes.2 When primers located in the flanking regions of the repeat locus are used with genomic DNA in polymerase chain reactions (PCR), they reveal length polymorphisms representing different alleles. Presently, microsatellites are among the most useful genetic markers in many organisms because of their even distribution throughout the genome, high level of polymorphism and codominant Mendelian inheritance.3,4 Microsatellites have been used extensively for genetic diversity and population structure studies in many species.4 They have also been used in forensic studies, parentage assessment, evolutionary studies and construction of molecular linkage maps.5–8

The high cost of developing microsatellite markers limits their application because it is necessary to isolate, clone, sequence and characterize microsatellite loci in most species that are being examined for the first time. The identification of SSR markers in published sequence databases provides an alternative approach to developing microsatellites in a simple and direct way. Twenty highly polymorphic microsatellite markers were developed from the tiger puffer DNA database.4 Yue et al. identified 28 SSR from the genomic DNA and expressed sequence tag (EST) sequences of common carp deposited in GenBank.9 Bioinformatic analysis of 43 033 EST of channel catfish found 4855 EST containing microsatellites.10 Because EST-SSR are derived from cDNA sequences, they should be more informative than genomic SSR. Moreover, the growing number of available EST sequences in public databases makes EST-SSR both abundant and easy to identify.10

Bay scallop Argopecten irradians was introduced from the USA for aquaculture in the 1980s and has become one of the most important cultured scallops in China. Breeding programs have been established and the genetic diversity of different reared populations has been investigated.11 However, no information on microsatellite markers has been reported. To promote practical genetic analysis of, and breeding programs involving, the bay scallop, a large number of microsatellite markers are desirable. In the present study, we describe the identification of microsatellites from bay scallop EST, and report our findings on the polymorphism of the EST-SSR.

MATERIALS AND METHODS

Materials and DNA extraction

Thirty individuals of A. irradians were randomly collected from Huangdao Scallop Hatchery in Shandong Province. For transferability analysis, 10 individuals of three scallop species, Chlamys farreri, Chlamys nobilis and Patinopecten yessoensis, which are popularly cultured in China, were selected. The adductor muscles were removed from the live individuals and stored in liquid nitrogen until use. DNA was extracted according to the traditional phenol/chloroform extraction method.12

Mining of the expressed sequence tag database

Expressed sequence tag data of bay scallop obtained from the GenBank database (http://www.ncbi.nlm.nih.gov) were analyzed with an in-house computer program to identify regions containing simple DNA repeats. The search was conducted for sequences that showed more than seven repetitions for di-, five repetitions for tri-, four repetitions for tetra-, and three repetitions for penta- and hexanucleotides. EST sequences that contained SSR were clustered using BioEdit Sequence Alignment Editor software (http://www.mbio.ncsu.edu/BioEdit/bioedit.html). Sequences with the longest perfect repetitions and flanking regions were selected for PCR primer design.

Primer design, polymerase chain reaction amplification and electrophoresis

Polymerase chain reaction primers (forward and reverse) flanking the repeat sequence were designed using the computer program Primer Premier 5.0 (http://www.PremierBiosoft.com/faq.html). PCR amplification was performed in a thermal cycler (GeneAmp PCR System 9700). The reaction mixture contained approximately 20 ng of template DNA, 0.2 µM of each primer, 200 µM of each dNTP and 1 U GeneTaq (Takara) with 1 × PCR buffer in a total volume of 20 µL. The PCR program consisted of one cycle of denaturation at 95°C for 5 min, followed by 35 cycles of 30 s at 95°C, 30 s at the annealing temperature of each primer pair and 30 s at 72°C, and a final extension step at 72°C for 5 min. The amplified PCR products were separated on a 10% non-denaturing polyacrylamide gel at 200 V for 1–2 h, stained with ethidium bromide and visualized under ultraviolet light.

Data analysis

The polymorphic information content (PIC) value was estimated according to the following formula:

PIC = 1 - \sum_{i=1}^{n} f_i^2    (1)

where f_i is the frequency of the ith allele and n is the number of alleles.13 PIC estimates the discriminatory power of a marker locus by taking into account both the number of alleles and their frequencies.

RESULTS

Simple sequence repeats derived from expressed sequence tags

The National Center for Biotechnology Information database contained 2038 EST for the bay scallop when the data mining analysis was performed. The EST sequences were derived from five different EST libraries: the gonad, the adductor muscle, the whole body of spat, planktonic veliger larvae and mature pediveliger larvae. The downloaded data were investigated for all the 2–6 bp combinations of SSR units, and 80 sequences were found to contain 91 microsatellites. The sequences that carried SSR comprised 3.9% of the total published sequences. Cluster analysis grouped 23 sequences into five clusters. We were able to observe polymorphisms in the number of repeats in two clusters (Fig. 1). The remaining 57 sequences were independent. The repeat number of the 66 independent SSR loci ranged from 3 to 17. The longest repetition observed was 17 for an AT dinucleotide (accession number CN782724). EST-SSR contained a variety of repeat motif sequences, and di- and tri-nucleotide repeat motifs accounted for approximately 62.1% of the total microsatellites. The most abundant dinucleotide microsatellite was TA, appearing in 17 non-redundant sequences, followed by GA and CA. The trinucleotide microsatellites GAT and GGT showed a high abundance, as has also been found in other species.10

Fig. 1 Alignment analysis among homologous expressed sequence tags (EST) shows length polymorphisms at a bay scallop EST-simple sequence repeat locus.
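The in-house search program is not described beyond its repeat-count criteria, so the following Python sketch is only one possible way to scan a sequence for perfect SSR under those criteria. The threshold reading (at least 8 repeats for di-, 5 for tri-, 4 for tetra- and 3 for penta- and hexanucleotides), the function names and the toy sequence are all assumptions made for illustration; they are not taken from the authors' code or from a real EST.

```python
import re

# Minimum repeat counts per motif length. These are an interpretation of the
# search criteria in the Methods, not the authors' code: "more than seven"
# is taken as >= 8 for dinucleotides, and five, four and three are taken as
# minimums for tri-, tetra- and penta-/hexanucleotides.
MIN_REPEATS = {2: 8, 3: 5, 4: 4, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip units such as "ATAT" or "AAAA" that reduce to a shorter motif.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical toy sequence (not a real EST): an (AT)17 and a (GAT)10 tract.
toy_est = "GGC" + "AT" * 17 + "CCGTA" + "GAT" * 10 + "TTG"
print(find_ssrs(toy_est))
```

The sketch reports only perfect repeats. Compound or interrupted repeats such as (GA)14A(AG)5 would need additional logic to be reported as a single locus.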

Polymerase chain reaction amplification and polymorphism

From the sequences containing microsatellites, 19 were chosen for the design of PCR primers (Table 1). The 19 sequences were chosen to represent di-, tri-, tetra- and penta-nucleotide motif microsatellites, to test the relationship between the type of repeat motif and the level of polymorphism. When tested in PCR with genomic DNA, 16 primer pairs amplified products of scorable size under the conditions used. AIMS005 failed to amplify any PCR product. The products amplified by AIMS010 and AIMS017 were too large to detect polymorphism. Of the 16 functional primer pairs, 11 revealed length polymorphisms among the 30 individuals analyzed (Table 2). The average number of alleles at polymorphic loci was 3.1, ranging from 2 to 6. The average PIC was 0.5082, ranging from 0.2128 to 0.7843. The locus AIMS011 had the largest number of alleles (6) and the highest PIC (0.7843). Our results show that amplification efficiency and polymorphism did not appear to be influenced by repeat structure or repeat number.

Transferability analysis

To assess transferability, the 16 functional primer pairs were tested on three other related scallop species, C. farreri, C. nobilis and P. yessoensis. Because these new microsatellite markers were developed from EST, their DNA sequences are presumably conserved in related species and the markers might therefore be transferable.9 Our results show that only AIMS011 could amplify a product of the expected size in C. nobilis, and it did not show polymorphism.

DISCUSSION

Expressed sequence tag databases were searched for microsatellite loci in the bay scallop. It is more economical to search for EST-SSR than to use standard library screening, because the microsatellites are filtered computationally from the EST database to generate primers.10,14,15 In the present study, SSR were found to be relatively abundant (3.9% of total sequences) in scallop EST databases. The frequency of SSR in the scallop EST database is in accordance with that for plants, for example 2.5% in grapes and 6.5% in rye.14,15 However, it should be noted that a direct comparison of estimates of SSR frequencies in different reports is difficult because of the use of various repeat unit motif combinations, different minimal motif length criteria and redundant EST sequences in databases.

A diverse range of SSR repeat motif sequences occurred in the published EST sequences. In the present study, we found that the (AT)n repeat is the predominant motif in the bay scallop EST. AT repeats are found to be the most frequent repeats in yeast and plants.2 Yue et al. report that the most abundant repeats located in genes and EST were AT repeats in the common carp.9 The (AT)n motif is difficult to isolate using traditional hybridization methods because of its palindromic nature. A database search, by contrast, can identify AT repeats.

Of the 19 primer pairs tested in the present study, 16 amplified genomic DNA of bay scallops. Some of the amplified fragments were larger than the expected size, indicating the possible presence of introns within the genomic DNA sequence. The presence of long introns between

Table 1 Primer sequences for detecting microsatellite loci developed from bay scallop Argopecten irradians EST. Microsatellite marker name, accession number, repeat motif and expected size are shown

Locus    Accession no.  Repeat              Forward primer sequence (5′→3′)  Reverse primer sequence (5′→3′)  Length (bp)
AIMS001  CF197476       (CAA)6...(CAA)4     TTCCTAATGGTGCGGGCTAC             CATCATCGTACTCCTGGTTATC           171
AIMS002  CF197669       (CAC)6              GCCCAAAGCCATTCAAACCTC            TTCACTGGTCTCGACTGCTGTG           145
AIMS003  CK484134       (AT)11              AGGCATTGAAGCAGAGGCTGAC           CATGTCATCATCCGACTCCTCG           231
AIMS004  CK484445       (GAT)5(GAA)2(GAT)4  CTGCAAACCATCATCTGTGAC            TGTCCTGAGGTGTGTCTTTTGTAG         233
AIMS005  CK484488       (CAAAA)3            CTGGAACCAATACTCAAGAAGTGTC        CCGCTACTTGTATCCATACCTTG          139
AIMS006  CK484532       (TTTA)4             GTGCAAAATTACGGAACACTACAC         GCACTCACTAGAAATCCAGAAAAG         184
AIMS007  CK484160       (TCACA)3            TGTCAGAGTTCACAGCTAGGTGACC        GGTTTCTCCTTGTTGTGGTCTGG          177
AIMS008  CK484190       (TATT)6             CGGCAAGTGGACATATAATGTTCC         CTCAAGAATGCACAGCAAACACAG         231
AIMS009  CK484242       (GA)8               CTTGTTGCACAAAAGCACAG             GTTCCATACGAACTTCTATGTC           139
AIMS010  CK484330       (TGG)5              AGGCATTGAAGCAGAGGCTGAC           CATGTCATCATCCGACTCCTCG           225
AIMS011  CV660848       (GAT)10             GACAGCAGAACAGTCAGTAGTTGTG        GCACGTCTGCTTTCTCTGTATTAAC        256
AIMS012  CN783420       (TTAT)10            GAGAGTACAAGCACTGTTCTCATG         GGTGCTATATCGACCTATATCTGAG        256
AIMS013  CN783245       (TCA)6              GAGAGTACTGAAGCGATCACTCTCA        GACACAAATTAAAGGACGATGGAG         254
AIMS014  CN782860       (TA)8...(AAAC)4     TACCTTACAAGATGTGACCGTCG          TCATAATGAACAGCGACCGCAC           246
AIMS015  CN782724       (AT)17              GGAAAAAATGAACCCACCAAC            GCTGCAACAACTACCATATAACCAA        107
AIMS016  CN782569       (GACG)7             GGACAAGTAGCGATTTATAGG            TTCAAGGCAAACGGTGACTCTG           213
AIMS017  CN782359       (CTT)6              GCATCTCCTGCTTCATCTAC             GACCGACTCAAGTTCTACGA             302
AIMS018  CK484163       (ATATC)3            ACTTCTCAACGATGTGTAAGAG           GTGTACTTATCGGGACTCCTC            158
AIMS019  CN782436       (GA)14A(AG)5        CTCCACCTTCAGAACCATCC             CGAAAGAAAATATCAAGCACAC           214
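As a rough illustration of how the repeat notation in Table 1 can be read programmatically, the Python sketch below parses each entry of the Repeat column into (motif, count) pairs and tallies the loci by motif unit length. The function names are hypothetical, and classifying a compound repeat by its first listed motif is a simplifying assumption rather than a rule stated in the paper.

```python
import re
from collections import Counter

# Repeat column of Table 1, in locus order AIMS001 to AIMS019.
TABLE1_REPEATS = [
    "(CAA)6...(CAA)4", "(CAC)6", "(AT)11", "(GAT)5(GAA)2(GAT)4", "(CAAAA)3",
    "(TTTA)4", "(TCACA)3", "(TATT)6", "(GA)8", "(TGG)5", "(GAT)10",
    "(TTAT)10", "(TCA)6", "(TA)8...(AAAC)4", "(AT)17", "(GACG)7", "(CTT)6",
    "(ATATC)3", "(GA)14A(AG)5",
]

# A motif in parentheses followed by its repeat count, e.g. "(GAT)10".
UNIT = re.compile(r"\(([ACGT]+)\)(\d+)")


def parse_repeat(text):
    """Split a repeat description into (motif, count) pairs."""
    return [(motif, int(count)) for motif, count in UNIT.findall(text)]


def motif_length_counts(repeats):
    """Count loci by the unit length of their first listed motif (assumption)."""
    return Counter(len(parse_repeat(text)[0][0]) for text in repeats)


print(parse_repeat("(GA)14A(AG)5"))
print(sorted(motif_length_counts(TABLE1_REPEATS).items()))
```

Under that first-motif convention, the 19 loci split into 5 di-, 7 tri-, 4 tetra- and 3 penta-nucleotide motifs, and the largest single repeat count is the 17 of the AT repeat in AIMS015.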


Table 2 Annealing temperature (Ta), allele size, allele number and polymorphic information content (PIC) value of 11 polymorphic expressed sequence tag simple sequence repeat markers for bay scallop

Locus    Ta (°C)  Allele size  Allele number  PIC
AIMS001  61       150–170      3              0.4703
AIMS003  61       225–245      3              0.6478
AIMS004  62       370–400      3              0.5184
AIMS006  63       190–210      2              0.4978
AIMS008  60       350–360      2              0.2128
AIMS009  54       175–195      3              0.4128
AIMS011  63       240–265      6              0.7843
AIMS012  63       245–265      4              0.6008
AIMS013  57       254–257      2              0.4965
AIMS014  56       244–246      2              0.3367
AIMS019  60       210–225      4              0.6115
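The PIC values in Table 2 follow equation (1) in the Data analysis section. The Python sketch below shows that calculation and recomputes the averages reported in the Results from the table values. The helper names are hypothetical, and the allele frequencies passed to pic() are invented for illustration because the paper does not list per-allele frequencies.

```python
def pic(allele_freqs):
    """Equation (1): PIC = 1 - sum of squared allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(f * f for f in allele_freqs)


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Values transcribed from Table 2 (the 11 polymorphic loci, in table order).
ALLELE_NUMBERS = [3, 3, 3, 2, 2, 3, 6, 4, 2, 2, 4]
PIC_VALUES = [0.4703, 0.6478, 0.5184, 0.4978, 0.2128, 0.4128,
              0.7843, 0.6008, 0.4965, 0.3367, 0.6115]

# Hypothetical frequencies: two equally common alleles at one locus.
print(pic([0.5, 0.5]))

# Averages over the 11 polymorphic loci, rounded as in the Results.
print(round(mean(ALLELE_NUMBERS), 1), round(mean(PIC_VALUES), 4))
```

Under equation (1), a locus with n alleles cannot exceed 1 - 1/n, which is reached when all alleles are equally frequent. Every row of Table 2 respects this bound; for example, the four two-allele loci all have PIC values below 0.5.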

the primers in the genomic DNA might explain the very large DNA fragments (>1.5 kb) amplified with AIMS010 and AIMS017. This is also one possible reason why no PCR product was obtained with AIMS005. The lack of amplification might also be because the primer sequences span introns and/or contain mutations or indels (insertions or deletions). Mutations and indels have been found in the flanking regions of microsatellites in oysters.16

As described by Groben and Wricke, clustered sequences from a database can be useful for polymorphism identification and they provide additional sequence information for primer design.17 The scallop EST deposited in the GenBank database revealed polymorphism in the cluster analysis. An example is shown in Figure 1, which has a variable number of GAT repeats. The primer pairs designed showed a high level of polymorphism at this microsatellite locus.

Cross-amplification in a species related to the one in which the microsatellites were characterized is a desirable approach, as it does not require the high levels of labor and cost associated with direct development. Examples of microsatellites successfully transferred to related species include those used in the silver crucian carp but developed from common carp,9 and those used in the marine shrimp Fenneropenaeus chinensis, Litopenaeus vannamei and Marsupenaeus japonicus with primers designed from Penaeus monodon.18 The primary limitation in the use of microsatellites from related species is that only a portion of microsatellites from another species will be informative. Zhang et al. report that 10% (3/30) of microsatellite markers could be common markers for F. chinensis, L. vannamei and M. japonicus.18 Because EST-SSR come from cDNA, we would expect these markers to be more transferable than those from non-gene regions. However, the transferability of the microsatellites established here was low: only one marker transferred, and only to C. nobilis. The transfer efficiency of microsatellites might be determined by the genetic relatedness of the species, and our results support the view that the three species tested in the present study are distantly related to bay scallop.19

Most microsatellites are type II markers developed from anonymous genomic sequences.20 Type I markers are associated with genes of known functions and are more useful for comparative genomic analysis.21 However, type I markers are difficult to develop and generally less polymorphic because of functional restraints of the gene sequences.9 Identification of polymorphic microsatellites from EST makes it possible to convert type II markers to type I markers.22 EST-SSR combine the advantages of microsatellite variability with the information potentially carried by expressed sequences. Once the microsatellites are mapped, they will likely yield information on the location and expression of the genes that carry them. This is an efficient and economical technique for accumulating and mapping cDNA sequences in different tissues and at different developmental stages and, therefore, for increasing the density of gene markers on linkage maps.

In fact, many EST-SSR (approximately 25%) show homology to known genes when searched using tBLASTx. One example is the CN782436 EST, found in planktonic veliger larvae cDNA libraries, which carries the SSR (GA)14A(AG)5 and is highly homologous to Drosophila melanogaster CG5268-PA protein. Another example is the CK484160 EST, which is similar to Mus musculus AA987161 protein. SSR markers in known gene sequences also provide an opportunity to investigate the correlation between the repeat number and the functional aspects of the genes themselves.
An economically significant phenotypic variation for grain quality has been associated with the expansion of a dinucleotide CT microsatellite in the 5′-untranslated region of the waxy gene in rice.23 Work on the mapping and cloning of several human hereditary disease genes has revealed that trinucleotide SSR sequences appear to be the root cause of these genetic abnormalities. In Kennedy’s disease, a (CAG)n repeat in the coding sequence of an androgen receptor gene increases from the normal copy number of n = 11–31 to n = 40–60. The Huntington’s disease gene has a (CAG)n repeat that increases from a normal copy number of n = 11–34 to n > 50 in cases of the disease.24

In conclusion, the EST-derived SSR have some obvious advantages, such as being easy to search (by electronic filtering), abundant, unbiased in repeat type and present in gene-rich areas. The reported bay scallop EST might aid in the development of polymorphic SSR.

ACKNOWLEDGMENTS

This research was mainly supported by grants from the ‘863’ Hi-Tech Research and Development Program of China (2003AA603022 and 2005AA603220) and the Natural Science Foundation of China (30300268).

REFERENCES

1. Gupta PK, Varshney RK, Sharma PC, Ramesh B. Molecular markers and their applications in wheat breeding. Plant Breeding 1999; 118: 369–390.
2. Toth G, Gaspari Z, Jurka J. Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res. 2000; 10: 967–981.
3. Takagi M, Chow S, Okamura T, Scholey V, Nakazawa A, Margules D, Wexler JB, Taniguchi N. Mendelian inheritance and variation of four microsatellite DNA markers in the yellowfin tuna Thunnus albacares. Fish. Sci. 2003; 69: 1306–1308.
4. Takagi M, Sato J, Monbayashi C, Aoki K, Tsuji T, Hatanaka H, Takahashi H, Sakai H. Evaluation of microsatellites identified in the tiger puffer Takifugu rubripes DNA database. Fish. Sci. 2003; 69: 1085–1095.
5. Bernatchez L, Duchesne P. Individual-based genotype analysis in studies of parentage and population assignment: how many loci, how many alleles? Can. J. Fish. Aquat. Sci. 2000; 57: 1–12.
6. Zhivotovsky LA, Feldman MW. Microsatellite variability and genetic distances. Proc. Natl. Acad. Sci. U.S.A. 1995; 92: 11549–11552.
7. Rico C, Rico I, Hewitt G. 470 million years of conservation of microsatellite loci among fish species. Proc. R. Soc. Lond. B Biol. Sci. 1996; 263: 549–557.
8. Waldbieser GC, Bosworth BG, Nonneman DJ, Wolters WR. A microsatellite-based genetic linkage map for channel catfish, Ictalurus punctatus. Genetics 2001; 158: 727–734.
9. Yue GH, Ho MY, Orban L, Komen J. Microsatellites within genes and ESTs of common carp and their applicability in silver crucian carp. Aquaculture 2004; 234: 85–98.
10. Serapion J, Kucuktas H, Feng J, Liu Z. Bioinformatic mining of type I microsatellites from expressed sequence tags of channel catfish (Ictalurus punctatus). Mar. Biotechnol. 2004; 6: 364–377.
11. Zhang X, Liang Y, Liu R, Wang L, Yang B. The genetic diversity of reared populations of bay scallop Argopecten irradians (Lamarck). Acta Oceanol. Sinica. 2002; 24: 107–113.
12. Sambrook J, Fritsch EF, Maniatis T. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York. 1989.
13. Smith JSC, Chin ECL, Shu H, Smith OS, Wall SJ, Senior ML, Mitchell SE, Kresovich S, Ziegele J. An evaluation of the utility of SSR loci as molecular markers in maize (Zea mays L.): comparisons with data from RFLPs and pedigrees. Theor. Appl. Genet. 1997; 95: 163–173.
14. Hackauf B, Wehling P. Identification of microsatellite polymorphism in an expressed portion of the rye genome. Plant Breeding 2002; 121: 17–25.
15. Scott KD, Eggler P, Seaton G, Rossetto M, Ablett EM, Lee LS, Henry RJ. Analysis of SSRs derived from grape ESTs. Theor. Appl. Genet. 2000; 100: 723–726.
16. McGoldrick DJ, Hedgecock D, English LJ, Baoprasertkul P, Ward RD. The transmission of microsatellite alleles in Australian and North American stocks of the Pacific oyster (Crassostrea gigas): selection and null alleles. J. Shellfish. Res. 2000; 19: 779–788.
17. Groben R, Wricke G. Occurrence of microsatellites in spinach sequences from computer database and development of polymorphic SSR markers. Plant Breeding 1998; 117: 271–274.
18. Zhang T, Liu P, Meng X, Wang W, Kong J, Wang Q. Common microsatellite primers for molecular markers in different shrimp species. Hitech. Lett. 2003; 11: 80–85.
19. Chen S, Bao Z, Pan J, Hu J. Analysis of the population genetic diversity and species specific markers in four scallop species, Patinopecten yessoensis, Argopecten irradians, Chlamys nobilis and C. farreri. Acta Oceanol. Sinica. 2005; 27: 1–4.
20. Weber JL. Informativeness of human (dC-dA)n (dG-dT)n polymorphisms. Genomics 1990; 7: 524–530.
21. Vignal A, Milan D, SanCristobal M, Eggen A. A review on SNP and other types of molecular markers and their use in animal genetics. Genet. Sel. Evol. 2002; 34: 275–305.
22. Liu ZJ, Tan G, Li P, Dunham RA. Transcribed dinucleotide microsatellites and their associated genes from channel catfish Ictalurus punctatus. Biochem. Biophys. Res. Commun. 1999; 259: 190–194.
23. Ayres NM, McClung AM, Larkin PD, Bligh HFJ, Jones CA, Park WD. Microsatellites and a single nucleotide polymorphism differentiate apparent amylose classes in an extended pedigree of US rice germplasm. Theor. Appl. Genet. 1997; 94: 773–781.
24. Rubinsztein DC. Trinucleotide expansion mutations cause diseases which do not conform to classical Mendelian expectations. In: Goldstein DB, Schlotterer C (eds). Microsatellites: Evolution and Applications. Oxford University Press, New York. 1999; 80–97.