Copyright © 1996 by the Genetics Society of America

Degree of Selective Constraint as an Explanation of the Different Rates of Evolution of Gender-Specific Mitochondrial DNA Lineages in the Mussel Mytilus

Donald T. Stewart,* Ellen R. Kenchington,*,† Rama K. Singh‡ and Eleftherios Zouros*,§

*Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1, Canada, †Department of Fisheries and Oceans, Halifax, Nova Scotia B3J 2S7, Canada, ‡National Research Council Institute for Marine Biosciences, Halifax, Nova Scotia B3H 3Z1, Canada and §Department of Biology, University of Crete and Institute of Marine Biology of Crete, Iraklion, Crete, Greece

Manuscript received August 12, 1995
Accepted for publication April 9, 1996

ABSTRACT

Mussels of the genus Mytilus segregate for a maternally transmitted F lineage and a paternally transmitted M lineage of mitochondrial DNA. Previous studies demonstrated that these lineages are older than the species of the M. edulis complex and that the M lineage evolves faster than the F lineage. Here we show that the latter observation also applies to a region of the molecule with no assigned function. Sequence data for the mitochondrial COIII gene and the “unassigned” region of the F and M lineages of M. edulis and M. trossulus are used to evaluate various hypotheses that may account for the faster rate of evolution of the M lineage. Tests based on the proportion of synonymous and nonsynonymous substitutions suggest that the M lineage experiences relatively relaxed selection. Further support for this hypothesis comes from an examination of COIII amino acid substitutions at sites defined as either conserved or variable based on the pattern of variation in other mollusks and Drosophila. Most substitutions in the M lineage occur in regions that are also variable among non-Mytilus taxa. We suggest that these differences in selection pressure are a consequence of doubly uniparental mitochondrial DNA transmission in Mytilus.

The mitochondrial DNA (mtDNA) of mussels of the genus Mytilus is characterized by a number of unusual features. The first to be described was a high frequency of mtDNA heteroplasmy (FISHER and SKIBINSKI 1990; HOEH et al. 1991) that was subsequently attributed to a peculiar mode of mitochondrial DNA transmission (ZOUROS et al. 1992, 1994a,b; SKIBINSKI et al. 1994a,b). Female mussels normally possess one type of mtDNA that they transmit to both daughters and sons. In contrast, males are heteroplasmic for two types of mtDNA genomes, the female or “F” type that they receive from their mother but transmit to no offspring, and the male or “M” type that they receive from their father and transmit to sons. This system of mtDNA transmission has been termed “doubly uniparental inheritance” or DUI (ZOUROS et al. 1994a). Provided that the M and F genomes do not recombine, DUI results in the formation of distinct male and female mtDNA lineages (see commentary by HURST and HOEKSTRA 1994). RAWSON and HILBISH (1995) and STEWART et al. (1995) have indeed demonstrated the presence of gender-associated lineages in two species of mussels, M. edulis and M. trossulus, and shown that these lineages arose before the divergence of these two taxa from their common ancestor.

Corresponding author: Donald T. Stewart, Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4J1, Canada. E-mail: [email protected]
Genetics 143: 1349-1357 (July, 1996)

Mytilus mtDNA is also noteworthy for its gene arrangement, which is radically different from that of other metazoans (HOFFMANN et al. 1992). Recently, the complete mtDNA sequence was obtained for another mollusk, the chiton Katharina tunicata (BOORE and BROWN 1994a). In contrast to Mytilus, the gene order of Katharina mtDNA is similar to arthropod, chordate, and echinoderm mtDNA (BOORE and BROWN 1994b). Highly divergent gene order may not be unique to Mytilus, however. The protein and rRNA gene order in two land snails, Cepaea nemoralis (TERRETT et al. 1994) and Albinaria coerulea (HATZOGLOU et al. 1995), is different from the usual arrangement of Katharina as well as from that of Mytilus. HOFFMANN et al. (1992) have also noted that Mytilus mtDNA is much more divergent from vertebrate or other invertebrate mtDNA at the sequence level than the latter two are from each other.

Following the discovery of DUI in Mytilus, several authors observed that the M type is more divergent than the F type (SKIBINSKI et al. 1994b; RAWSON and HILBISH 1995; STEWART et al. 1995). A similar observation was made by LIU et al. (1996) in the freshwater mussel Anodonta (= Pyganodon) grandis, which also possesses doubly uniparental mtDNA inheritance. All these authors suggested that because the male germ line is the only tissue where the M type occurs in isolation from the F type, selective constraints on the M type may be relaxed relative to the F type and this may lead to a higher rate of nucleotide substitution in the M lineage. However, as noted by these authors, several other hypotheses can be proposed to account for the accelerated rate of evolution


D. T. Stewart et al.

of the M genome. These include (1) a larger number of genome replications during spermatogenesis, (2) a greater level of free-radical damage in sperm, (3) positive selection on the M genome, and (4) a smaller effective population size for the M genome.

The primary objective of this paper is to evaluate the various hypotheses that could account for the faster evolutionary rate of the M lineage in the M. edulis species complex. To this end we obtained additional sequence information for the most common F and M types of M. edulis and M. trossulus by extending the known nucleotide sequence of the COIII gene and by obtaining the nucleotide sequence of a segment of the mtDNA genome for which no function has yet been assigned (segment #2; HOFFMANN et al. 1992). This information was used to reaffirm the faster rate of evolution of the M lineage and to test whether the ratio of synonymous to nonsynonymous substitutions in the COIII gene in the two gender-associated lineages is compatible with the model of neutral molecular evolution. To further characterize the nature of amino acid substitutions in the Mytilus sequences, we compared the pattern of amino acid substitution in the M and F lineages with the pattern of variation found in COIII amino acid sequences from a cross-section of mollusk species and Drosophila. We conclude that the M mtDNA lineage is experiencing relaxed selection relative to the F lineage and suggest that the difference of selection constraint on the two lineages may be a consequence of DUI.

MATERIALS AND METHODS

Sequencing protocol: Methods for obtaining the sequence of an 813-bp segment of the cytochrome c oxidase subunit III (COIII) for the most common M and F mitotypes of M. edulis and M. trossulus were described previously (STEWART et al. 1995). These four types, formerly known as FB and M in M. edulis and as N and S in M. trossulus (FISHER and SKIBINSKI 1990; ZOUROS et al. 1992), have been renamed F-ed1, M-ed1, F-tr1 and M-tr1, respectively, to denote their gender and species affiliation (STEWART et al. 1995). Briefly, samples were amplified using Mytilus-specific primers (STEWART et al. 1995) and sequenced (in both directions) employing Taq dye-terminator chemistry on an ABI automated sequencer. Amino acid sequences were inferred from the nucleotide sequences using the Drosophila mtDNA genetic code as recommended by HOFFMANN et al. (1992) and BOORE and BROWN (1994a). The corresponding COIII sequences for the following taxa were obtained from the GenBank database or communicated to us directly: the bivalve Lasaea australis (Ó FOIGHIL and SMITH 1995), the chiton K. tunicata (BOORE and BROWN 1994a), the land snail Albinaria turrita (G. RODAKIS, unpublished results), the cephalopod Octopus bimaculatus (BARRIGA SOSA et al. 1995) and the fruit fly D. yakuba (CLARY and WOLSTENHOLME 1985).

The Mytilus mitotypes were also sequenced for a segment of the mtDNA genome (located between the cytochrome b and cytochrome c oxidase subunit II genes) for which no function has yet been assigned (i.e., segment #2; HOFFMANN et al. 1992). Two primers, Seg#2-FOR (5'-GGAAAGMTACCCACACTC-3') and Seg#2-REV (5'-CCTCAGCTATAAAACCCTA-3'), were used to amplify and sequence this region in both directions as described above.
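As a rough illustration of the translation step, the sketch below infers amino acids from a nucleotide string using the invertebrate mitochondrial code (NCBI translation table 5), which is assumed here to stand in for the Drosophila mtDNA code named above. That table differs from the standard code in reading AGA and AGG as serine, ATA as methionine and TGA as tryptophan. The function names and the short example sequence are hypothetical and are not taken from the study.

```python
# Minimal sketch: translate DNA with the invertebrate mitochondrial code
# (NCBI translation table 5). Names and example input are illustrative only.

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order for each position.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_invertebrate_mito_table():
    """Return a codon -> amino acid dict for NCBI translation table 5."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    table = dict(zip(codons, STANDARD))
    # Departures from the standard code in invertebrate mitochondria.
    table.update({"AGA": "S", "AGG": "S", "ATA": "M", "TGA": "W"})
    return table


def translate(dna, table=None):
    """Translate in frame 1; incomplete trailing codons are ignored,
    codons with ambiguous bases are reported as 'X'."""
    if table is None:
        table = build_invertebrate_mito_table()
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        protein.append(table.get(dna[i:i + 3], "X"))
    return "".join(protein)


# ATG, ATA -> Met; TGA -> Trp; AGA -> Ser under this code.
print(translate("ATGATATGAAGA"))  # prints MMWS
```

Under the standard code the same string would read Met-Ile-stop-Arg, which is why the choice of genetic code matters before any amino acid comparison is made.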

Genetic distance and phylogenetic analysis: Sequences were aligned using the ClustalV computer program and analyzed using the MEGA package (KUMAR et al. 1993). The COIII nucleotide sequences were used to obtain the Jukes-Cantor corrected proportion of synonymous (KS) and nonsynonymous (KA) substitutions for all pairwise combinations. Similarly, the Poisson-corrected proportion of amino acid substitutions was obtained for all pairwise combinations of sequences shown in Table 1. Pairwise proportions of nucleotide substitutions (K) for segment #2 were calculated using Kimura's two-parameter correction for multiple hits. Gaps in the sequences were eliminated in pairwise comparisons. Unrooted trees were constructed for the four Mytilus mitotypes from the COIII amino acid sequence data and from the segment #2 nucleotide sequence data using both parsimony (PAUP 3.1.1; SWOFFORD 1991) and neighbor-joining methods (SAITOU and NEI 1987). To determine the level of support for resulting phylogenetic groupings, both the parsimony and neighbor-joining analyses were conducted on 1000 bootstrapped data sets.

Secondary structure analysis: Potential secondary structures formed by the nucleotide sequences of segment #2 were estimated by the method of ZUKER and STIEGLER (1981) using the program PCFOLD version 4.0. Observed free energy values for the four main Mytilus mitotypes were compared with 10 randomized sequences of the same length and proportion of nucleotides.

Nucleotide substitution tests: KS and KA values for the two M mitotypes (M-ed1 and M-tr1) and the two F mitotypes (F-ed1 and F-tr1) were used to compare synonymous and nonsynonymous nucleotide substitution rates of the M and F mtDNA lineages at the COIII gene. In the absence of a suitable outgroup, these numbers were compared by assuming that because the M and F types were sampled from two distinct species, the time of divergence of the two M mitotypes from each other is equal to the time of divergence of the two F mitotypes.
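The three multiple-hit corrections named in the genetic distance analysis above have simple closed forms, sketched below in Python under their usual textbook definitions. The study itself relied on the MEGA package, so this is only an illustration and not the authors' implementation. The function names are hypothetical, and the example counts (6 differing residues over 271 aligned sites) are assumed inputs chosen for illustration rather than figures reported in the text.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor distance from an observed proportion p of differing sites."""
    arg = 1.0 - 4.0 * p / 3.0
    if arg <= 0:
        raise ValueError("p too large for the Jukes-Cantor correction")
    return -0.75 * math.log(arg)


def kimura_2p(P, Q):
    """Kimura two-parameter distance from the observed proportions of
    transitions (P) and transversions (Q)."""
    a, b = 1.0 - 2.0 * P - Q, 1.0 - 2.0 * Q
    if a <= 0 or b <= 0:
        raise ValueError("P and Q too large for the two-parameter correction")
    return -0.5 * math.log(a) - 0.25 * math.log(b)


def poisson_distance(p):
    """Poisson-corrected number of amino acid substitutions per site."""
    if not 0 <= p < 1:
        raise ValueError("p must lie in [0, 1)")
    return -math.log(1.0 - p)


def poisson_se(p, n):
    """Standard error of the Poisson-corrected distance for n compared sites,
    using the large-sample variance p / ((1 - p) * n)."""
    return math.sqrt(p / ((1.0 - p) * n))


# Hypothetical example: 6 amino acid differences over 271 aligned sites.
p = 6 / 271
print(round(poisson_distance(p), 4), round(poisson_se(p, 271), 4))
# prints 0.0224 0.0091
```

All three corrections grow faster than the raw proportion of differences, so they matter most for the deep comparisons between Mytilus and the other taxa and change the closely related Mytilus pairs only slightly.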
We used MCDONALD and KREITMAN'S (1991) test to examine whether the pattern of nucleotide substitution is compatible with the model of neutral evolution. Neutral evolution predicts that the ratio of the number of sites that are polymorphic within one (or more than one) lineage to the number of sites that are fixed for different nucleotides in separate lineages must be the same for synonymous and nonsynonymous sites. For this purpose, we used the data given in STEWART et al. (1995) that consist of 321 bp of the COIII gene sequenced from three F-ed, four F-tr, two M-ed and three M-tr mitotypes. The M sequences from M. edulis and M. trossulus were regarded as representing two lineages of the same cluster. The same assumption was made for the F sequences. MCDONALD and KREITMAN'S test requires a random sample of sequences. Although the sample we have used was not random, this should not have a significant effect on the result of the test since STEWART et al. (1995) made a conscientious effort to sequence a cross-section of the diversity of mitotypes present in these species.

Amino acid substitution tests: An amino acid site was classified as “conservative” if it was occupied by the same amino acid residue in all non-Mytilus species (Table 1). Otherwise it was classified as “variable”. The distribution of conservative and variable sites and the distribution of amino acid replacements along the COIII gene was examined by dividing the gene into eight segments, seven of 25 amino acid sites and one of 21 amino acid sites. This division, starting from site #15 and ending at site #220 (i.e., the first and last sites for which amino acids for all eight sequences were available; Table 1), provides a compromise between having enough segments for a meaningful comparison and having each segment large enough to be representative of a substantial portion of the gene. The following information was recorded for each segment: (1) the number of conservative sites (as defined


Molecular Evolution in Mytilus

TABLE 1
Amino acid alignment of the cytochrome c oxidase subunit III gene of four Mytilus sequences with four other mollusk species and D. yakuba

F-ed1 F-tr1 M-ed1 M-tr1 Lasaea Katharina Albinaria Octopus Drosophila

PSPWPFFVAI SANGMAVGLI LWLHRTP-SF LLMGMSLVCM LLSTFSWWRD LIREGD-IGF [ 6 0 ] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . G.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . G. . . . .L..... . . . . . S...Y . . . . . . IS.. M . . . . . . . . . . . . . . . .M.. . . . . . .L.G. . . . . S..... . . . . . S...Y . . . S.MMG.. I..M....... . . . . . . . .L . . . . .IVAS. A.LTLVS.IL S.V.NGAVVL E.FLIAFILL G.TMVA..G. V.N.STYL.C F....LVGSMG.FCLT . . .A A.F:-GF.L F.VWVGVIL1 . . TMVQ . . . . V....TFQ.Y Y....LL.SL .LTS.PL... YYIR”FG.1 Y.LLYGMLLT SIIAYM . . . . IV..ATYQ.H ? ? ? ? ? ? ? ? ? ? ????LTT..S S.F.NYEN.- -.IFFG.LL. I.TMIQ . . . . I...STFQ.. Y....LTG..G.MTTVS.MV K.F:-QYDI S.FVLGNIIT I.TVYQ . . . . VS . . . TYQ.L

F-ed1 F-tr1 M-ed1 M-tr1 Lasaea Katharina Albinaria Octopus Drosophila

HTRFVIKSFR DGVALFILSE VMFFFTFFWT FFHNALSPSC ELGMRWPPPG IRTPNPSSTS [120]

F-ed1 F-tr1 M-ed1 M-tr1 Lasaea Katharina Albinaria Octopus Drosophila

LFETGLLISS GLFVTQAHKS MRLKDYDVGP FIGLVVTIVC GTVFFLVQLR EYYWNSYTIA [180] . . . . . . . . . . .YS . . . . . . . . . . . . . . . . . . . . . . . . . L. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . LM..L. .A . . . . . .V. . . C....S.. .................... 1.A . . . . . . . I...L...L. .I . . . . . .V. . . . . . . . . .S .LN.AV.L.. .VS..W..YA I.DWN-RTQA1EA.SI.VIL .CW.T.L.AE ..HSA.F..S .LN.AV.LA. .VS..W..H. L1DG.-QG.A N.S.LT.VIL .AY.TFL.AG . . LET.F . . . .LN.SV.LL. .VSI.W..HA LTEGK.YS-T LS..FF.VLL .VY.LML.YG ..NET.FS.. .LN.AI.LA. .VS..W..H.LMNNNLKSAT HSMII-..SL .FY.TIL.ML ..MEA.FS.S .LN.AI.LA. .VT..W..H. LMENN-HSQT TQ..FF.VLL .IY.TIL.AY . . IEAPF . . .

F-ed1 F-tr1 M-ed1 M-tr1 Lasaea Katharina Albinaria Octopus Drosophila

DSVYGSVFYL LTGFHGMHVV VGTIWLMVSL VRLWRGEFSS -QRHFGFEAC IWYWHFVDVV . . . . . . . . . . . . . . . . . . . . . . . L...... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .A... . . .V..L... G . . . . . . . . . . . . . . .L... . . . . . . . . . . .GS...L.FVM . . . . . . . .L I..LF.L.G. ..TI.YH..VGHN.V.L.VA . . . . . . . . . . . .C...T.FVA . . . . . F..L ..SLF.L.T.W.NFSCH . . . SH- . . . . . .A A. . . . . . . . . .G....T.FMA . . . . .L..M . . . LF.F.N. ..TYYYH..T TH-.V..L.A A ......... . . I...T.FV A . . . . . L.. 1I.STF.FMC. L.ILMNH . . . S ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? . . I...T.FM A . . . . . I..L I..TF.L.C. L.HLNNH..K NH- . . . . . . A .A. . . . . . . .

F-ed1 F-tr1 M-ed1 M-tr1 Lasaea Katharina Albinaria Octopus Drosophila

WVALWCLVYV WFGG [254]

................ Y................... Y...A...... . . . . . . . . . ..SL.ARGL. V.M ....... ..LR.SRGL. W.MI . . . T.. ..TY.V.GLK L.M ....... ..SK.YNGL. W.MM . . . I.. ,.YA.TIGL. W.MI ......

. . . . . s.... . . . . . . . . . . . . . . . . . . . . . S...A . . . . . . . . . . . . . . . . . . . . . . . . . S...A . . . . . . . . . . . . . . . . . . . . .F..VS...A . . . LS.GS-L SE.GS . . . M.

.........

.......... ..........

.LPI.AFGVP .L...A...AY..SS.A.CL .I.SC...Q.VAPL..FQVP .C...A...AY..SS.A..I .I.SV...I..TVL.VFQVP .C...A...AY..SS.A.NM DI.SC . . . IY .FPL..FQIP .L..VS...A . . . SS . . .AI . . .AS . . .M. .ISF..FQIP

[240]

.............. . . S..FV... . . . . . . T...V... . . . . . . F . F IF . . - . W . S .LF.YISI.- .W.S .LF.YISI.-.W.S

?????????? ????

. LF . YITI . - .W. .

F-ed1, F-tr1, M-ed1 and M-tr1 indicate the gender and species affiliation of the four most common mitotypes of Mytilus. Dots represent agreement with F-ed1. ?, sequence information was not available at this site.

above), which can be considered as an index of the “conservativeness” of the segment; (2) the number of sites at which the two M sequences had different amino acids (i.e., polymorphic sites within the M lineage); and (3) the number of sites at which the two F sequences had different amino acids (i.e., polymorphic sites within the F lineage).

RESULTS

Higher rate of molecular evolution in the M lineage: The sequences for 813 bp of the COIII gene (~85% of the gene) and 118 bp of the region of unassigned function for the most common F and M mitotypes of M. edulis and M. trossulus have been deposited in the GenBank database (accession numbers U50212-U50219). Table 1 presents the inferred COIII amino acid sequences of the M and F Mytilus mitotypes aligned against the four other mollusk species and Drosophila. Amino acid distances for all pairwise amino acid sequences are given in Table 2. Phylogenetic analysis of


TABLE 2
Poisson-corrected amino acid distances of the sequences presented in Table 1

F-ed1

F-ed1 F-tr1 0.0224 M-ed1 0.0927 M-tr1 0.1298 Lasaea 0.7949 Katharina 0.7346 Albinaria 0.7868 Octopus 0.8490 0.8607 Drosophila 0.7517

F-tr1

M-ed1

M-tr1

Lasaea

Katharina

Albinaria

Octopus

Drosophila

0.0091

0.0189 0.0194

0.0226 0.0222 0.0202

0.0700 0.0683 0.0700 0.0677

0.0664 0.0653 0.0659 0.0669 0.0522 -

0.0697 0.0680 0.0692 0.0709 0.0599 0.0492

0.4700 0.4130

-

0.0824 0.0816 0.0816 0.0824 0.0663 0.0504 0.0592

0.5326

-

0.0675 0.0664 0.0675 0.0664 0.0535 0.0464 0.0513 0.0533

-

0.0968 0.1257 0.7685 0.7178 0.7604

-

0.1050 0.7949 0.7598 0.7262 0.7432 0.7779 0.8049 0.84900.6354 0.8607 0.45 0.7346 0.5028 0.4299 0.5367 0.7346 0.7517

-

0.5162 0.634'2

13

-

Distances are given below the diagonal, standard errors above the diagonal. Gaps and the missing sites in the Octopus sequence were removed only in pairwise comparisons.

the amino acid sequences and the nucleotide sequences of segment #2 for the M and F mitotypes (Figure 1) produced results similar to those previously published (RAWSON and HILBISH 1995; STEWART et al. 1995), i.e., (1) the four Mytilus sequences cluster by gender rather than by species (with bootstrap support ≥97%) and (2) the M lineage evolves faster than the F lineage. The relative-rates test (WU and LI 1985) is the most appropriate way of comparing nucleotide substitution rates along independently evolving lineages, but this test requires the use of an outgroup that is closely related to the clade to which the compared lineages belong. For Mytilus, the closest available COIII sequence

FIGURE 1. Unrooted neighbor-joining trees of the two major male (M-ed1 and M-tr1) and the two major female (F-ed1 and F-tr1) mitotypes in M. edulis and M. trossulus, respectively. Genetic distance matrices used to produce these trees were based on (A) 271 amino acid residues of the cytochrome c oxidase subunit III gene and (B) 119 bp of nucleotide sequence from a region of unassigned function. Numbers indicate branch lengths (unbracketed) and percentage bootstrap support (bracketed). Parsimony analyses yielded the same topologies.

is that of the bivalve L. australis (Ó FOIGHIL and SMITH 1995). The lineages leading to Lasaea and Mytilus have been separated by more than 500 million years and the average nucleotide distance between the Lasaea and the Mytilus F or M COIII gene is 0.56 (STEWART et al. 1995). The amino acid divergence for the same gene is 0.78 (Table 2) and, indeed, this divergence is not different from the divergence of Mytilus from Drosophila or from the other mollusks. The high degree of divergence invalidates the use of any of these sequences in the relative-rates test. However, the substitution rates can be compared in our case because the F and M lineages are found in each of two closely related species. This allows us to assume that the F sequences of M. edulis and M. trossulus diverged from each other at the same point in time as the M sequences of these two species. Under this assumption, the Poisson-corrected number of amino acid substitutions per site (A) is directly proportional to the substitution rate, so the values for the two lineages can be compared directly. The distances between F-ed1 and F-tr1 and between M-ed1 and M-tr1 are AF = 0.022 ± 0.009 and AM = 0.105 ± 0.020, respectively. Since these two values do not overlap within ±2.8 SE, the probability that they are equal is