Gene Duplication, Gene Conversion and the Evolution of the Y ...

Copyright © 2010 by the Genetics Society of America DOI: 10.1534/genetics.110.116756

Gene Duplication, Gene Conversion and the Evolution of the Y Chromosome

Tim Connallon1 and Andrew G. Clark

Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853-2703

Manuscript received March 17, 2010
Accepted for publication May 31, 2010

ABSTRACT

Nonrecombining chromosomes, such as the Y, are expected to degenerate over time due to reduced efficacy of natural selection compared to chromosomes that recombine. However, gene duplication, coupled with gene conversion between duplicate pairs, can potentially counteract forces of evolutionary decay that accompany asexual reproduction. Using a combination of analytical and computer simulation methods, we explicitly show that, although gene conversion has little impact on the probability that duplicates become fixed within a population, conversion can be effective at maintaining the functionality of Y-linked duplicates that have already become fixed. The coupling of Y-linked gene duplication and gene conversion between paralogs can also prove costly by increasing the rate of nonhomologous crossovers between duplicate pairs. Such crossovers can generate an abnormal Y chromosome, as was recently shown to reduce male fertility in humans. The results represent a step toward explaining some of the more peculiar attributes of the human Y as well as preliminary Y-linked sequence data from other mammals and Drosophila. The results may also be applicable to the recently observed pattern of tetraploidy and gene conversion in asexual, bdelloid rotifers.

Supporting information is available online at http://www.genetics.org/cgi/content/full/genetics.110.116756/DC1.
1 Corresponding author: Department of Molecular Biology and Genetics, Cornell University, Biotechnology Bldg. (Room 227), Ithaca, NY 14853-2703. E-mail: [email protected]
Genetics 186: 277–286 (September 2010)

NONRECOMBINING chromosomes are often associated with genetic degradation and a loss of functional genes, and nowhere is this pattern more exaggerated than on the Y chromosome (Charlesworth and Charlesworth 2000; Bachtrog 2006). However, in addition to the more widely recognized pattern of gene loss, genome sequences of mammals and Drosophila are also yielding evidence for Y-linked functional gene gain followed by amplification of duplicate genes (Skaletsky et al. 2003; Koerich et al. 2008; Carvalho et al. 2009; Krsticevic et al. 2009; Hughes et al. 2010). Duplication and retention of functional Y-linked gene copies is somewhat surprising because evolutionary theory predicts an opposing pattern. First, to the extent that gene duplicates are fixed via positive selection, they are less likely to become fixed on nonrecombining relative to recombining chromosomes (Otto and Goldstein 1992; Clark 1994; Yong 1998; Otto and Yong 2002; Tanaka and Takahasi 2009). Second, regardless of whether Y-linked duplicates become fixed via genetic drift or by natural selection, the actions of Muller’s ratchet, genetic hitchhiking, and background selection are expected to greatly increase the probability that Y-linked genes degenerate into nonfunctional pseudogenes (Charlesworth and Charlesworth 2000; Bachtrog 2006; Engelstadter 2008).

The issue is more complex when one considers data from the well-characterized human Y chromosome. A majority of functional Y-linked genes are members of duplicate gene pairs residing within large palindromes and are almost exclusively testis expressed (Skaletsky et al. 2003). In contrast to many of the single-copy genes with X-linked homologs, members of Y-linked gene families are apparently not degenerating, but rather have become fixed and maintained over many millions of years (Skaletsky et al. 2003; Yu et al. 2008). Although Y chromosomes are not well characterized in other taxa, currently available data suggest that duplication is a common feature of Y chromosomes in other mammal species as well as Drosophila (Rozen et al. 2003; Verkaar et al. 2004; Murphy et al. 2006; Alföldi 2008; Wilkerson et al. 2008; Krsticevic et al. 2009; Geraldes et al. 2010). Thus, patterns of gene duplication and retention, for at least a subset of Y-linked genes, may be a general rule of Y chromosome evolution.

Another attribute of the mammalian Y appears to be relevant for duplicate gene evolution. Comparative analysis between humans and chimpanzees suggests ongoing recombination between the gene duplicate pairs that reside on the same Y chromosome. Such “intrachromosomal” recombination includes both nonreciprocal (gene conversion) and reciprocal exchange (crossing over) between gene duplicate pairs (Rozen et al. 2003; Lange et al. 2009). Gene conversion between


T. Connallon and A. G. Clark

the duplicates potentially maintains gene function by counteracting stochastic forces of Y chromosome degeneration (Rozen et al. 2003; Charlesworth 2003; Noordam and Repping 2006). The rationale behind this hypothesis is subtle. As with other clonally inherited chromosomes, each evolutionary lineage of the Y is physically coupled to, and its evolutionary fate is influenced by, the presence of deleterious mutations. Mutation-bearing lineages represent evolutionary dead ends unless they can somehow remove or compensate for deleterious mutations. Recombination between duplicates can “rescue” functionality via gene conversion between functional and nonfunctional copies. On the other hand, double-strand DNA breaks, which precede gene conversion events (Marais 2003), also precede crossing over. Crossovers between Y-linked genes can generate acentric and dicentric Y chromosomes, resulting in infertility and disruption of the sex determination pathway (e.g., Repping et al. 2002; Heinritz et al. 2005; Lange et al. 2009). Considering both gene conversion and crossing over on the Y, recombination can be viewed as a factor that either constrains (via gene conversion) or promotes (via crossing over) Y chromosome degeneration.

These observations concerning Y chromosome gene content and recombination raise interesting questions that have not been formally addressed by evolutionary theory (but see the recent study by Marais et al. 2010). First, what conditions favor the evolutionary invasion of Y-linked gene duplicates, and does recombination influence the probability that duplicates eventually become fixed within a population? Second, what effect does recombination have on Y-linked fitness and the maintenance of functional duplicate genes? To address these questions, we develop and analyze a series of population-genetic models of Y chromosome evolution.
We show that, when direct selection on gene duplicates is weak, biased gene conversion can increase, whereas crossing over will decrease, their probability of fixation. For duplicates with larger fitness effects, the probability of fixation is largely independent of Y-linked recombination. Finally, gene conversion has a major impact on the retention of functional Y-linked genes that are already fixed within the population and maintains multiple gene copies with or without selection favoring these duplicates.

MODEL AND RESULTS

Gene conversion and the invasion of new gene duplicates: We first consider conditions favoring the evolutionary invasion of new Y-linked duplicate genes at low initial frequency within the population. Deterministic invasion dynamics are described for a two-locus model, and it is shown separately that the two-locus model characterizes duplicate gene invasion conditions on a Y chromosome carrying an arbitrary number of genes (see supporting information, File S1). We then develop and analyze a diffusion approximation and perform stochastic simulations to examine the probability that a rare gene duplicate eventually becomes fixed within a population of small size.

Invasion of a new gene duplicate: Consider a single Y-linked locus with a functional allele, A, and a nonfunctional allele, a. Mutation from A to a occurs at rate u per generation and there is no back mutation. By introducing a duplication of the locus, the population is expanded to include five genotypic classes: the original single-copy classes (A and a), those with two functional gene copies (AA), those with one functional and one nonfunctional copy (Aa), and those with two nonfunctional copies (aa). As in the single-locus case, transitions between states (AA → Aa or aA; Aa or aA → aa) can occur by mutation, at rate of u per locus; because there are now two loci, the mutation rate per chromosome is 2u. For Y chromosomes carrying duplicates, recombination (crossing over and gene conversion) can potentially occur between loci. Throughout our analysis, we examine cases where recombination occurs at a rate of d per paralog pair, per generation. The probability that a single recombination event is a crossover, which generates an abnormal (sterile) Y chromosome (e.g., Repping et al. 2002; Heinritz et al. 2005; Lange et al. 2009), is equal to the constant c. The remainder of recombination events (1 − c) represent gene conversion events between duplicate pairs. Gene conversion involving Aa or aA individuals yields AA or aa sperm at rate b and 1 − b, respectively. Thus, b can be viewed as a biased gene conversion parameter, where the functional copy A preferentially replaces the nonfunctional a whenever b > 0.5 (there is no bias when b = 0.5).

Compared to individuals with two functional gene copies, individuals with zero functional copies suffer a fitness reduction of s, while those with one functional copy suffer a reduction of sh, where h is equivalent to a dominance coefficient. Complete masking of a nonfunctional allele occurs when h = 0, and there is no direct fitness benefit of carrying two vs. one functional gene. Partial masking occurs when 1 > h > 0; in such cases, there is a fitness benefit of having two functional copies. Genotypes, genotypic fitness, and zygotic frequencies are described in Table 1.

TABLE 1
Parameterization for the gene duplicate invasion model

Genotype     AA     Aa, aA    A         aa       a        Abnormal Y
Frequency    x11    x10       x1        x00      x0       xs
Fitness      1      1 − sh    1 − sh    1 − s    1 − s    0
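The life cycle of the two-locus model (selection on the Table 1 genotypes, then mutation, then recombination between paralogs) can be sketched in Python as follows. This is a minimal illustration and not the authors' code: the function name `next_generation` and the dictionary keys are hypothetical labels, the key "Aa" pools the Aa and aA genotypes, and terms of order u^2 are dropped, as the model assumes.

```python
def next_generation(x, u, d, c, b, s, h):
    """One generation of the deterministic two-locus model (illustrative sketch).

    x maps genotype labels to zygotic frequencies:
    "AA", "Aa" (Aa and aA pooled), "aa", "A", "a", "sterile" (abnormal Y).
    """
    fitness = {"AA": 1.0, "Aa": 1 - s * h, "A": 1 - s * h,
               "aa": 1 - s, "a": 1 - s, "sterile": 0.0}
    w_bar = sum(x[g] * fitness[g] for g in x)
    # Frequencies after selection (abnormal Y chromosomes leave no offspring).
    p = {g: x[g] * fitness[g] / w_bar for g in x}

    cross = d * c        # recombination event resolved as a crossover
    conv = d * (1 - c)   # recombination event resolved as gene conversion

    new = {}
    # AA: unmutated and no crossover, or a new mutation repaired by conversion.
    new["AA"] = (p["AA"] * ((1 - 2 * u) * (1 - cross) + 2 * u * conv * b)
                 + p["Aa"] * (1 - u) * conv * b)
    # Aa: persists only when no recombination event occurs.
    new["Aa"] = p["AA"] * 2 * u * (1 - d) + p["Aa"] * (1 - u) * (1 - d)
    # aa: second mutation, or conversion toward the nonfunctional copy.
    new["aa"] = (p["AA"] * 2 * u * conv * (1 - b)
                 + p["Aa"] * (u * (1 - cross) + (1 - u) * conv * (1 - b))
                 + p["aa"] * (1 - cross))
    # Crossovers between paralogs produce abnormal (sterile) Y chromosomes.
    new["sterile"] = (p["AA"] + p["Aa"] + p["aa"]) * cross
    # Single-copy haplotypes only mutate.
    new["A"] = p["A"] * (1 - u)
    new["a"] = p["A"] * u + p["a"]
    return new
```

Starting from a population without duplicates at the mutation-selection equilibrium for the single-copy locus, the function should leave the frequencies of A and a unchanged, which gives a quick sanity check on the bookkeeping.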

Gene Conversion and Y Evolution

For a sequence of events of (i) birth, (ii) selection, (iii) mutation, (iv) recombination, and (v) random mating (and ignoring factors of u²), the frequency change of each genotype, per generation, is given by the following six recursions,

$$
\begin{aligned}
x_{11}' &= \frac{x_{11}}{\bar{w}}(1-2u)(1-dc) + \frac{x_{11}}{\bar{w}}\,2ud(1-c)b + \frac{x_{10}(1-sh)}{\bar{w}}(1-u)d(1-c)b \\
x_{10}' &= \frac{x_{11}}{\bar{w}}\,2u(1-d) + \frac{x_{10}(1-sh)}{\bar{w}}(1-u)(1-d) \\
x_{00}' &= \frac{x_{11}}{\bar{w}}\,2ud(1-c)(1-b) + \frac{x_{10}(1-sh)}{\bar{w}}\,u(1-dc) + \frac{x_{10}(1-sh)}{\bar{w}}(1-u)d(1-c)(1-b) + \frac{x_{00}(1-s)}{\bar{w}}(1-dc) \\
x_{s}' &= \frac{x_{11}}{\bar{w}}\,dc + \frac{x_{10}(1-sh)}{\bar{w}}\,dc + \frac{x_{00}(1-s)}{\bar{w}}\,dc \\
x_{1}' &= \frac{x_{1}(1-sh)}{\bar{w}}(1-u) \\
x_{0}' &= \frac{x_{1}(1-sh)}{\bar{w}}\,u + \frac{x_{0}(1-s)}{\bar{w}},
\end{aligned}
$$

where mean fitness is w̄ = x11 + (x10 + x1)(1 − sh) + (x00 + x0)(1 − s). To describe conditions promoting the invasion of duplicates, we analyzed the stability of an evolutionary equilibrium in which duplicated genotypes are absent from the population. Under such a condition, the frequencies x1 and x0 equilibrate to x̂1 = 1 − u(1 − hs)/[s(1 − h)] = 1 − x̂0 and the leading eigenvalue of the stability matrix is

$$
\lambda = \frac{(1-2u)(1-dc) + 2ud(1-c)b + (1-sh)(1-u)(1-d)}{2(1-sh)(1-u)}
+ \frac{\sqrt{\{(1-2u)(1-dc) + 2ud(1-c)b + (1-sh)(1-u)(1-d)\}^{2} - 4(1-2u)(1-dc)(1-d)(1-sh)(1-u)}}{2(1-sh)(1-u)}.
\tag{1a}
$$

Selection favors the invasion of a duplicate when the leading eigenvalue is greater than one (Otto and Day 2007). The magnitude of the leading eigenvalue also represents the strength of selection acting in favor of a rare duplicate gene [i.e., the probability of fixation is proportional to λ (Otto and Bourguet 1999; Otto and Yong 2002); see below for additional details]. Without recombination (d = 0), the leading eigenvalue reduces to

$$
\lambda = \frac{1}{2} + \frac{1 - 2u + |sh(1-u) - u|}{2(1-sh)(1-u)}
\tag{1b}
$$

and evolutionary invasion of a duplicate-bearing Y is favored when sh > u/(1 − u). Duplicates are favored when the direct fitness benefit of additional functional gene copies outweighs the indirect consequences of doubling the deleterious mutation rate, as previously reported for both haploid and diploid systems without recombination (Clark 1994; Otto and Yong 2002; also see Otto and Goldstein 1992).

How does recombination alter the evolutionary dynamics of Y chromosomes? When duplicates do not directly increase fitness (sh = 0), and there is no recombination, selection never favors invasion (Equation 1b above). We can ask whether gene conversion expands the conditions favorable to invasion of a duplicate in a way that is similar to previous models of gene duplication with crossing over (Otto and Yong 2002). By permitting Y-linked recombination between duplicates, and assuming that the crossover rate is zero (dc = 0; hence, all recombination is by gene conversion), the leading eigenvalue can be approximated for low rates of gene conversion (d ≈ 0, per generation),

$$
\lambda = \lambda\big|_{d=0} + d\,\frac{\partial\lambda}{\partial d}\bigg|_{d=0} + O(d^{2}) \approx 1 + d(2b-1),
\tag{1c}
$$
which indicates that selection favors duplicates (λ > 1) when gene conversion is biased toward transmission of functional over nonfunctional gene copies (b > 0.5). Numerical evaluation of Equation 1a indicates that, although higher rates of gene conversion can increase the leading eigenvalue (and hence the probability of invasion), this positive relationship quickly saturates. Thus, a little bit of gene conversion has about as much of an impact on the leading eigenvalue as a high rate of gene conversion does. Nevertheless, the strength of such positive selection (with magnitude of λ − 1) is on the order of the mutation rate (u) and is therefore extremely weak. Stochastic simulations (see below) show that the probability of duplicate fixation is marginally influenced by biased gene conversion alone.

Further analysis of Equation 1a shows that, as with the case of no recombination (Otto and Yong 2002), selection will favor duplicates if they directly increase fitness (sh > 0). Gene conversion (including unbiased gene conversion: b = 0.5) can increase the strength of selection favoring invasion of a duplicate (λ − 1; Figure 1). However, the relative impact of gene conversion is minor when sh ≫ u. In other words, when there are weak direct benefits of having multiple gene copies, the strength of natural selection favoring Y-linked gene duplicates will be enhanced by gene conversion between paralogs. This conclusion holds if the crossover rate between duplicate pairs (dc) is small (Figure 1). As the rate of crossing over increases, the production of abnormal Y haplotypes can generate purifying selection against Y chromosomes that carry gene duplicates.

Why should gene conversion broaden duplicate invasion conditions under weak selection? An intuitive explanation can be reached by considering the recursion

dynamics for a population fixed for the single-gene haplotype. Because this explanation is heuristic, we ignore crossovers and assume that they do not occur (c = 0). The rate of increase for a rare haplotype with two functional gene copies depends on its relative competitiveness against the resident, single-copy haplotype. For initial condition x11 = 1/N and x10 = x00 = 0, the expected proportion of functional duplicate haplotypes (x11) within the gamete pool is E[x11′] = x11[1 − 2u(1 − db)]/w̄, and the duplicate is favored when [1 − 2u(1 − db)]/w̄ > 1. Invasion is clearly facilitated by gene conversion (db > 0). Nevertheless, because the term 2u(1 − db) is extremely small, gene conversion will marginally influence the probability of fixation whenever sh ≫ u.

Figure 1.—Gene conversion can enhance the strength of positive selection for rare duplicate genes, whereas crossovers select against duplicates. Selection coefficient approximations (λ − 1) are based on the leading eigenvalue (Equation 1a), as described and justified in the text, and are presented as a ratio of selection with (d > 0) vs. without recombination (d = 0). Representative results are presented for u = 10⁻⁵ and assume that there is no gene conversion bias (i.e., b = 0.5).

Probability of duplicate fixation: The deterministic model presented above can be modified to describe the evolutionary dynamics in finite populations. Following Otto and Bourguet (1999) and Otto and Yong (2002), the selection coefficient for a rare gene duplicate can be approximated as λ − 1, where λ is the leading eigenvalue of the stability matrix (Equation 1a, above). Given this selection coefficient, the probability that a rare duplicate is eventually fixed can be estimated by diffusion approximation (Kimura 1957, 1962), with drift and diffusion coefficients M = (λ − 1)x(1 − x) and V = x(1 − x)/N, respectively, where x is the frequency of a duplicate-bearing Y haplotype and N is the Y chromosome effective population size. For an initial frequency of 1/N, the probability that a duplicate is fixed will be

$$
\Pr(\text{fixation}) = \frac{1 - e^{2(1-\lambda)}}{1 - e^{2N(1-\lambda)}} \approx \frac{2(\lambda - 1)}{1 - e^{2N(1-\lambda)}}.
\tag{2}
$$
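Equations 1a and 2 are straightforward to evaluate numerically. The sketch below is illustrative only: the function names `leading_eigenvalue` and `fixation_probability` are hypothetical, the neutral case (λ = 1) is handled by returning the limit 1/N, and the square-root argument is clamped at zero purely to guard against floating-point round-off.

```python
import math


def leading_eigenvalue(u, d, c, b, sh):
    """Leading eigenvalue of the stability matrix (Equation 1a)."""
    w = (1 - sh) * (1 - u)  # resident mean fitness at equilibrium
    a = (1 - 2 * u) * (1 - d * c) + 2 * u * d * (1 - c) * b
    e = w * (1 - d)
    trace = a + e
    four_det = 4 * (1 - 2 * u) * (1 - d * c) * (1 - d) * w
    disc = max(0.0, trace ** 2 - four_det)  # clamp round-off error
    return (trace + math.sqrt(disc)) / (2 * w)


def fixation_probability(lam, N):
    """Diffusion approximation (Equation 2) for an initial frequency of 1/N."""
    sel = lam - 1.0  # selection coefficient approximation
    if abs(sel) < 1e-12:
        return 1.0 / N  # neutral limit
    # 1 - exp(-y) is computed as -expm1(-y) for accuracy when y is small.
    return math.expm1(-2 * sel) / math.expm1(-2 * N * sel)


if __name__ == "__main__":
    lam = leading_eigenvalue(u=1e-5, d=1e-4, c=0.0, b=0.5, sh=0.01)
    print(lam, fixation_probability(lam, N=1000))
```

With sh = 0, c = 0 and d much smaller than u, `leading_eigenvalue` should return approximately 1 + d(2b − 1), in line with Equation 1c.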

To assess the validity of Equation 2, we conducted computer simulations that incorporate mutation, selection, and genetic drift. Each simulation was initiated at x11 = 1/N, x0 = u(1 − hs)/(s − sh), and x1 = 1 − x11 − x0. To generate genotypic frequencies for the next generation, N genotypes were randomly drawn from a multinomial distribution, after selection, from the six genotypes described above. Mutation–selection–drift recursions were iterated until the duplicate genotype was either fixed or lost from the population. Equation 2 provides a good approximation for the probability of duplicate fixation over a broad range of parameter space (Figure 2 and Figure S1). As direct selection on a duplicate approaches zero (sh → 0), the probability of fixation approaches 1/N. As direct selection increases in strength (1 ≫ λ − 1 ≫ 1/N), the probability of fixation approaches 2(λ − 1).

Gene conversion had little impact on the probability of duplicate fixation (see Figure S1). As shown above, the leading eigenvalue of the stability matrix is not substantially influenced by gene conversion unless sh is of similar order to u. Even though the selection coefficient approximation (λ − 1) can increase with gene conversion, its absolute magnitude under weak direct selection (sh ≈ 0) will generally be too small for natural selection to be effective, unless of course Nu > 1, which is particularly unlikely for Y-linked loci. Thus, gene conversion is unlikely to significantly enhance the rate of duplicate gene fixation, but can potentially reduce the fixation rate of duplicates if the rate of deleterious crossovers between paralogs is high.

Gene conversion and the maintenance of gene duplicates: A major hypothesis inspired by the human Y chromosome is that gene conversion between duplicates


Figure 2.—The probability of fixation for Y-linked duplicate genes. The solid line depicts the analytical approximation from Equation 2. Circles represent the proportion of duplicate genotypes (out of 100,000 replicate simulations for each data point) that eventually become fixed within the population. Results are shown for d = 0, N = 1000, and u = 10⁻⁵, per locus, per generation. Values of d > 0 yield approximately the same results (see Figure S1).

may prevent the accumulation of mutations and ultimately prevent or slow down Y chromosome degeneration due to Muller’s ratchet (Charlesworth 2003; Rozen et al. 2003; Noordam and Repping 2006). To formally evaluate this possibility, we considered two models for the maintenance of functional Y-linked genes. We first conducted simulations of our two-locus model with initial condition x11 = 1 (a pair of functional duplicates is initially fixed within the population) and analyzed whether gene conversion prevented the loss of one or both of the functional gene copies. Gene conversion between Y-linked paralogs decreased the rate of gene loss under a wide range of fitness conditions, including the extreme case where there was no direct benefit of having two, as opposed to one, functional gene copies (Figure S2). Although gene conversion can substantially reduce the rate of gene loss, the results indicate that loss of completely redundant genes (where sh = 0) will persist under gene conversion, albeit at a substantially reduced rate.

Prior models of Muller’s ratchet generally find that the rate at which deleterious mutations become fixed depends upon both the strength of purifying selection and the number of loci evolving on an asexual chromosome (Charlesworth and Charlesworth 2000; Bachtrog 2008). To account for selection and gene conversion across many loci, we extended our model to describe the degeneration of Y chromosomes carrying an arbitrary number of genes. To permit gene conversion, we assumed that each Y initially carries n distinct gene types, each with a duplicate copy (for a total of 2n loci). Because the increased number of genes greatly expands the number of possible genotypic and fitness states (and consequently the matrix of transition probabilities between states), we made a simplifying assumption that each of the n gene types represents an essential male fertility factor. Males lacking a functional copy of one or more gene types are sterile and comprise a heterogeneous genotypic class with reproductive success of zero. Although the essentiality assumption is useful for modeling purposes, it will often be biologically reasonable because Y-linked genes, at least in mammals and Drosophila, are often essential for male fertility. For example, human Y chromosome microdeletions within Y-palindromic regions are often associated with spermatogenic failure (Noordam and Repping 2006; Lange et al. 2009). In Drosophila melanogaster, mutations in at least three of seven currently Y-annotated genes (kl-2, kl-3, and kl-5, as well as an additional set of unannotated genes: kl-1, ks-1, and ks-2; data obtained from http://flybase.org/) are known to cause male-sterile phenotypes. Nevertheless, the overall agreement between our multilocus and two-locus results (the latter does not assume essentiality; see Figure S2) suggests that a violation of the essentiality assumption is unlikely to strongly affect our conclusions.

For each paralog pair, there are three possible genotypes: both loci functional, one functional and one nonfunctional, and both nonfunctional. Transitions between genotypic states can occur by mutation, by gene conversion, or by crossing over, with crossover yielding an abnormal Y chromosome. For individuals carrying a structurally normal Y, fitness follows the function w = (1 − sh)^k (0)^j, where j refers to the number of gene pairs with both copies nonfunctional, and k refers to the number of pairs where one of the two gene copies is functional (0 ≤ k ≤ n). Individuals with j > 0 and individuals carrying abnormal Y chromosomes are sterile. After selection, the reproductive contribution of an individual with k Y-linked mutations is

$$
x_{k}^{S} = \frac{x_{k} w_{k}}{\bar{w}},
$$

where x_k is the zygotic frequency of k-bearing males, w_k = (1 − sh)^k is the fitness of a male with k mutations, and mean male fitness with respect to the Y is w̄ = Σ_{k=0}^{n} x_k w_k. (The reproductive contribution of sterile individuals is zero.)

To facilitate analytical tractability, we assume that the rates of recombination and mutation are both small enough to ignore multiple mutation and multiple recombination events per generation. In other words, there is a zero probability of an individual with k mutations producing a fertile son with k − 2 or k + 2 mutations. This assumption is justified as long as 2nu ≪ 1 and nd ≪ 1, which requires that the mutation and recombination rate per locus is small, and the number of loci mutable to a nonfunctional allele is much smaller than the reciprocal of the mutation or gene conversion rate: n ≪ min[1/u, 1/d]. Because n represents a small fraction of Y-linked nucleotides (i.e., it represents a very specific functional class), this assumption is biologically reasonable. Nevertheless, a violation of these assumptions is expected to make our results conservative by downwardly biasing the speed of Muller’s ratchet (which is enhanced by a higher mutation rate) and minimizing the positive effect of gene conversion (higher gene conversion rates increasingly counteract Muller’s ratchet).

Extending across the 2n loci, the probability that a Y chromosome experiences one mutation is Pr(M = 1) = 2nu = U. The probability that zero mutations occur is Pr(M = 0) = 1 − U. The probability of a recombination event between one of the n paralog pairs is Pr(R = 1) = nd = D. The probability of no recombination is Pr(R = 0) = 1 − D. Given a sequence of events of (i) birth, (ii) selection, (iii) mutation, (iv) recombination, and (v) random mating, the frequency of fertile males in the next generation follows the recursion

$$
\begin{aligned}
x_{k}' ={}& \left[\frac{x_{k-1}(1-sh)^{k-1}}{\bar{w}}\,\frac{U(n-k+1)}{n} + \frac{x_{k}(1-sh)^{k}}{\bar{w}}\,\frac{Uk + 2n(1-U)}{2n}\right]\left[1 - D + \frac{D(1-c)(n-k)}{n}\right] \\
&+ \left[\frac{x_{k}(1-sh)^{k}}{\bar{w}}\,\frac{U(n-k)}{n} + \frac{x_{k+1}(1-sh)^{k+1}}{\bar{w}}\,\frac{U(k+1) + 2n(1-U)}{2n}\right]\frac{Db(1-c)(k+1)}{n}.
\end{aligned}
$$

The “least-loaded” (k = 0) and “most-loaded” (k = n) classes of fertile males follow the recursion

$$
x_{0}' = \frac{x_{0}}{\bar{w}}\,\frac{(1-U)(1-Dc)\,n + UD(1-c)b}{n} + \frac{x_{1}(1-sh)}{\bar{w}}\left(1 - U + \frac{U}{2n}\right)\frac{D(1-c)b}{n}
$$

and

$$
x_{n}' = \frac{x_{n-1}(1-sh)^{n-1}}{\bar{w}}\,\frac{U(1-D)}{n} + \frac{x_{n}(1-sh)^{n}}{\bar{w}}\,\frac{(2-U)(1-D)}{2},
$$

respectively. The frequency of sterile males in the next generation (via crossover, mutation, or gene conversion) will be

$$
x_{s}' = 1 - \sum_{k=0}^{n} x_{k}'.
$$
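As a consistency check (a worked illustration, not part of the original derivation), the multilocus recursions should collapse to the two-locus model when there is a single paralog pair. Setting n = 1, so that U = 2u and D = d, the least-loaded and most-loaded recursions become

$$
\begin{aligned}
x_{0}' &= \frac{x_{0}}{\bar{w}}\left[(1-2u)(1-dc) + 2ud(1-c)b\right] + \frac{x_{1}(1-sh)}{\bar{w}}(1-u)\,d(1-c)b, \\
x_{1}' &= \frac{x_{0}}{\bar{w}}\,2u(1-d) + \frac{x_{1}(1-sh)}{\bar{w}}(1-u)(1-d).
\end{aligned}
$$

Here the subscripts count half-silenced pairs (k = 0 or 1), so x_0 and x_1 play the roles of x11 and x10 in the two-locus model, and the two expressions match the two-locus recursions for AA and Aa term by term.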

Deterministic equilibria and mean fitness of the Y: When there is no recombination between duplicates (D = 0), mean Y chromosome fitness as well as the distribution of mutations among individuals can be analytically determined. If mutations that eliminate duplicate gene function are deleterious (sh > 0), and the number of unique Y-linked genes is large (n ≫ U/sh), the population approaches the equilibrium: x̂k ~ Pois(U/sh), x̂s ≈ 0, and w̄ ≈ 1 − U. This is analogous to the case of mutation–selection balance with incomplete dominance (sh > 0), with a Y-linked genetic load of L = U ≈ 1 − e^(−U) (e.g., Haldane 1937; Kimura and Maruyama 1966; Kondrashov and Crow 1988). When knocking out a duplicate yields no fitness effect (sh = 0), or the number of Y-linked genes is small (n ≪ U/sh), the population approaches the equilibrium: x̂n ≈ 1 − U/2, x̂s ≈ U/2, and w̄ ≈ 1 − U/2. Under this scenario, the genetic load is reduced by a factor of 2, to L = U/2 ≈ 1 − e^(−U/2) (Haldane 1937).

Gene conversion between duplicates increases the frequency of the least-mutated class (Figure 3 and Figure S3), whether or not there is a gene conversion bias favoring functional over nonfunctional loci. The frequency of the least-loaded class represents a quantity of particular importance for adaptation on clonally transmitted chromosomes such as the Y (Charlesworth and Charlesworth 2000). Without recombination, the unit of selection is the chromosome rather than the locus. Beneficial mutations that are associated with mutation-free genetic backgrounds are relatively likely to become fixed (Peck 1994; Orr and Kim 1998) and do not permit hitchhiking of deleterious mutations during a selective sweep (Rice 1987). However, as the frequency of the least-loaded class becomes small, virtually all beneficial mutations will arise in inferior genetic backgrounds. This will limit the adaptive potential of the Y chromosome. Because it increases the fraction of mutant-free Y chromosomes, gene conversion is expected to enhance the fixation probability for beneficial mutations and can reduce the deleterious consequences of hitchhiking.

Figure 3.—Gene conversion increases the frequency of Y chromosome haplotypes that carry zero deleterious mutations (i.e., the “least-loaded” genotypic class). The cost of a mutation eliminating function of a copy of each duplicate pair is represented by sh (this cost increases from left to right on the x-axis). The relative proportion of mutation-free Y chromosomes in recombining vs. nonrecombining populations is presented as a ratio of the two scenarios (gene conversion increases the proportion of mutation-free Y’s when this ratio is greater than one). The number of distinct, Y-linked genes is represented by n. Results are presented for c = 0, b = 0.5, and u = 5 × 10⁻⁴, per locus, per generation, and D = U = 2nu. Additional results are presented in Figure S3.

By shifting the mutational distribution toward relatively mutation-free genotypes, gene conversion also increases mean Y chromosome fitness. This effect does not depend on a gene conversion bias, but can become exacerbated when conversion events favor functional over nonfunctional variants (for models yielding similar conclusions about the genetic load, albeit by different approaches, see Bengtsson 1986, 1990, and especially Ohta 1989).

These long-term effects of gene conversion can be accounted for by a straightforward explanation. When the fitness cost of silencing both copies of a duplicate pair is much greater than the cost of silencing one of the copies (when duplicates partially or completely mask deleterious mutations: h < 0.5), selection across Y chromosomes mimics truncation selection, which is particularly efficient at removing deleterious alleles (e.g., Kondrashov 1988; Ohta 1989). Truncation selection arises because mutations on a relatively mutation-free Y will generally affect one copy of a pair, with the second, functional copy compensating for loss of the first. As the number of mutations on a Y increases, so does the probability of silencing the second copy of a pair. Consequently, the deleterious effect of each mutation increases faster than linearly with the number of mutations carried on a Y.
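The equilibrium quoted in this section for complete masking without recombination (sh = 0, D = 0) can be recovered in a few lines; the steps below are a sketch under the model's own assumptions rather than the authors' derivation. With sh = 0 every fertile class has fitness 1, and without gene conversion mutations accumulate in one direction only, so the fertile population ends up in the most-loaded class k = n. Sterile males contribute nothing, so mean fitness is w̄ = x_n, and the recursion for the most-loaded class gives

$$
x_{n}' = \frac{x_{n}}{\bar{w}}\,\frac{(2-U)(1-D)}{2}\bigg|_{D=0} = 1 - \frac{U}{2},
\qquad
x_{s}' = 1 - x_{n}' = \frac{U}{2},
\qquad
\bar{w} = 1 - \frac{U}{2},
$$

so the load is L = 1 − w̄ = U/2. The factor U/2 arises because only n of the 2n copies are still functional in class n, and each is silenced at rate u = U/(2n).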
Without recombination, the accumulation of mutations is unidirectional, and the population will tend to evolve toward the edge of the truncation point (n mutations at distinct genes), particularly if masking by duplicates is strong (i.e., having two functional copies provides the same fitness as one copy). At the extreme of sh = 0 (complete masking), the population evolves to contain n functional genes, each distinct. Gene conversion restores variability by permitting bidirectional transitions (e.g., k to k − 1 and k + 1 mutations). Y chromosomes that are closer to the truncation point have a higher probability of transitioning (by mutation or recombination) beyond the truncation point where they are removed by selection. Consequently, the population distribution shifts toward fewer mutations per Y. However, if selection in favor of functional duplicates is strong relative to the number of Y-linked genes (sh > 0; n large), most individuals will carry few mutations, the truncation point becomes irrelevant to Y chromosome evolution, selection shifts toward multiplicative epistasis, and gene conversion does not strongly influence mean fitness or the distribution of mutations among Y chromosomes. This explanation accounts for the decreased impact of gene conversion on mutation-free Y chromosomes, as the strength of selection (sh) increases (Figure 3 and Figure S3).

Muller’s ratchet and the accumulation of nonfunctional genes: The deterministic results (presented above) represent an upper limit for Y chromosome fitness. In finite populations, where Muller’s ratchet operates, mean fitness can further decrease with each successive loss of “mutation-free” individuals. Once lost from the population, mutation-free genotypes are unlikely to be recovered by back mutation or positive selection because they must initially arise within the current least-loaded class and subsequently avoid stochastic loss (Peck 1994; Orr and Kim 1998; Gordo and Charlesworth 2000). To explore the influence of gene conversion on the rate and severity of Y chromosome degeneration via Muller’s ratchet, we conducted a series of stochastic simulations, varying the selection and recombinational parameters (u, h, n, d, c, b). We first use the recursions presented above to bring the frequencies of each genotypic class to deterministic equilibrium. Convergence to equilibrium is followed by 100,000 generations of simulation under a mutation–selection–drift model and constant male population size. For each generation, genotype frequencies were sampled from a pseudorandom multinomial distribution (pseudorandom numbers generated with R; R Development Core Team 2005), with genotypes randomly sampled after selection, mutation, and recombination. When there is no gene conversion between duplicates, Muller’s ratchet can operate rapidly, causing Y-linked fitness decay and loss of functional genes. Representative simulation results are shown in Figures 4 and 5.
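The simulation procedure just described (deterministic burn-in, then multinomial sampling each generation) can be sketched in Python. The authors worked in R; this version is only an illustration under stated assumptions: the function names, the fixed `burn_in` length, and the default seed are hypothetical choices, and `expected_next` encodes the multilocus recursion with at most one mutation and one recombination event per generation.

```python
import random


def expected_next(x, n, u, d, c, b, sh):
    """Expected zygotic frequencies next generation; x[n + 1] is the sterile class."""
    U, D = 2 * n * u, n * d
    w_bar = sum(x[k] * (1 - sh) ** k for k in range(n + 1))
    # Frequencies after selection, padded so that p[n + 1] = 0.
    p = [x[k] * (1 - sh) ** k / w_bar for k in range(n + 1)] + [0.0]

    def up(k):    # mutation silences a copy in an intact pair: k -> k + 1
        return U * (n - k) / n

    def same(k):  # no mutation, or a hit on an already silenced copy
        return (U * k + 2 * n * (1 - U)) / (2 * n)

    new = []
    for k in range(n + 1):
        stay = 1 - D + D * (1 - c) * (n - k) / n  # conversion in intact pairs is harmless
        repair = D * b * (1 - c) * (k + 1) / n    # conversion restores a silenced copy
        below = p[k - 1] * up(k - 1) if k > 0 else 0.0
        new.append((below + p[k] * same(k)) * stay
                   + (p[k] * up(k) + p[k + 1] * same(k + 1)) * repair)
    new.append(max(0.0, 1.0 - sum(new)))          # sterile males
    return new


def simulate_ratchet(n, N, u, d, c, b, sh, generations, burn_in=1000, seed=0):
    """Mutation-selection-drift simulation with N males sampled per generation."""
    rng = random.Random(seed)
    x = [1.0] + [0.0] * (n + 1)
    for _ in range(burn_in):                      # deterministic approach to equilibrium
        x = expected_next(x, n, u, d, c, b, sh)
    classes = range(n + 2)
    for _ in range(generations):
        if sum(x[:n + 1]) == 0.0:                 # no fertile males remain
            break
        probs = expected_next(x, n, u, d, c, b, sh)
        counts = [0] * (n + 2)
        for g in rng.choices(classes, weights=probs, k=N):  # multinomial draw
            counts[g] += 1
        x = [cnt / N for cnt in counts]
    return x


if __name__ == "__main__":
    for d in (0.0, 5e-4):
        final = simulate_ratchet(n=10, N=500, u=5e-4, d=d, c=0.0, b=0.5,
                                 sh=0.01, generations=2000)
        print(d, [round(f, 3) for f in final])
```

Comparing runs with d = 0 and d > 0 gives a feel for how gene conversion shifts the distribution of half-silenced pairs; a single replicate is noisy, so averages over many seeds would be needed before drawing any conclusion.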
In agreement with previous theory (Haigh 1978; Gordo and Charlesworth 2000; Bachtrog 2008), the impact of the ratchet is strongest when the ancestral Y carries many functional gene duplicates and when mutations have small individual fitness effects. Relatively low rates of gene conversion can rescue Y-linked genes from stochastic loss via Muller’s ratchet and thereby increase mean fitness of the Y (Figures 4 and 5). Increasing the total mutation and gene conversion rates on the Y (U and D, respectively) amplifies the differences between recombining and nonrecombining chromosomes, whereas a decrease in these compound parameters (U, D → 0) eliminates these long-term evolutionary differences. This effect occurs both with and without biased gene conversion between duplicates. Gene conversion appears to constrain accumulation of deleterious mutations in a way that is identical to crossing over in traditional models of Muller’s ratchet. Under both models, the rate at which the ratchet “clicks” (i.e., the least mutated class of individuals is lost) is highest when individual mutations are weakly deleterious


T. Connallon and A. G. Clark

Figure 4.—Intrapalindrome gene conversion prevents the erosion of Y chromosome gene content and enhances adaptation on the Y. N represents the Y-linked effective size, sh is the fitness cost associated with mutations to one copy of each duplicate pair, t refers to the generation within the simulation, and n is the number of distinct genes on the chromosome (including duplicates, each Y carries 2n genes). Results are presented for c = 0, b = 0.5, and u = 5 × 10⁻⁴ per locus, per generation. Each data point represents the average of 10 simulation replicates. Since estimates of gene conversion from human–chimp comparisons suggest that D may be considerably higher than the mutation rate (Rozen et al. 2003), the results, if anything, will underestimate the impact of gene conversion on functional gene retention.

and/or the chromosome-wide mutation rate (an increasing function of the mutation rate per locus and the number of loci) is high (Charlesworth and Charlesworth 2000; Bachtrog 2008). The similar consequences of gene conversion and crossing over are not surprising: both processes permit chromosomal transitions from more to fewer mutations and this, along with purifying selection, can counteract the steady accumulation of new deleterious mutations within a population.

DISCUSSION

Previous theory indicates that selection does not generally favor the invasion of a rare duplicate gene unless there is a direct benefit of carrying an additional gene copy (Clark 1994) or there is recombination between the paralogs (Yong 1998; Otto and Yong 2002; Tanaka and Takahasi 2009). We have shown that gene conversion between duplicates can broaden the parameter conditions favoring the invasion of duplicate genes from low initial frequency. Biased gene conversion, with conversion favoring undamaged over damaged gene copies, can generate positive selection for rare duplicates that do not provide a direct fitness benefit (that is, individuals with two functional copies have fitness equal to those with one). However, the strength of positive selection acting on such duplicates is weak (on the order

of the mutation rate). This result is in agreement with a recent simulation study, which also found that gene conversion does not strongly promote the invasion of new Y-linked duplicates (Marais et al. 2010). The invasion dynamics of rare duplicate genes bear some similarities to models of adaptation within gene families (Walsh 1985; Mano and Innan 2008), which show that gene conversion can enhance the probability that a weakly beneficial allele becomes fixed. In our model, gene conversion alone is unlikely to overpower genetic drift unless Nu ≫ 1, yet this condition is rarely (if ever) expected to arise within animal populations, particularly with respect to Y-linked loci that have reduced effective size relative to other nuclear genes. Furthermore, there is no biological reason to suspect that gene conversion will necessarily be biased against mutant copies of a particular gene. We therefore expect that Y-linked duplicates will most likely become fixed by genetic drift, unless they directly increase the fitness of those who carry them (for additional discussion of duplicate gene fixation, see Innan and Kondrashov 2010). Likewise, deleterious Y-linked crossover events can generate selection against gene duplicates. This factor will have little impact on the probability of fixation or loss unless the crossover rate is relatively high and direct selection on the duplicate is weak or absent. Y chromosome recombination can exert a profound influence on the retention of functional copies of genes

Gene Conversion and Y Evolution


Figure 5.—The proportion of loss-of-function duplicates following 100,000 generations of mutation, selection, and genetic drift. Parameters are described in the Figure 4 legend and throughout the text. Results are presented for c = 0, b = 0.5, u = 5 × 10⁻⁴ per locus, per generation, and D on the order of the mutation rate, D = U = 2nu. Each point represents the average of 10 replicate simulations.

that have already become fixed within the population. Our simulations show that low rates of gene conversion are sufficient to maintain Y-linked genes and counteract degradation via Muller’s ratchet. These results are conservative, as higher rates enhance the preservation of functional gene copies. Thus, once gene conversion has evolved, it can potentially provide a degree of stability on an otherwise evolutionarily unstable Y chromosome. Interestingly, Marais et al. (2010) observed that the rate of invasion for gene conversion modifier alleles does not greatly exceed neutral expectations unless they greatly increase the gene conversion rate. This suggests that, while low rates of conversion may slow the rate of Muller’s ratchet, the evolution of the gene conversion rate itself may be much more restrictive. The large number of genes within the “ampliconic” region of the human Y (Skaletsky et al. 2003) should provide a large target for mutations, creating an opportunity for Muller’s ratchet to act. This role of gene conversion on the Y is therefore likely to explain patterns of gene retention on the human Y chromosome. It is less clear whether similar patterns characterize other animal species. Current (albeit incomplete) data suggest that gene family amplification and retention might be common Y chromosome attributes (Rozen et al. 2003; Verkaar et al. 2004; Murphy et al. 2006; Alföldi 2008; Wilkerson et al. 2008; Krsticevic et al. 2009), although the prevalence of Y-linked gene conversion outside the human and chimp lineages is less clear (but see Geraldes et al. 2010). Future sequencing efforts, including evidence for gene conversion among Y-linked genes in nonhuman species, will help to determine the general relevance of the duplication and gene conversion model presented here. Within-chromosome crossovers can generate an abnormal, sterility-inducing Y (Lange et al. 2009) and potentially represent a deleterious fitness consequence of Y-linked recombination.
This cost also implies that the number of Y-linked duplicate genes (or in humans the size of Y-linked palindromes) will have an upper limit. As the number of Y-linked loci that interact via recombination increases, so too should the rate of deleterious crossovers. This suggests an upper limit to Y chromosome gene content, where crossing over becomes unbearably costly. From this perspective, duplication and recombination represent a costly mechanism of Y chromosome preservation. In addition to the Y chromosome, our findings have implications for asexually reproducing species. Recent reports suggest that the asexual bdelloid rotifers are tetraploid (Mark Welch et al. 2008) and that gene conversion occurs between gene copies (Hur et al. 2008; Mark Welch et al. 2008). Our model supports the verbal claim that gene conversion between homologous gene copies might aid in DNA damage repair and prevent the genomic degradation that is expected to accompany strict asexual reproduction. Unlike the Y chromosome scenario, crossovers between homologous, tetraploid chromosomes will tend to avoid deleterious chromosomal aberrations. The relative rate of nonhomologous crossovers is an empirical question that may be difficult to assess, given the likely association between chromosome abnormalities and embryonic death, which will lead to a pronounced bias toward “normal” chromosomes. On the other hand, crossing over between homologous chromatids is likely to generate copy number polymorphism, which adds a level of complexity to the evolutionary dynamics of autosomal gene duplicates or gene families. This may lead to different evolutionary consequences of crossing over and gene conversion in asexual lineages compared to the results that we report for the Y chromosome and represents an interesting avenue for future theoretical research.
We are grateful to Roman Arguello, Clement Chow, Margarida Cardoso-Moreira, Qixin He, Lacey Knowles, Amanda Larracuente, Rich Meisel, Nadia Singh, and two anonymous reviewers for discussion and comments that substantially improved the quality of the manuscript and to Sarah Otto for comments about the eigenvalue-selection-coefficient approximation and for sharing an unpublished manuscript. This work was supported by National Institutes of Health grant GM64590 to A.G.C. and A. B. Carvalho.

LITERATURE CITED

Alföldi, J. E., 2008 Sequence of mouse Y chromosome. Ph.D. Dissertation, MIT, Cambridge, MA.
Bachtrog, D., 2006 A dynamic view of sex chromosome evolution. Curr. Opin. Genet. Dev. 16: 578–585.
Bachtrog, D., 2008 The temporal dynamics of processes underlying Y chromosome degeneration. Genetics 179: 1513–1525.
Bengtsson, B. O., 1986 Biased conversion as the primary function of recombination. Genet. Res. 47: 77–80.
Bengtsson, B. O., 1990 The effect of biased conversion on the mutation load. Genet. Res. 55: 183–187.
Carvalho, A. B., L. B. Koerich and A. G. Clark, 2009 Origin and evolution of Y chromosomes: Drosophila tales. Trends Genet. 25: 270–277.
Charlesworth, B., 2003 The organization and evolution of the human Y chromosome. Genome Biol. 4: 226.
Charlesworth, B., and D. Charlesworth, 2000 The degeneration of Y chromosomes. Philos. Trans. Biol. Sci. 355: 1563–1572.
Clark, A. G., 1994 Invasion and maintenance of a gene duplication. Proc. Natl. Acad. Sci. USA 91: 2950–2954.
Engelstadter, J., 2008 Muller’s ratchet and the degeneration of Y chromosomes: a simulation study. Genetics 180: 957–967.
Geraldes, J., T. Rambo, R. Wing, N. Ferrand and M. W. Nachman, 2010 Extensive gene conversion drives the concerted evolution of paralogous copies of the SRY gene in European rabbits. Mol. Biol. Evol. (in press).
Gordo, I., and B. Charlesworth, 2000 The degeneration of asexual haploid populations and the speed of Muller’s ratchet. Genetics 154: 1379–1387.
Haigh, J., 1978 Accumulation of deleterious genes in a population—Muller’s ratchet. Theor. Popul. Biol. 14: 251–267.
Haldane, J. B. S., 1937 The effect of variation on fitness. Am. Nat. 71: 337–349.
Heinritz, W., D. Kotzot, S. Heinze, A. Kujat, W. J. Eleemann et al., 2005 Molecular and cytogenetic characterization of a nonmosaic isodicentric Y chromosome in a patient with Klinefelter syndrome. Am. J. Med. Genet. A 132A: 198–201.
Hughes, J. F., H. Skaletsky, T. Pyntikova, T. A. Graves, S. K. M. van Daalen et al., 2010 Chimpanzee and human Y chromosomes are remarkably divergent in structure and gene content. Nature 463: 536–539.
Hur, J. H., K. Van Doninck, M. L. Mandigo and M. Meselson, 2008 Degenerate tetraploidy was established before bdelloid rotifer families diverged. Mol. Biol. Evol. 26: 375–383.
Innan, H., and F. A. Kondrashov, 2010 The evolution of gene duplications: classifying and distinguishing between models. Nat. Rev. Genet. 11: 97–108.
Kimura, M., 1957 Some problems of stochastic processes in genetics. Ann. Math. Stat. 28: 882–901.
Kimura, M., 1962 On the probability of fixation of mutant genes in a population. Genetics 47: 713–719.
Kimura, M., and T. Maruyama, 1966 Mutational load with epistatic gene interactions in fitness. Genetics 54: 1337–1351.
Koerich, L. B., X. Wang, A. G. Clark and A. B. Carvalho, 2008 Low conservation of gene content in the Drosophila Y chromosome. Nature 456: 949–951.
Kondrashov, A. S., 1988 Deleterious mutations and the evolution of sexual reproduction. Nature 336: 435–440.
Kondrashov, A. S., and J. F. Crow, 1988 King’s formula for the mutation load with epistasis. Genetics 120: 853–856.
Krsticevic, F. J., H. L. Santos, S. Januario, C. G. Schrago and A. B. Carvalho, 2009 Functional copies of the Mst77F gene on the Y chromosome of Drosophila melanogaster. Genetics 184: 295–307.
Lange, J., H. Skaletsky, S. K. M. van Daalen, S. L. Embry, C. M. Korver et al., 2009 Isodicentric Y chromosomes and sex disorders as byproducts of homologous recombination that maintains palindromes. Cell 138: 855–869.
Mano, S., and H. Innan, 2008 The evolutionary rate of duplicate genes under concerted evolution. Genetics 180: 493–505.
Marais, G., 2003 Biased gene conversion: implications for genome and sex evolution. Trends Genet. 19: 330–338.
Marais, G., P. R. A. Campos and I. Gordo, 2010 Can intra-Y gene conversion oppose the degeneration of the human Y chromosome?: A simulation study. Genome Biol. Evol. 2: 347–357.
Mark Welch, D. B., J. L. Mark Welch and M. Meselson, 2008 Evidence for degenerate tetraploidy in bdelloid rotifers. Proc. Natl. Acad. Sci. USA 105: 5145–5149.
Murphy, W. J., A. J. P. Wilkerson, T. Raudsepp, R. Agarwala, A. A. Schaffer et al., 2006 Novel gene acquisition on carnivore Y chromosomes. PLoS Genet. 2: e43.
Noordam, M. J., and S. Repping, 2006 The human Y chromosome: a masculine chromosome. Curr. Opin. Genet. Dev. 16: 225–232.
Ohta, T., 1989 The mutational load of a multigene family with uniform members. Genet. Res. 53: 141–145.
Orr, H. A., and Y. Kim, 1998 An adaptive hypothesis for the evolution of the Y chromosome. Genetics 150: 1693–1698.
Otto, S. P., and D. Bourguet, 1999 Balanced polymorphisms and the evolution of dominance. Am. Nat. 153: 561–574.
Otto, S. P., and T. Day, 2007 A Biologist’s Guide to Mathematical Modeling in Ecology and Evolution. Princeton University Press, Princeton, NJ.
Otto, S. P., and D. B. Goldstein, 1992 Recombination and the evolution of diploidy. Genetics 131: 745–751.
Otto, S. P., and P. Yong, 2002 The evolution of gene duplicates. Homol. Eff. 46: 451–483.
Peck, J. R., 1994 A ruby in the rubbish: beneficial mutations, deleterious mutations and the evolution of sex. Genetics 137: 597–606.
R Development Core Team, 2005 R: A Language and Environment for Statistical Computing, reference index version 2.2.1. R Foundation for Statistical Computing, Vienna.
Repping, S., H. Skaletsky, J. Lange, S. Silber, F. van der Veen et al., 2002 Recombination between palindromes P5 and P1 on the human Y chromosome causes massive deletions and spermatogenic failure. Am. J. Hum. Genet. 71: 906–922.
Rice, W. R., 1987 Genetic hitchhiking and the evolution of reduced genetic activity on the Y sex chromosome. Genetics 116: 161–167.
Rozen, S., H. Skaletsky, J. Lange, S. Silber, F. van der Veen et al., 2003 Abundant gene conversion between arms of palindromes in human and ape Y chromosomes. Nature 423: 873–876.
Skaletsky, H., T. Kuroda-Kawaguchi, P. J. Minx, H. S. Cordum, L. Hillier et al., 2003 The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature 423: 825–837.
Tanaka, K. M., and K. R. Takahasi, 2009 Enhanced fixation and preservation of a newly arisen duplicate gene by masking deleterious loss-of-function mutations. Genet. Res. 91: 267–280.
Verkaar, E. L. C., C. Zijlstra, E. M. van ’t Veld, K. Boutaga, D. C. J. Boxtel et al., 2004 Organization and concerted evolution of the ampliconic Y-chromosomal TSPY genes from cattle. Genomics 84: 468–474.
Walsh, B., 1985 Interaction of selection and biased gene conversion in a multigene family. Proc. Natl. Acad. Sci. USA 82: 153–157.
Wilkerson, A. J. P., F. Raudsepp, T. Graves, D. Albracht, W. Warren et al., 2008 Gene discovery and comparative analysis of X-degenerate genes from the domestic cat Y chromosome. Genomics 92: 329–338.
Yong, P., 1998 Theoretical population genetic model of the invasion of an initial duplication. Honours Thesis, Department of Zoology, University of British Columbia, Vancouver, BC, Canada.
Yu, Y.-H., Y.-W. Lin, J.-F. Yu, W. Schempp and P. H. Yen, 2008 Evolution of the DAZ gene and AZFc region on primate Y chromosomes. BMC Evol. Biol. 8: 96.

Communicating editor: D. Charlesworth

GENETICS Supporting Information http://www.genetics.org/cgi/content/full/genetics.110.116756/DC1

Gene Duplication, Gene Conversion and the Evolution of the Y Chromosome Tim Connallon and Andrew G. Clark

Copyright © 2010 by the Genetics Society of America DOI: 10.1534/genetics.110.116756


FILE S1

I. Invasion of gene duplicates on Y chromosomes that carry an arbitrary number of linked genes.

Y-linked duplicate genes evolve within the genetic background of the entire Y chromosome, which is likely to contain multiple functional genes, particularly during early stages of sex chromosome evolution. To determine the generality of the single gene duplication scenario in the main text, we developed a second model to examine the evolutionary dynamics of rare, Y-linked duplicates on ancestral chromosomes carrying an arbitrary number (n) of single-copy genes. Consider a rare, Y-linked duplicate on a Y chromosome carrying n single-copy genes. By duplicating one of the n single-copy genes, the individual has n – 1 single-copy genes and a single duplicated pair. Though expanding the number of loci greatly increases the number of possible genotypes to follow within the population, subsequent calculations can be simplified by making each gene essential. In other words, fitness drops to zero (s = 1) unless each of the n genes has at least one functional copy. Given this simplification, there are four relevant genotypic classes within the population: (i) individuals with n functional singletons and no duplicates, at frequency x_n and with fitness w_n = 1 – sh; (ii) those with n + 1 functional genes (n – 1 singletons), at frequency x_n1 and with fitness w_n1 = 1; (iii) those with n + 1 genes (n – 1 singletons), of which n are functional, at frequency x_n0 and with fitness w_n0 = 1 – sh; and (iv) a class of sterile individuals, at frequency x_s and with fitness w_s = 1 – s = 0, that either lack a functional copy of an essential gene or carry an abnormal Y chromosome. In an individual carrying n singletons, the Y chromosome deleterious mutation rate per gamete per generation is U = nu, and the distribution of mutations across gametes is reasonably modeled as a Poisson variable with a mean of nu. However, given that the diploid, genomic deleterious mutation rate is unlikely to be much greater than one, and that Y chromosomes typically represent a tiny fraction of a genome, the number of new mutations per generation should be well approximated by a Bernoulli distribution: U = nu is the probability of one mutation, and 1 – U is the probability of zero mutations. For an individual carrying n + 1 total genes, the overall mutational target will be slightly increased, and the Y chromosome mutation rate becomes U_dup = U(n + 1)/n per generation. The presence of gene duplicates introduces an opportunity for gene conversion and crossing over, which, as before, are governed by the recombination rate (d), crossover (c), and conversion bias (b) parameters. Following the order of events (i) birth, (ii) selection, (iii) mutation, (iv) recombination, and (v) fertilization, the Y chromosome recursions are:

$$x_{n1}' = \frac{x_{n1}\left[2Ud(1-c)b + (n - U - Un)(1-dc)\right]}{\left[x_{n1} + (x_{n0} + x_n)(1-sh)\right]n} + \frac{x_{n0}(1-sh)(1-U)d(1-c)b}{x_{n1} + (x_{n0} + x_n)(1-sh)}$$

$$x_{n0}' = \frac{2x_{n1}U(1-d)}{\left[x_{n1} + (x_{n0} + x_n)(1-sh)\right]n} + \frac{x_{n0}(1-sh)(1-U)(1-d)}{x_{n1} + (x_{n0} + x_n)(1-sh)}$$

$$x_n' = \frac{x_n(1-sh)(1-U)}{x_{n1} + (x_{n0} + x_n)(1-sh)}$$

$$x_s' = 1 - \left(x_{n1}' + x_{n0}' + x_n'\right)$$

Stability of the equilibrium $x_{n1} = x_{n0} = 0$, $\hat{x}_n = 1 - U = 1 - \hat{x}_s$, and $\bar{w} = (1-U)(1-sh)$ is governed by the eigenvalue:

$$\lambda = \frac{2Ud(1-c)b + (n - U - Un)(1-dc) + (1-sh)(1-U)(1-d)n}{2(1-sh)(1-U)n} + \frac{\sqrt{\left\{2Ud(1-c)b + (n - U - Un)(1-dc) + (1-sh)(1-U)(1-d)n\right\}^2 - 4(n - U - Un)(1-dc)(1-d)(1-sh)(1-U)n}}{2(1-sh)(1-U)n}$$

When there is no recombination (d = 0), a rare gene duplicate is favored by selection when sh > U/(n – nU). Substituting for U = nu yields sh > u/(1 – nu). This result differs slightly from the previous model of a duplicate linked to a single essential gene (the former model predicts that a duplicate invades when sh > u/(1 – u)). Multiple Y-linked genes will therefore decrease opportunities for positive selection in favor of new duplicates. When selection is weak (sh → 0), recombination can promote selection in favor of the duplicate. For sh = c = 0, the Taylor series approximation around d = 0 gives a leading eigenvalue of:

$$\lambda = \lambda\big|_{d=0} + \left.\frac{\partial \lambda}{\partial d}\right|_{d=0} d + O(d^2) \approx 1 + d(2b - 1)$$

which is greater than one for b > 0.5, as in the previous model. Numerical simulations of the leading eigenvalue under a broad range of parameter space show that, as before, the opportunity for positive selection for a new duplicate is greater with recombination.
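The recursions and the invasion condition of this section can be checked numerically with a short Python sketch. It is an illustrative transcription under stated assumptions, not the authors' code: the function names are hypothetical, the fitness factor is written as (1 − sh) following the class fitnesses defined in the text, and the leading eigenvalue is computed from the 2 × 2 Jacobian of the (x_n1, x_n0) subsystem at the equilibrium, which is one algebraically equivalent way of writing the closed-form expression.

```python
import math


def recursion_step(x_n1, x_n0, x_n, n, u, sh, d, c, b):
    """One generation of the File S1 recursions (selection, mutation, recombination)."""
    U = n * u
    wbar = x_n1 + (x_n0 + x_n) * (1.0 - sh)  # mean fitness; sterile class has w = 0
    stay = 2 * U * d * (1 - c) * b + (n - U - U * n) * (1 - d * c)
    new_n1 = (x_n1 * stay / (n * wbar)
              + x_n0 * (1 - sh) * (1 - U) * d * (1 - c) * b / wbar)
    new_n0 = (2 * x_n1 * U * (1 - d) / (n * wbar)
              + x_n0 * (1 - sh) * (1 - U) * (1 - d) / wbar)
    new_n = x_n * (1 - sh) * (1 - U) / wbar
    new_s = 1.0 - new_n1 - new_n0 - new_n
    return new_n1, new_n0, new_n, new_s


def leading_eigenvalue(n, u, sh, d, c, b):
    """Leading eigenvalue for invasion of a rare duplicate at x_n1 = x_n0 = 0."""
    U = n * u
    nw = n * (1 - U) * (1 - sh)  # n times mean fitness at the equilibrium
    a11 = (2 * U * d * (1 - c) * b + (n - U - U * n) * (1 - d * c)) / nw
    a12 = d * (1 - c) * b
    a21 = 2 * U * (1 - d) / nw
    a22 = 1 - d
    # (a11 - a22)^2 + 4*a12*a21 equals trace^2 - 4*determinant.
    disc = (a11 - a22) ** 2 + 4 * a12 * a21
    return 0.5 * (a11 + a22 + math.sqrt(disc))
```

For example, with the hypothetical values n = 10 and u = 10⁻³, the d = 0 threshold is sh > u/(1 − nu) ≈ 0.00101, so `leading_eigenvalue(10, 1e-3, 0.002, 0.0, 0.0, 0.5)` exceeds one while the same call with sh = 0.0005 does not. The linear approximation 1 + d(2b − 1) is accurate only while d is small relative to the per-locus mutation rate u; for larger d the excess over one is on the order of the mutation rate.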


II. Invasion Probability of Duplicate Genes with Gene Conversion

FIGURE S1.—The probability of fixation for Y-linked duplicate genes. The red line depicts the analytical approximation from Eq. (2). To facilitate comparison between these results and those of Fig. 2 from the main text, we show the approximation for N = 1000, s = 1, d = 0, and u = 10⁻⁵, and present representative simulation results for d > 0 and various combinations of the remaining parameters (c, b). Circles represent the proportion of duplicate genotypes (out of 100,000 replicate simulations for each data point) that eventually become fixed within the population.
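As a rough guide to the scale of the fixation probabilities in Figure S1, the following Python sketch evaluates the standard haploid diffusion approximation for a single new copy (Kimura 1962). Eq. (2) of the main text is not reproduced here, so this is an assumption-laden stand-in and may differ from the authors' expression. The function name is hypothetical, and `s_eff` denotes an effective selection coefficient for the duplicate (for example, the leading eigenvalue minus one), which is also an assumption.

```python
import math


def fixation_probability(N, s_eff):
    """Diffusion approximation for a new copy at frequency 1/N in a haploid
    population of effective size N with selection coefficient s_eff."""
    if s_eff == 0.0:
        return 1.0 / N  # neutral expectation
    # expm1 keeps precision when 2*s_eff is very small.
    return math.expm1(-2.0 * s_eff) / math.expm1(-2.0 * N * s_eff)
```

With N = 1000 and s_eff on the order of the mutation rate (10⁻⁵), the result stays within about one percent of the neutral value 1/N, which is consistent with the statement that gene conversion has little impact on the probability that duplicates become fixed.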


III. Maintenance of Functional Gene Duplicates

FIGURE S2.—Gene conversion and the maintenance of functionally redundant paralogs. Results are presented for two extremes of selection: gene conversion between paralogs of an essential gene (s = 1) and between paralogs of a nonessential gene (s = 0.001). In each case, gene conversion is unbiased (b = 0.5) and the mutation rate is u = 10⁻⁵. Under essentiality and nonessentiality, fitness is maximized when at least one of the paralog copies is functional (i.e., masking of knockout mutations is complete: h = 0). Each point represents the fraction of 100 simulation replicates where both copies are maintained as functional within the population. For each simulation run, the population is initially fixed for two functional Y-linked genes, and then evolves under mutation, recombination, selection, and genetic drift for 100,000 generations.


IV. Frequency of the ‘least loaded class’ under biased gene conversion.

FIGURE S3.—Gene conversion increases the frequency of Y chromosome haplotypes that carry zero deleterious mutations (i.e., the “least-loaded” genotypic class). Results use the same parameters as those of Fig. 3 with n = 50, and with the biased gene conversion parameter (b) permitted to vary.