The Plant Journal (2002) 31(3), 355–364

TECHNICAL ADVANCE

STAIRS: a new genetic resource for functional genomic studies of Arabidopsis

Rachil Koumproglou1, Tim M. Wilkes1, Paul Townson1, Xiao Y Wang1, Jim Beynon2, Harpal S. Pooni1, H. John Newbury1 and Mike J. Kearsey1,*

1 School of Biosciences, The University of Birmingham, Birmingham B15 2TT, UK
2 HRI, Wellesbourne, Warwick CV35 9EF, UK

Received 20 February 2002; revised 4 March 2002; accepted 5 April 2002.
* For correspondence (fax +44 0121 414 5925; email [email protected]).
† These authors contributed equally to this work.

Summary

Many biologically and economically important traits in plants and animals are quantitative/multifactorial, being controlled by several quantitative trait loci (QTL). QTL are difficult to locate accurately by conventional methods using molecular markers in segregating populations, particularly for traits of low heritability or for QTL with small effects. In order to resolve this, large (often unrealistically large) populations are required. In this paper we present an alternative approach using a specially developed resource of lines that facilitate QTL location first to a particular chromosome, then to successively smaller regions within a chromosome (< 0.5 cM) by means of simple comparisons among a few lines. This resource consists of 'Stepped Aligned Inbred Recombinant Strains' (STAIRS) plus single whole Chromosome Substitution Strains (CSSs). We explain the analytical power of STAIRS and illustrate their construction and use with Arabidopsis thaliana, although the principles could be applied to many organisms. We were able to locate flowering QTL at the top of chromosome 3 known to contain several potential candidate genes.

Keywords: chromosome substitutions, fine mapping, flowering time, NILs, QTL location

© 2002 Blackwell Science Ltd

Introduction

Our knowledge and understanding of quantitative, multifactorial traits in plants has until recently been based largely on biometrical analyses of the covariances in phenotype among relatives. These approaches provide no clear idea of the number, location and action of the individual underlying genes, although they give essential information upon which to plan efficient breeding programs (Falconer and Mackay, 1996; Kearsey and Pooni, 1996; Lynch and Walsh, 1998). The discovery and development of large numbers of cheap and easily scored molecular markers over the last 15 years has opened the possibility of locating the individual underlying genes, investigating their mode of action (i.e. structural or regulatory), identifying the nature of the allelic variation and exploring interactions among QTL and with the environment (Guo and Lange, 2000; Kearsey and Farquhar, 1998; Tanksley, 1993). Obtaining such information about a wide range of QTL will at last provide us with a knowledge base, the lack of which has dogged the subject of quantitative genetics for so long. Such knowledge is particularly important because it relates to naturally occurring allelic variation that has stood the test of natural selection.

The main problem in studying QTL is that of locating them with sufficient accuracy to identify particular candidates and to clone the genes. This is due to the relatively small effect of the allelic differences compared to the large variation caused by other loci and the environment. Confidence intervals for QTL in experimental populations, such as F2s and recombinant inbred lines, are seldom less than 5 cM, and often 30–50 cM (Darvasi et al., 1993; Hyne et al., 1995; Tanksley, 1993; Van Ooijen, 1992). Given that an average chromosome is about 100 cM long, a 5-cM interval in Arabidopsis could include 250 genes on average and orders of magnitude more in regions of low crossing-over. Greater precision in segregating populations can only be obtained by much larger population sizes that are often difficult to achieve in practice.

Figure 1. (a) The chromosome constitution of the recurrent parent (recipient) and donor lines, together with the five, homozygous, whole Chromosome Substitution Strains (CSS1-5) in Arabidopsis thaliana. (b) A set of n, single recombinant lines (SSRL 1-n), making up the Stepped Aligned Inbred Recombinant Strains (STAIRS), showing the increasing length of donor segment in a single chromosome.

One solution to this problem is to follow the initial, approximate QTL locations, based on segregating populations, with a finer analysis using near isogenic lines (NILs). These can be derived from recombinant inbred lines (RILs) at an advanced state of inbreeding or backcrossing by identifying individuals that are homozygous for all regions of every chromosome except for one or two short tracts. Selfing such individuals produces the two alternative homozygotes in the next generation that are selected and maintained. Any genetic difference between such pairs of lines can be immediately ascribed to the differential tract(s) and, furthermore, the location of a QTL in such segments can be further refined by backcrossing. NILs have the advantages of lacking additional genetic variation and allowing extensive experimental replication in order to improve the statistical power (El-Assal et al., 2001). However, it may be difficult to find NILs in a particular chromosomal region of interest and a large number of pairs are necessary to cover the whole genome.

An alternative approach is to produce introgression lines by backcrossing with marker assisted selection. This provides lines with a common recurrent genotype but different, short, donor segments from another line (Eshed and Zamir, 1995; Howell et al., 1996; Ramsay et al., 1996; Zamir, 2001).

In order to complement NILs and introgression lines, provide a directed approach to QTL location and improve the power of resolution, we have developed two associated resources in Arabidopsis thaliana, using the accessions Columbia (Col), Landsberg (Ler) and Niederzenz (Nd). These resources have several advantages over conventional NILs, not least the ability to focus precisely on any region of the genome. They also facilitate the genetic and functional genomic analysis of QTL.
The first resource consists of the five homozygous Chromosome Substitution Strains (CSS1-5), in each of which a different Col chromosome has been replaced by the corresponding Ler or Nd chromosome (Figure 1a). These strains allow genetic differences between Col, Ler and Nd to be assigned easily to particular chromosomes, providing that QTL linked in repulsion do not cancel out. Similar CSSs have been extensively used in wheat (see review by Law et al., 1983) and are being produced in mice (Nadeau et al., 2000).

The second and most useful resource is derived from each CSS and progresses the gene location further. It consists of a large number of lines, each of which contains a homozygous chromosome with a single crossover, such that the chromosome contains Col genes at one end and Ler genes at the other. These homozygous lines we call single recombinant lines (SRLs). When the SRLs for each chromosome are stacked sequentially, they show a steplike progression with each successive line having a little more Ler chromosome. Such a set of SRLs for a particular chromosome is called STAIRS (Stepped Aligned Inbred Recombinant Strains) to reflect their structural relationship (Figure 1b). They allow genetic differences between Col and Ler to be assigned to particular regions within a chromosome. Initially STAIRS with 'wide' steps allow gene location to wide regions of 5–10 cM, but 'narrower' steps within that region then allow one to focus in to less than 1 cM. Any set of STAIRS exists in two reciprocal forms depending on whether the donor chromosome extends from the top or the bottom of the chromosome.

A major advantage of STAIRS is that one can make comparisons of the phenotypes of pairs or sets of lines in which the only genetic differences lie in short defined regions of one selected chromosome. The focus on such distinctive pairs of lines has particular value in the analysis of QTL, for candidate gene searching and for gene expression studies.

The main aims of this paper are to describe the application and use of STAIRS and CSSs, to explain the theory and practice behind their construction and to illustrate their potential for genetic analysis. The latter is demonstrated using preliminary data from 'wide' STAIRS to locate a major gene and QTL for flowering time, number of rosette leaves and plant height.
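The stepwise logic of STAIRS described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the line names, crossover positions and trait means are invented, and the fixed threshold `min_shift` stands in for the statistical comparison of lines that a real analysis would use.

```python
# Hypothetical STAIRS for one chromosome. Each line carries donor (Ler) DNA
# from the top of the chromosome down to its crossover position (in cM).
# All names, positions and trait means below are invented for illustration.
stairs = [
    ("Col",     0.0, 12.1),  # recurrent parent: no donor segment
    ("SRL-a",  10.0, 12.3),
    ("SRL-b",  20.0, 12.0),
    ("SRL-c",  30.0, 15.9),
    ("SRL-d",  40.0, 16.2),
    ("CSS",   100.0, 16.0),  # whole-chromosome substitution
]


def locate_step(lines, min_shift):
    """Return the (lower_cM, upper_cM) interval of the first step at which
    the trait mean of adjacent lines changes by at least min_shift, or None.

    min_shift is a stand-in for a proper significance test.
    """
    ordered = sorted(lines, key=lambda rec: rec[1])
    for (_, lo, mean_lo), (_, hi, mean_hi) in zip(ordered, ordered[1:]):
        if abs(mean_hi - mean_lo) >= min_shift:
            return (lo, hi)
    return None


interval = locate_step(stairs, min_shift=2.0)
print(interval)
```

With these invented values the trait mean changes between the lines whose crossovers lie at 20 and 30 cM, so the sketch reports that interval; 'narrower' steps inside it would then be compared in the same way.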

Results and discussion

Resources produced

Chromosome substitution strains have been produced for chromosomes 1 through 5 for the Ler into Col substitutions and for chromosomes 3 and 4 for Nd into Col (see Figure 2 for procedures). Backcross seed for constructing STAIRS of all these chromosomes is available and some 30 SRLs have been produced and are already in trials. We plan to have approximately 200 backcross individuals, with single recombinant chromosomes, genotyped for each chromosome shortly. Their selfed seed (from which particular SRLs can be extracted) and DNA will be made available through NASC.

Figure 2. Breeding programme to produce single whole Chromosome Substitution Strains (CSSs) (C) and Stepped Single Recombinant Lines (SSRLs) (E). Bc1a individuals containing single, intact whole chromosomes (only chromosome 1 is illustrated) are selected (B). These are both selfed to produce the corresponding true breeding CSS (C) and backcrossed to the recurrent parent to generate recombinants (D). Bc2 individuals (D) are in turn selfed to generate the corresponding true breeding SSRLs (E) which constitute the STAIRS. See text for further explanation.

Results from trial with wide STAIRS of chromosome 3

A total of 10 lines (Col, CSS-3 and eight SRLs) out of the 14 possible with the markers used (see Figure 3) were scored in this trial. Some SRLs had replicate lines extracted from different backcross individuals. The means of these lines for rosette leaf number at 30 days (Rln30), flowering time (Ft), height at 35 days (Ht35) and presence of trichomes are shown in Table 1.

To illustrate the analysis of these data, the ANOVA (mixed cross-classified and nested) for Rln30 is shown in Table 2. Because there were missing plants in some families, the analysis was performed using the GLIM procedure, and because both blocks and lines within SRLs were statistically 'random' effects, the error terms for the various items had to be synthesised from the other MSs (Snedecor and Cochran, 1980). The source and value of the 'Error' MS for the critical 'Among Lines' item is given in Table 2 for completeness. There were highly significant differences between Lines (P < 0.001) and also between replicate lines within SRLs. The latter indicates the necessity in QTL studies to replicate families using different, yet genetically identical, mothers because of the presence of common-environment (maternal) effects. In the absence of replication at this level, random maternal effects would be confused with genetic effects.

The genetic model for the effects of each segment (see Figure 3) is shown in Table 1. Because not all possible SRLs were included, the parameters a1 and a2 are confounded and hence are not separable. The observed means and weights of the 10 lines for Rln30 are shown in Table 3(i). The estimates of the only two significant parameters, m and [a1 + a2], and the chi-square test of goodness of fit are shown in Table 3(ii), while the resulting expected means are shown in Table 3(i). The non-significant chi-square shows that the 10 means are adequately explained by m and [a1 + a2] and that there is no evidence of any other QTL on chromosome 3 for this trait.
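The confounding of a1 and a2, and the fit of the two-parameter model, can be illustrated with the Rln30 means and model coefficients from Table 1. The sketch below is not the analysis reported in Table 3: it assumes an unweighted least-squares fit, whereas the published estimates use line weights that are not reproduced here, and the function name `fit_two_parameter` is hypothetical.

```python
# Rln30 line means and the a1/a2 coefficients transcribed from Table 1
# (order: Col, SRL-1..SRL-5, CSS-3, SRL-8, SRL-9, SRL-10).
rln30 = [12.11, 12.46, 14.19, 12.78, 12.09, 13.20, 16.26, 16.00, 16.42, 16.92]
a1 = [-1, -1, -1, -1, -1, -1, 1, 1, 1, 1]
a2 = [-1, -1, -1, -1, -1, -1, 1, 1, 1, 1]

# Identical coefficient columns mean the two effects cannot be separated
# with these lines; only their sum [a1 + a2] is estimable.
confounded = a1 == a2


def fit_two_parameter(y, x):
    """Unweighted least-squares fit of y = m + a * x (an assumption; the
    paper's own fit is weighted)."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    m = mean_y - a * mean_x
    return m, a


m, a = fit_two_parameter(rln30, a1)
# Coefficients are -1 (Col) and +1 (Ler), so replacing Col with Ler
# changes the expected mean by 2 * a.
substitution_effect = 2 * a
print(confounded, round(m, 4), round(a, 4), round(substitution_effect, 3))
```

Under this unweighted assumption the Col-to-Ler substitution effect comes out at about 3.6 leaves, in the same direction as, and close to, the weighted estimate quoted in the text.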
The same situation was found for the other two quantitative traits and their effects are also given in Table 3(ii). Therefore, a single highly significant QTL (a1 + a2) at the top of chromosome 3 is sufficient to explain the variation between the 10 SRLs for all three traits. Substituting the Col with a Ler allele at this QTL increases the Rln30 by approximately 3.8 leaves, delays flowering by approximately 3.8 days and decreases Ht35 by approximately 11.6 cm. Because SRLs 7 and 10 were not available at the time of this trial, we can only conclude that this QTL is located between 0 and 44 cM and possibly in the region 0–20 cM. Jansen et al. (1995), using MQM mapping on the Col × Ler RILs, also reported a QTL in this region affecting leaf number. Possible candidate genes on chromosome 3 known to affect flowering time in Arabidopsis are almost exclusively clustered in the first 20 cM. These include HST (5 cM), ENL (approximately 10 cM), FLD (14 cM), PEF1 (16 cM) and ESD1 and VRN1 (both tentatively located at the top of chromosome 3).

Trichomes occur in SRLs 4, 5, 9, 10 and CSS-3 but not in the others (Table 1). Inspection of the model in Table 1 and the pattern of overlap of the SRLs in Figure 3 indicates that the gene responsible lies close to the marker (AthGAPAb) at 44 cM, which is consistent with the known location of the major gene responsible, glabra1, at 47 cM.

Figure 3. Ideogram of STAIRS for chromosome 3, indicating Col and Ler regions, line designations and number of replicate lines in (n). Vertical lines indicate interstitial marker positions; potential QTL locations are indicated by 'ai' at the base of the figure. j = Ler; o = Col; q = Cross-over region.

Table 1. Line means and coefficients of the genetic model for the reciprocal Col/Ler STAIRS for chromosome 3 (see Figure 3 for explanation of lines and model). For each parameter in the model, +1 and -1 refer to an allele from Ler or Col, respectively. Rln30 = rosette leaf number at 30 days; Ft = flowering time (days from sowing); Ht35 = height at 35 days; Trich. = presence (1) or absence (0) of trichomes.

          Traits                            Model
Line      Rln30   Ft      Ht35    Trich.    m    a1   a2   a3   a4   a5   a6   a7
Col       12.11   27.10   22.18   0         1    -1   -1   -1   -1   -1   -1   -1
SRL-1     12.46   28.08   16.26   0         1    -1   -1   -1   -1   -1   -1    1
SRL-2     14.19   28.63   14.04   0         1    -1   -1   -1   -1   -1    1    1
SRL-3     12.78   27.44   19.70   0         1    -1   -1   -1   -1    1    1    1
SRL-4     12.09   26.22   23.41   1         1    -1   -1   -1    1    1    1    1
SRL-5     13.20   27.27   19.41   1         1    -1   -1    1    1    1    1    1
CSS-3     16.26   30.00    8.34   1         1     1    1    1    1    1    1    1
SRL-8     16.00   31.80    5.67   0         1     1    1   -1   -1   -1   -1   -1
SRL-9     16.42   30.18    9.28   1         1     1    1    1   -1   -1   -1   -1
SRL-10    16.92   30.81    8.29   1         1     1    1    1    1   -1   -1   -1
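The inspection of Table 1 used to place the trichome gene can be sketched as a simple pattern comparison. The code below is an illustrative assumption about how such an inspection might be automated, not the authors' procedure: it recodes the trichome scores as +1/-1 and counts, for each segment parameter, how many lines disagree with that segment's coefficients.

```python
# Trichome scores and model coefficients transcribed from Table 1
# (order: Col, SRL-1..SRL-5, CSS-3, SRL-8, SRL-9, SRL-10).
trich = [0, 0, 0, 0, 1, 1, 1, 0, 1, 1]
model = {
    "a1": [-1, -1, -1, -1, -1, -1, 1, 1, 1, 1],
    "a2": [-1, -1, -1, -1, -1, -1, 1, 1, 1, 1],
    "a3": [-1, -1, -1, -1, -1, 1, 1, -1, 1, 1],
    "a4": [-1, -1, -1, -1, 1, 1, 1, -1, -1, 1],
    "a5": [-1, -1, -1, 1, 1, 1, 1, -1, -1, -1],
    "a6": [-1, -1, 1, 1, 1, 1, 1, -1, -1, -1],
    "a7": [-1, 1, 1, 1, 1, 1, 1, -1, -1, -1],
}


def mismatches(trait, coeffs):
    """Count lines whose trait score (recoded as +1/-1) disagrees with the
    segment coefficient."""
    return sum(1 for t, c in zip(trait, coeffs) if (1 if t else -1) != c)


counts = {name: mismatches(trich, col) for name, col in model.items()}
fewest = min(counts.values())
candidates = [name for name, n in counts.items() if n == fewest]
print(counts)
print(candidates)
```

No single segment reproduces the trichome pattern exactly; a3 and a4 each disagree in only one line, which points to a position near the boundary between those two adjacent segments. How that boundary corresponds to the marker at 44 cM depends on Figure 3, which is not reproduced here.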

Table 2. ANOVA by General Linear Model for rosette leaf number at 30 days

Source    d.f.    MS         F
1         9       157.982    7.13*
2         2       89.074     11.54
3         18      9.082      4.99
4         6       15.564     3.38
5         12      1.655      0.36
6         407     4.608

P

< < <