
COOPER M., O.S. SMITH, G. GRAHAM, L. ARTHUR, L. FENG, D.W. PODLICH, 2004 Genomics, genetics, and plant breeding: a private sector perspective.
Maydica 51 (2006): 293-300

TEMPORAL TRENDS IN SSR ALLELE FREQUENCIES ASSOCIATED WITH LONG-TERM SELECTION FOR YIELD OF MAIZE¹

L. Feng, S. Sebastian, S. Smith, M. Cooper*

Pioneer Hi-Bred International Inc., 7250 N.W. 62nd Avenue, P.O. Box 552, Johnston, Iowa 50131, USA

Received July 4, 2005

ABSTRACT - In association with improved grain yield and agronomic performance of maize hybrids developed by Pioneer Hi-Bred for the central U.S. corn-belt, there have been detectable changes in the organization of genetic diversity. Genetic diversity was measured using 1698 Simple Sequence Repeat (SSR) markers distributed over the 10 chromosomes of maize. A subset of 361 SSRs was used to assay the 88 inbred parents of a sequence of 53 successful maize hybrids released by Pioneer Hi-Bred from 1934 to 2004. Patterns of change in SSR allele frequencies over the history of the breeding program are complex and indicative of a large open breeding system. Many of the temporal trends for individual SSR alleles are unique and likely result from the combined influences of many factors that operated during the history of the breeding program. Both random sampling of alleles and selection within the pedigree relationships of the germplasm, from the founding ancestors to the modern inbreds, can explain some of the trends in genetic diversity. In addition, new germplasm with unique genetic diversity was introduced into the breeding effort at different times. Selection has been applied to organize the germplasm into heterotic groups with unique allele combinations at the SSR loci and to increase the frequencies of some alleles within the heterotic groups. Modern commercial maize hybrids combine the diversity between the heterotic groups and enable the deployment of this diversity within the U.S. corn-belt.

KEY WORDS: Zea mays L.; Maize; Simple Sequence Repeat (SSR); Allele frequency; Putative selection region (PSR); Long-term selection; Genetic diversity; Heterotic groups.

1 Contribution to Maydica volume honoring contributions of Dr Donald Duvick.

* For correspondence (fax: +1 515 3346634; e-mail: [email protected]).

INTRODUCTION

Successful plant breeding relies on a balanced effort to achieve long-term goals for improvement of a germplasm base and short-term goals to develop cultivars that meet the environmental challenges of the target production system. The temporal sequence of successful cultivars from a breeding program can be used to obtain some understanding of the genetic and phenotypic changes that are associated with a long-term breeding program. Understanding the genetic changes brought about by breeding provides a basis for managing genetic diversity within a breeding program and its deployment across a target population of environments. Both issues are central to sustainable breeding efforts and sustainable agricultural systems.

Long-term temporal trends in trait performance, germplasm contributions, genetic diversity and allelic changes at the DNA sequence level associated with breeding have been the subject of study for maize (Zea mays L.) and other crops (SEBASTIAN et al., 1995; HANAFEY et al., 1998; DUVICK et al., 2004a,b; JANICK, 2004a,b; O’NEILL et al., 2004; LE CLERC et al., 2005). Duvick has extensively reported on observations obtained for a sequence of successful maize hybrids that were developed by Pioneer Hi-Bred International for increased yield and agronomic performance in the central region of the U.S. corn-belt over the period from the 1930s to the present (DUVICK 1977, 1984, 1992, 1997; DUVICK et al., 2004a,b). These hybrids are referred to here as the Era hybrids and were defined as successful based on their wide adoption in the target region following their commercial release.

SMITH et al. (2004) have described the germplasm dynamics that contributed to the pedigree relationships of this long-term maize breeding effort. They documented the landrace and founder contributions to the sequence


of hybrids and their patterns of use over the decades. With the availability of a large number of molecular markers for maize it is now possible to investigate long-term trends in the frequencies of alternative alleles for polymorphic regions of the genome (SEBASTIAN et al., 1995; HANAFEY et al., 1998). Using Simple Sequence Repeat (SSR; microsatellite) markers some preliminary results for the Pioneer Era maize hybrids have been reported (COOPER et al., 2004; DUVICK et al., 2004a,b). Here we report an update on these earlier studies based on a larger number of SSRs and an extended set of the Era hybrids and give some interpretations of significant trends observed.

MATERIALS AND METHODS

Maize inbreds
Eighty-eight maize inbreds were used in this study. The inbreds are the parents of a set of 53 widely used U.S. central corn-belt hybrids released by Pioneer Hi-Bred from 1934 to 2004 (Table 1). The Era Hybrid Study has been conducted since 1972, with new “best seller hybrids” added each year, so the entry list of this set of hybrids undergoes annual revision. The 88 inbreds considered in this study were based upon the entry list for the 2004 experiments. Further information on the hybrids can be found in DUVICK (1977, 1984, 1992, 1997), DUVICK et al. (2004a,b), and SMITH et al. (2004).

SSR loci
All SSR analyses were performed by the molecular lab of Pioneer Hi-Bred International (IA, USA). A preliminary evaluation of SSR diversity was conducted using 1698 SSR loci. From this set a subset of 361 SSR loci was selected for use in this study. The selected loci individually had high discrimination ability across maize germplasm and less than 10% missing data per SSR, and collectively afforded comprehensive genomic coverage. The 361 loci were distributed across the 10 chromosomes of maize; for Chromosomes 1 to 10 the numbers of loci were 66, 45, 40, 40, 33, 26, 27, 32, 22, and 25, respectively, plus five unmapped loci. DNA samples were extracted from the 88 inbreds using a CTAB procedure (SAGHAI-MAROOF et al., 1984). Equal amounts of leaf tissue from each of eight plants were bulked to sample each inbred.

Statistical analysis
The number of different alleles among the 88 inbreds was counted for each SSR locus. The Polymorphism Information Content ($\mathrm{PIC} = 1 - \sum_i f_i^2$, where $f_i$ is the frequency of the $i$th allele at a locus) was estimated for each SSR over the whole data set and also by decade, and was used as a measure of genetic diversity (WEIR, 1996).
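The PIC calculation described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `pic`, the use of `None` for missing data, and the example fragment sizes are hypothetical and do not come from the study.

```python
from collections import Counter


def pic(alleles):
    """Return PIC = 1 - sum(f_i^2) for one SSR locus.

    `alleles` holds one allele call per inbred; None marks missing data
    (an assumed convention) and is excluded from the frequencies.
    """
    observed = [a for a in alleles if a is not None]
    if not observed:
        raise ValueError("no allele calls at this locus")
    n = len(observed)
    counts = Counter(observed)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical allele calls (fragment sizes) for eight inbreds at one locus.
calls = [152, 152, 156, 156, 160, 160, 160, None]
print(round(pic(calls), 3))
```

A locus with a single allele gives a PIC of 0, and the value rises as more alleles occur at more even frequencies, which is why it serves as a per-locus diversity measure.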
The number of alleles per locus in common between pairs of inbreds was used to compute a measure of genetic distance between all pairs of the 88 inbreds based on the equation given by NEI and LI (1979): $D_{ij} = 1 - (2 N_{ij} / T_{ij})$

TABLE 1 - Maize hybrids listed according to their initial year of commercial release and type of cross (DX = double cross, TX = triple cross, MSX = modified single cross, SX = single cross).
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Hybrid    Decade    Year    Type of cross
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
351       1930s     1934    TX
322       1930s     1936    DX
307       1930s     1936    DX
317       1930s     1937    DX
330       1930s     1939    DX
336       1940s     1940    DX
340       1940s     1941    DX
339       1940s     1942    DX
344       1940s     1945    DX
352       1940s     1946    DX
350B      1940s     1948    DX
347       1950s     1950    DX
301B      1950s     1952    DX
354       1950s     1953    DX
329       1950s     1954    DX
354A      1950s     1958    DX
328       1950s     1959    DX
3618      1960s     1961    DX
3206      1960s     1962    DX
3306      1960s     1963    SX
3376      1960s     1965    SX
3390      1960s     1967    MSX
3571      1960s     1968    MSX
3334      1960s     1969    SX
3388      1970s     1970    MSX
3517      1970s     1971    MSX
3366      1970s     1972    SX
3301A     1970s     1974    SX
3529      1970s     1975    MSX
3541      1970s     1975    SX
3382      1970s     1976    SX
3377      1980s     1982    SX
3378      1980s     1983    SX
3475      1980s     1984    SX
3379      1980s     1988    SX
3362      1980s     1989    SX
3417      1990s     1990    SX
3394      1990s     1991    SX
3563      1990s     1991    SX
3489      1990s     1994    SX
3335      1990s     1995    SX
33A14     1990s     1997    SX
34G81     1990s     1997    SX
33G26     1990s     1998    SX
34B23     1990s     1999    SX
33B51     1990s     1999    SX
33P67     1990s     1999    SX
34M95     2000s     2001    SX
34H31     2000s     2002    SX
34H32     2000s     2003    SX
34N42     2000s     2003    SX
33A84     2000s     2004    SX
33N09     2000s     2004    SX
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––


where $N_{ij}$ is the number of common bands for both inbreds $i$ and $j$, and $T_{ij}$ is the total number of bands for both inbreds $i$ and $j$. Thus, the Nei-Li measure quantifies the proportion of the measured marker alleles that differ between the two inbreds being compared. The Nei-Li distance matrix was used to conduct principal components analysis (PCA; SAS PRINCOMP procedure; SAS, 1999) of the 88 inbreds to quantify and visualize the similarities and differences among the inbreds.

To further examine changes in the frequencies of alleles over time, the inbred parents of the hybrids were grouped into decade and double-decade periods based on the year of release of the respective hybrid (Table 1). The frequencies of alleles were compared for different periods representing the temporal sequence of the breeding program. Of particular interest were comparisons of allele frequencies between early and late periods that represented important changes in the germplasm used in the breeding program (SEBASTIAN et al., 1995; HANAFEY et al., 1998; DUVICK et al., 2004a,b; SMITH et al., 2004).

Simulation
The PCA and the comparison of allele frequencies between early and late decades indicated that the frequencies of alleles at the SSR loci had changed over the history of the breeding program. It was therefore of interest to determine which, if any, of the observed changes were large relative to the changes expected from the pedigree history of the inbreds in the absence of any within-cross selection. Analyses that focused on this aspect of the temporal changes observed for the SSR alleles were intended to identify regions of the genome that were putatively under the influence of directional selection for alternative alleles at some point in the history of the breeding program (SEBASTIAN et al., 1995; HANAFEY et al., 1998).
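As a side illustration of the Nei-Li distance defined under Statistical analysis, the sketch below computes the distance for one pair of inbreds. It assumes each inbred is represented as a set of (locus, allele) bands; the function name, locus labels, and fragment sizes are hypothetical.

```python
def nei_li_distance(bands_i, bands_j):
    """Return D_ij = 1 - 2 * N_ij / T_ij for two band profiles.

    N_ij is the number of bands shared by both inbreds and T_ij is the
    total number of bands scored in inbred i plus inbred j.
    """
    bands_i, bands_j = set(bands_i), set(bands_j)
    total = len(bands_i) + len(bands_j)
    if total == 0:
        raise ValueError("both profiles are empty")
    common = len(bands_i & bands_j)
    return 1.0 - 2.0 * common / total


# Hypothetical profiles: one (locus, fragment size) band per SSR locus.
inbred_a = {("ssr1", 152), ("ssr2", 200), ("ssr3", 98)}
inbred_b = {("ssr1", 152), ("ssr2", 204), ("ssr3", 98)}
print(round(nei_li_distance(inbred_a, inbred_b), 3))
```

Identical profiles give a distance of 0 and profiles with no shared bands give 1. Applying the function to every pair of inbreds yields the symmetric matrix that a PCA would take as input.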
Preliminary indications of such putative selection regions (PSRs) were obtained by simulating the pedigree relationships from founders to inbreds across early-to-late periods of the Era sequence. The simulations provided sets of expected distributions of allele frequency changes based on pedigree relationships in the absence of any directional selection within crosses. The observed allele frequencies for the respective early-to-late period contrasts were compared to the simulated expectations to obtain an indication of the likelihood of the observed changes based on the pedigree history in the absence of selection. Simulations were conducted using the QU-GENE software (PODLICH and COOPER, 1998). Here we consider the simulation of the pedigree history for the subset of 17 hybrids that had their commercial release in the period from 1990 to 2004 (Table 1). The pedigrees of the parents of the hybrids were traced back to the earliest and most distantly related ancestral inbreds for each of the female parents (referred to as the Stiff Stalk population, SS) and the male parents (referred to as the Non-Stiff Stalk population, NSS). Those ancestral inbreds were then designated as founder members for the SS and NSS populations, respectively. Allele haplotype profiles using the 356 mapped SSR markers were constructed for all founder and hybrid parent inbreds and used in the simulation study. The expected transfer of the allele haplotype of the founder inbreds to the inbred parents of the modern Era hybrids was traced for all steps in each of the pedigree relationships from founder to modern inbreds assuming random sampling of the founder alleles at all steps in the pedigree. Ten thousand simulations of the transfer of the founder allele haplotypes to the modern inbreds were conducted. The expected distribution of allele frequencies in the modern inbreds, given


their starting frequency and distribution in the founders, was displayed graphically by defining boundaries for different percentages (80%, 95%, 99% and 100%) of the total 10,000 simulations resulting in a shift in allele frequency from the founder to modern inbreds. The observed frequencies of SSR alleles in the founder and modern inbred sets were then superimposed on the expected distributions to determine whether any of the observed allele changes were greater than could be expected on the basis of random sampling of alleles between steps within the pedigree structures. Any of the observed allele shifts that occurred outside the upper and lower boundaries set by 95% of the 10,000 simulations were identified as PSRs. The analyses were conducted separately for the SS and NSS parents of the modern hybrids.
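The logic of comparing an observed allele shift against a pedigree-based expectation can be sketched as a simple single-locus "gene drop". This is not the QU-GENE implementation used in the study. It is a minimal sketch assuming that each derived inbred inherits the allele of one of its two parents with equal probability, with linkage and all other breeding details ignored. The pedigree, the inbred names, and the observed frequency are hypothetical.

```python
import random


def simulate_modern_frequency(founders, pedigree, modern, n_sims=10_000, seed=0):
    """Simulate the frequency of a focal allele among modern inbreds.

    founders: dict name -> 1 if the founder carries the focal allele, else 0.
    pedigree: list of (child, parent_a, parent_b), parents listed before children.
    modern:   names of the inbreds whose allele frequency is recorded.
    """
    rng = random.Random(seed)
    freqs = []
    for _ in range(n_sims):
        state = dict(founders)
        for child, parent_a, parent_b in pedigree:
            # Random sampling: the child takes the allele of either parent.
            state[child] = state[rng.choice((parent_a, parent_b))]
        freqs.append(sum(state[m] for m in modern) / len(modern))
    return freqs


def central_interval(values, coverage=0.95):
    """Return approximate lower and upper bounds holding `coverage` of the values."""
    ordered = sorted(values)
    tail = (1.0 - coverage) / 2.0
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int(round((1.0 - tail) * (len(ordered) - 1)))]
    return lower, upper


# Hypothetical pedigree: four founders, two intermediates, six modern inbreds.
founders = {"F1": 1, "F2": 0, "F3": 0, "F4": 0}
pedigree = [("A", "F1", "F2"), ("B", "F3", "F4")]
modern = [f"M{k}" for k in range(1, 7)]
pedigree += [(m, "A", "B") for m in modern]

freqs = simulate_modern_frequency(founders, pedigree, modern)
lower, upper = central_interval(freqs, coverage=0.95)
observed = 1.0  # hypothetical observed frequency in the modern inbreds
is_psr_candidate = observed < lower or observed > upper
print(lower, upper, is_psr_candidate)
```

An observed frequency outside the central 95% band is treated here as a candidate PSR, mirroring the threshold stated in the text. A fuller treatment would simulate whole haplotypes with recombination rather than independent loci.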

RESULTS

Allele shifts from the 1930-40s to the 1990-2000s
For the complete set of 88 inbreds and 1698 SSRs, 6242 distinct alleles were identified. The average number of alleles per SSR locus was 4.0 (range = 1 to 20) and the average PIC score was 0.50 (range = 0 to 0.9). Changes in the frequencies of the individual alleles from early periods (e.g., 1930-40s) to more recent periods (e.g., 1990-2000s) were investigated (Fig. 1). Many of the individual alleles showed unique temporal patterns over decades (see NIEBUR et al., 2004, for some examples). A large group of alleles (1457) was present in the old inbreds from the 1930-40s but absent in the inbreds of the 1990-2000s. Similarly, another large group of alleles (1858) was present in the inbreds of the 1990-2000s but absent in the inbreds of the 1930-40s. The largest group of alleles (2927) comprises those alleles that were found in both the old and newer inbreds.

Within the set of 1457 alleles that were present in the old inbreds from the 1930-40s but absent in the inbreds from the 1990-2000s, the majority were observed at low frequencies in the old inbreds (Fig. 2a). In contrast, for the 1858 alleles present in the inbreds from the 1990-2000s that were not present in the inbreds from the 1930-40s, the frequencies in the modern inbreds were more uniformly distributed (Fig. 2b), suggesting that new alleles had been introduced and deployed in the inbreds of the modern Era hybrids. Within the group of 2927 alleles observed in both the old and new inbreds, 16 alleles were fixed (i.e., at 100%) at their respective SSRs in both the old and new inbred sets (Fig. 1). These alleles may indicate regions of the genome (and their associated genes and traits) that have been maintained in the


term view of the Pioneer maize breeding effort is of a large open breeding program with some key founder sources that have been important throughout the history of the breeding effort and other germplasm sources that have come and gone and still others that are being continually introduced. For the subset of 361 SSR loci selected for further study there were 2278 distinct alleles, with an average of 6.3 alleles per locus and an average PIC score of 0.61 (Fig. 3). The number of alleles and PIC score per SSR for these 361 SSR loci were summarized by decade (Table 2). There was a general trend towards a lower number of alleles per locus and lower PIC score from early to later decades. The temporal trends in patterns of change in allele profiles across the decades were investigated further by PCA and simulation modeling of the pedigree history leading to the creation of the modern inbreds.
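The partition of alleles into those found only in the old inbreds, only in the new inbreds, or in both can be sketched as follows. The function name, the dictionary layout, and the example frequencies are assumptions made for illustration. Only the three group totals checked at the end are taken from the text.

```python
def classify_alleles(old_freq, new_freq):
    """Partition alleles by presence in the old and new inbred sets.

    old_freq / new_freq: dict mapping an allele label to its frequency in the
    1930-40s and 1990-2000s inbreds; a missing label counts as frequency 0.
    """
    groups = {"old_only": [], "new_only": [], "shared": []}
    for allele in sorted(set(old_freq) | set(new_freq)):
        old = old_freq.get(allele, 0.0)
        new = new_freq.get(allele, 0.0)
        if old > 0 and new == 0:
            groups["old_only"].append(allele)
        elif old == 0 and new > 0:
            groups["new_only"].append(allele)
        elif old > 0 and new > 0:
            groups["shared"].append(allele)
    return groups


# Hypothetical frequencies for alleles at two SSR loci.
old = {"ssr1:152": 0.1, "ssr1:156": 0.9, "ssr2:200": 1.0}
new = {"ssr1:156": 0.6, "ssr1:160": 0.4, "ssr2:200": 1.0}
groups = classify_alleles(old, new)
print({name: len(members) for name, members in groups.items()})

# The three group sizes reported in the text account for all 6242 alleles.
assert 1457 + 1858 + 2927 == 6242
```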

[Figure residue. Axis label: Allele Frequency in New (1990s and 2000s) Inbreds (30). Legend categories with allele counts: Old=0 & New>0 (1858); Old>0 & New=0 (1457); Old=1 & New=1 (16); Old=1 & New>0 (30); Old>0 & New=1 (35).]