Genome-wide network model capturing seed germination reveals coordinated regulation of plant cellular phase transitions

George W. Bassel^a,b,1, Hui Lan^c, Enrico Glaab^d, Daniel J. Gibbs^a, Tanja Gerjets^a, Natalio Krasnogor^d, Anthony J. Bonner^c, Michael J. Holdsworth^a, and Nicholas J. Provart^b

^a Division of Plant and Crop Sciences, School of Biosciences and Centre for Plant Integrative Biology, University of Nottingham, Loughborough LE12 5RD, United Kingdom; ^b Department of Cell and Systems Biology, Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada M5S 3B2; ^c Department of Computer Science, University of Toronto, Toronto, ON, Canada M5S 2E4; and ^d School of Computer Science, University of Nottingham, Nottingham NG8 1BB, United Kingdom

Edited by Maarten Koornneef, Max Planck Institute for Plant Breeding Research, Cologne, Germany, and approved April 11, 2011 (received for review January 20, 2011)

PLANT BIOLOGY

Seed germination is a complex trait of key ecological and agronomic significance. Few genetic factors regulating germination have been identified, and the means by which their concerted action controls this developmental process remains largely unknown. Using publicly available gene expression data from Arabidopsis thaliana, we generated a condition-dependent network model of global transcriptional interactions (SeedNet) that shows evidence of evolutionary conservation in flowering plants. The topology of the SeedNet graph reflects the biological process, including two state-dependent sets of interactions associated with dormancy or germination. SeedNet highlights interactions between known regulators of this process and predicts the germination-associated function of uncharacterized hub nodes connected to them with 50% accuracy. An intermediate transition region between the dormancy and germination subdomains is enriched with genes involved in cellular phase transitions. The phase transition regulators SERRATE and EARLY FLOWERING IN SHORT DAYS from this region affect seed germination, indicating that conserved mechanisms control transitions in cell identity in plants. The SeedNet dormancy region is strongly associated with vegetative abiotic stress response genes. These data suggest that seed dormancy, an adaptive trait that arose evolutionarily late, evolved by coopting existing genetic pathways regulating cellular phase transition and abiotic stress. SeedNet is available as a community resource (http://vseed.nottingham.ac.uk) to aid dissection of this complex trait and gene function in diverse processes.

Author contributions: G.W.B., A.J.B., M.J.H., and N.J.P. designed research; G.W.B., H.L., E.G., D.J.G., T.G., and N.K. performed research; G.W.B. and M.J.H. analyzed data; and G.W.B. and M.J.H. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Freely available online through the PNAS open access option.

1 To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1100958108/-/DCSupplemental.

PNAS | June 7, 2011 | vol. 108 | no. 23 | 9709–9714

www.pnas.org/cgi/doi/10.1073/pnas.1100958108

Seed germination controls the entry of plant species into ecosystems and is the starting point for the majority of global food production (1, 2). Germination in seeds is defined as events occurring between the initial uptake of water (i.e., imbibition) by a dry seed and the emergence of the embryonic root (1). An imbibed viable seed failing to germinate under favorable conditions is dormant. A period of dry storage (i.e., after-ripening) breaks dormancy, as do environmental cues such as chilling. Germination is principally regulated by the hormones gibberellic acid (GA) and abscisic acid (ABA), which promote and inhibit this process, respectively (1, 3). Seed germination represents one of the two major irreversible developmental-phase transitions in plants, the other being the commitment to flower. By using classical genetic screens, genes controlling this transition in seeds have been uncovered (1, 2), yet the means by which they interact to regulate germination remains largely unknown. In this work, this gap in our knowledge is addressed by predicting unique regulators of seed germination through the development of a transcriptional network graph representing genes expressed during this stage of plant development.

Statistical approaches have previously been used to extract greater meaning from existing postgenomic datasets (4, 5). Correlation of gene expression (coexpression) is a powerful approach to analyze large datasets, as coexpressed genes have an increased likelihood of being involved in the same biochemical/developmental pathways (4, 6, 7). Coexpression analyses in Arabidopsis have used gene expression data from diverse anatomical, developmental, and physiological sources (4, 5). Although “condition-independent” meta-analyses such as AraNet provide useful generalized data, selecting datasets in a “condition-dependent” fashion (4, 8) should more precisely identify gene interactions relevant to a specific biological question at hand.

In this study, we used publicly available gene expression data generated exclusively from imbibed mature seeds. We show that this condition-dependent network predicts known and yet-undescribed regulators and genetic interactions between them. We show that the network topology is evolutionarily conserved and uncover a surprising and previously undescribed conservation of genetic pathways regulating multiple cellular phase transitions in plants. Finally, we suggest that abiotic stress pathways have been coopted during the evolution of seeds to acquire the trait of seed dormancy.

Results and Discussion

Generation, Analysis, and Visualization of Condition-Dependent Gene Expression Networks.

Publicly available gene expression data from imbibed Arabidopsis seeds were collated as previously described (9–16) and used for large-scale analysis of gene expression. A total of 138 samples were used, representing three ecotypes and eight mutants sampled under diverse physiological and environmental conditions over a wide temporal range. These data provided a generalized overview of gene expression in imbibed seeds within a physiologically and ecologically relevant context. In each sample, the ability to complete germination depended on developmental state, environmental conditions, or gene mutation, totaling 73 nongerminating and 65 germinating samples (Table S1). We use the term “nongermination” to encompass both experimentally manipulated lack of germination and developmentally controlled dormancy. A total of 1,844 transcripts were identified by Significance Analysis of Microarrays (SAM) (17) as being significantly associated with nongermination (SAM NG gene list), whereas 1,583 transcripts were associated with germination (SAM G; Dataset S1). These SAM lists contain an abundance of previously described regulators of this process, demonstrating the robust output of this analysis.

Correlations between gene expression profiles were determined, and agglomerative hierarchical clustering was performed on the unweighted network by using the average linkage between the groups. A dendrogram and a linear gene ordering were produced, revealing the presence of three large clusters of interactions (Fig. 1A). Fifty genes from each SAM list were randomly chosen and plotted within the visualized clusters. The SAM NG genes were closely associated with cluster 1, whereas SAM G genes were associated with clusters 2 and 3 (Fig. 1A), demonstrating an association of cluster 1 with nongermination and clusters 2 and 3 with germination. Clusters were not a consequence of anticorrelation, as anticorrelated interactions are distinct from their positive counterparts (Fig. S1 A and B). Covariance analysis (Fig. 1B) showed that state-dependent clustering of gene interactions was not a consequence of preferential expression (Fig. 1B, blue dots) during nongermination or germination, as the contribution of preferential expression to the covariance between genes is very small. The total covariance between genes (Fig. 1B, black dots) is largely caused by other factors (Fig. 1B, red dots), specifically correlations during germination and correlations during dormancy. These clusters are therefore biologically robust and not a consequence of statistical artifacts.

Fig. 1. Properties and topologies of gene coexpression networks in Arabidopsis seeds. (A) Coexpression matrix showing gene interactions with a correlation coefficient above 0.6 (>0.75 in Fig. S2). Cluster 1 is highlighted by a red box and corresponds to an abundance of 50 randomly chosen SAM NG genes plotted within the matrix (red dots). Blue boxes indicate the location of clusters 2 and 3, which are associated with a high concentration of SAM G genes (blue dots). (B) Plot of the covariance breakdown for 100,000 randomly selected gene pairs. Black dots represent total covariance, blue dots represent covariance caused by preferential expression, and red dots represent covariance caused by other factors. (C) The SeedNet unweighted seed gene coexpression network. Regions 1, 2, and 3 are outlined in yellow and correspond to clusters 1, 2, and 3 in A. WGCNA network using all seed samples (D), only nongerminating seed samples (E), and only germinating seed samples (F). In all graphs, circles represent genes and lines represent significant transcriptional interactions between the genes. SAM NG genes are colored red and SAM G genes colored blue. Gray nodes are not classified by either gene list. All networks are displayed using Cytoscape organic layout.

Transcriptional interactions with an absolute Pearson correlation coefficient greater than or equal to 0.75 were selected to generate the coexpression network, because this cutoff most closely fit a power law among the range of thresholds examined (18) (Fig. S2). At this level, genes have 56% of variance in common (4), resulting in a network consisting of 8,621 nodes with 502,173 interactions that is scale-free over three orders of magnitude. Whether biological networks should follow the power law remains controversial, yet this metric provides a rational means with which to establish a gene correlation cutoff on a genome-wide scale. A false discovery rate (FDR) computation was applied to the edges of the network, and all were found to be highly significant at this threshold. We termed this unweighted gene network SeedNet. SeedNet was imported into Cytoscape to generate a graph network visualization by using the organic layout method (Fig. 1C). This graph shows two major regions of interactions (regions 1 and 3) that are connected (region 2). These three regions correspond to the three clusters observed in Fig. 1A. SAM NG genes localized to region 1 and SAM G genes to regions 2 and 3 (Fig. 1C). SeedNet is publicly available to query (http://vseed.nottingham.ac.uk) (19).

To determine whether the graph topology of the unweighted SeedNet was robust, weighted gene networks were generated using weighted gene correlation network analysis (WGCNA) (20–22). By using this approach, two major connected regions of transcriptional interactions were observed (Fig. 1D) associated with SAM NG or SAM G gene sets. When only nongerminating seed samples were analyzed, a network consisting of a large single cluster of SAM NG genes was produced (Fig. 1E).
When only germinating seed samples were analyzed, two discrete networks were observed, one tightly connected and associated with SAM NG genes and the other more loosely associated, containing SAM G genes (Fig. 1F). Many of the highest-order nodes of SeedNet were represented by SAM NG genes (Fig. S1 C and D). Conversely, the frequency of SAM G genes decreases with greater node degree. There is therefore a greater coordination of transcriptional regulation in the nongerminating state than during germination. Taken together, transcriptional interactions in each of the nongerminating and germinating states are distinct, with the nongerminating state showing greater connectivity and transcriptional coordination. This is intriguing, as nongerminating samples were derived from a range of diverse physiological, environmental, and genetic perturbations, suggesting that common transcriptional mechanism(s) are responsible for the inhibition of germination under diverse states.

Probing the Topology of SeedNet.

Studies examining Arabidopsis seed germination by using whole-genome microarrays have identified sets of genes affected by developmental, hormonal, and environmental factors that influence this process. To better understand the SeedNet graph topology, we examined the distribution of these previously reported gene sets. Genes associated with dormancy (9, 16) were exclusively located within region 1 of the network, whereas those with germination were associated with region 3 (Fig. 2A). Transcripts induced by the dormancy-breaking process of after-ripening were associated with region 3 of the network, whereas those down-regulated by after-ripening were closely associated with region 1 (Fig. 2B) (13). These observations indicate distinct developmentally regulated zones of gene interactions within subdomains of the network, connecting region 1 with the developmental state of dormancy, and region 3 with germination.
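
The gene sets examined here are overlaid on a network built by thresholding pairwise expression correlations and ranking genes by node degree, as described in the preceding section. The following is a minimal sketch of that idea, not the authors' pipeline (the Methods state that correlations were computed in MatLab). The gene names, expression values, and the helper functions `pearson`, `build_network`, and `node_degree` are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither profile is constant (nonzero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def build_network(expr, cutoff=0.75):
    """Keep a gene pair as an edge when |r| >= cutoff.

    The network is unweighted: the stored r value is kept only for
    inspection and plays no role in the topology.
    """
    edges = {}
    for gene_a, gene_b in combinations(sorted(expr), 2):
        r = pearson(expr[gene_a], expr[gene_b])
        if abs(r) >= cutoff:
            edges[(gene_a, gene_b)] = r
    return edges


def node_degree(edges):
    """Count the number of edges touching each gene."""
    degree = {}
    for gene_a, gene_b in edges:
        degree[gene_a] = degree.get(gene_a, 0) + 1
        degree[gene_b] = degree.get(gene_b, 0) + 1
    return degree


# Hypothetical expression profiles across six samples.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8, 12.1],  # tracks geneA closely
    "geneC": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],   # anticorrelated with geneA
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],   # only weakly related
}

edges = build_network(expr)
degree = node_degree(edges)

# Shared variance at the cutoff is r squared: 0.75 ** 2 = 0.5625,
# which is the 56% figure quoted in the text.
shared_variance = 0.75 ** 2
```

In this toy example the weakly related gene receives no edges and therefore drops out of the network, while anticorrelated pairs are retained because the cutoff is applied to the absolute correlation. On a genome-wide scale the pairwise loop would normally be replaced by a vectorized correlation matrix.
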
Genes up-regulated in seeds by ABA application [which inhibits germination (13)] are predominantly located within region 1 of the network, although many are also present in region 3 (Fig. 2C). Conversely, ABA down-regulated genes are primarily associated with region 3. GA promotes germination, and transcripts up-regulated by GA application (11) were present almost exclusively in region 3 of the network and GA down-regulated genes were primarily located in region 1 (Fig. 2D). χ² tests showed all these distributions to be highly significant (P < 10⁻²³). These data are consistent with the “hormonal balance hypothesis” describing the regulation of germination (13), demonstrating that the topology of the network is robust, reflecting relationships between gene interactions and underlying developmental processes.

The effects of individual DELAY OF GERMINATION (DOG) QTLs controlling seed dormancy have been investigated by using transcriptomics (23). Genes whose expression is affected by these QTLs are evenly distributed throughout SeedNet (Fig. S3 A–D), indicating that natural variation has acted on regulators affecting diverse transcriptional pathways within SeedNet. This may reflect the nature of germination as a complex trait, whereby diverse environmental inputs can differentially influence germination potential in natural ecosystems.

Gene sets unrelated to germination did not show discrete distributions, such as those modulated by wounding in leaves (P = 0.07; Fig. S3E), indicating that the spatially discrete distribution of germination-related genes within SeedNet is robust with respect to the germination process (24, 25). However, genes up-regulated by drought (Fig. 2E) and other ABA-regulated vegetative abiotic stresses were associated with region 1 of SeedNet (P < 10⁻²⁰) (24), whereas those associated with non–ABA-related abiotic stresses, such as anoxia (26), were not (P = 0.06; Fig. S3 F–J). Diverse states of seed dormancy are mediated through both ABA and interactions with the environment (9), and region 1 of SeedNet is characterized by both the presence of genes associated with nongermination, and transcripts induced by multiple ABA-mediated abiotic stresses in the vegetative plant. Similarly, within the SAM NG list, the Gene Ontology (GO) category “response to abiotic stress” was significantly overrepresented (27). Seed dormancy may therefore have resulted from the evolutionary acquisition by seeds of pathways associated with ABA-mediated vegetative responses to abiotic stress.
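
The text reports χ² tests for whether a gene set is unevenly distributed across network regions, but does not give the exact form of the test. As an illustration only, the sketch below assumes the simplest case: a 2 × 2 table of gene-set membership against region membership, tested with Pearson's χ² statistic without continuity correction. The function name `chi_square_2x2` and all counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2 x 2 table [[a, b], [c, d]].

    Rows: genes in the set / genes not in the set.
    Columns: genes in the region / genes outside the region.
    Returns (statistic, p_value) for 1 degree of freedom.
    Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    cells = ((a, row1, col1), (b, row1, col2), (c, row2, col1), (d, row2, col2))
    stat = 0.0
    for observed, row_total, col_total in cells:
        expected = row_total * col_total / n
        stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper tail probability is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: 150 of 200 genes in a set fall in region 1,
# compared with 3,000 of the remaining 8,000 network genes.
stat, p_value = chi_square_2x2(150, 50, 3000, 5000)

# A set spread in proportion to region size gives no evidence of enrichment.
null_stat, null_p = chi_square_2x2(10, 10, 10, 10)
```

A table with more than two regions or categories would need the general χ² distribution with the matching degrees of freedom rather than the one-degree-of-freedom shortcut used here.
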
The hypothesis of evolutionary acquisition is supported by the observation that the ABI3 gene in the moss Physcomitrella patens is required for ABA-mediated desiccation tolerance in this nonseed plant (28). To examine the conservation of SeedNet, we plotted the position of nodes representing candidate orthologues to genes that are differentially expressed in dormant or germinating wheat embryos (Fig. 2F and Dataset S1). Within region 1, 72% of the differentially regulated genes were up-regulated in dormant embryos (P < 10⁻⁷), and in region 3, 79% were up-regulated in germinating embryos (P < 10⁻²³). SeedNet topology therefore shows evidence of evolutionary conservation between diverse flowering plant taxa.

Fig. 2. Examination of SeedNet topology using known concepts of the regulation of seed germination. (A) Distribution of genes associated with dormancy (green) and germination (blue) (9, 16). (B) Distribution of after-ripening up-regulated (purple) and down-regulated (orange) genes (13). (C) Distribution of ABA up-regulated (tawny) and down-regulated (blue) genes (13). (D) Distribution of GA up-regulated (pink) and down-regulated (green) genes (11). (E) Distribution of genes up-regulated by drought (purple) and also on the SAM NG list (green) in Arabidopsis leaves (25). (F) Genes regulated by seed dormancy in wheat embryos. Dormancy up-regulated genes (pink) and germination-associated genes (green). (G) Distribution of genes involved in the developmental and hormonal regulation of seed germination within the 10 most significant modules identified by MCODE within SeedNet.

Discrete Network Modules Represent Distinct Paths of Information Flow.

The Molecular Complex Detection (MCODE) algorithm (29) identified 136 modules (clusters of significantly interacting genes) within SeedNet. Modules were annotated with the SAM NG/G and other previously published gene lists (Fig. 2 and Dataset S1). Modules were discretely associated with either dormancy and ABA or germination and GA, or contained genes that are not developmentally or hormonally regulated (Fig. 2G). Within germination/GA-associated modules (e.g., module 2) genes associated with cell expansion (1), including cell wall remodelling enzymes, protein translation, cytoskeleton, and water channels, are present. Within dormancy/ABA-associated modules (e.g., module 1) are an abundance of key regulatory genes that control the capacity of seeds to complete germination. Promoters of genes within these modules have distinct sets of significantly enriched cis-regulatory elements (Table S2), implying their transcriptional coordination by distinct classes of transcription factors.

Use of SeedNet to Identify Regulators.

Previously defined genetic regulators of germination and dormancy (Dataset S1) (1) are present throughout SeedNet, with an enrichment in region 1 (Fig. 3A). Within this region, many key regulators of germination interact, including the major dormancy QTL DOG1 (23) and genes both positively (i.e., GID1A, GID1C, GA3, AHG3) and negatively (i.e., ABI3, ABI5, ABA1) regulating germination (Fig. 3B). These transcriptional interactions between genes with opposite activities in the regulation of germination may reflect aspects of negative feedback regulation, as has been shown previously in the case of GA biosynthesis (30). This may also indicate the concurrent activity of antagonistically acting signaling factors modulating signals to maintain dormancy or to commit to the completion of germination. Known germination-regulating interactions are also captured by SeedNet, including that between ABI3 and ABI5 (31) and between KEG1 and ABI5 (32).

Fig. 3. Transcriptional control of known regulatory genes, identification and function of regulators, and molecular validation of functional interactions. (A) Distribution of known dormancy and germination regulatory genes in SeedNet. Genes promoting dormancy are colored red and those promoting germination are colored blue. (B) Interactions between previously described and newly identified germination regulatory genes within the dormancy network located within the black box in A. Genes promoting dormancy are red, and those promoting germination are blue. Previously described regulators are labeled with black text and newly identified regulators are in dark blue. Known interactions are indicated by green edges and newly identified interactions (from G and H) by pink edges. Increased node size corresponds to higher degree and edge width to interaction strength. (C) Dose response of newly characterized mutant seeds to exogenous ABA. (D) Same as C using the GA synthesis inhibiting compound paclobutrazol (PAC). (E) Germination of mutant seeds at 10 °C. (F) Germination of agl67 mutant seeds 2 wk after harvest assayed directly (not chilled) or following 48 h of cold treatment at 4 °C to remove dormancy (chilled). (G) Increased stability of ABI3 and ABI5 proteins in the scl14-1 mutant in the presence of 0.5 μM ABA. (H) Increased stability of ABI5 protein in asg2-1 in the presence of 0.5 μM ABA. (I) Interactions predicted by AraNet between the genes in the graph in B. Interactions common to both networks are represented by red colored edges. Data in C–F are means from four independent experiments; error bars show SD.

We analyzed the phenotypes of mutants of genes with no ascribed function representing nodes of the highest degree (i.e., hubs) within region 1. These genes represent high-confidence regulatory candidates as an abundance of known regulatory factors are present in region 1, whereas, by definition, these hub nodes have the greatest number of transcriptionally coordinated gene partners in seeds (Table S3). The phenotypes of the top eight highest-order genes for which homozygous T-DNA insertions could be isolated were examined (Table S3). We also examined the highest-ranking SAM NG genes, as this list of differentially regulated genes contains an abundance of previously described regulators (Table S4). Nine previously uncharacterized genes affecting the germination of seeds under several different conditions were identified (Fig. 3 C–F and Fig. S4) including SAM NG genes ALTERED SEED GERMINATION1 (ASG1; At2g24100), HON5 DNA-binding (At1g48620), ANAC014 transcription factor (At1g33060), AGAMOUS-LIKE67 (AGL67; At1g77950), myb family transcription factor (ASG4; At1g01520), and SeedNet hub genes ARF-GAP DOMAIN2 (AGD2; At1g60680), SCARECROW-LIKE14 transcription factor (SCL14; At1g07530), transducin WD-40 (ASG2; At5g10940), and a putative transcription regulator (ASG3; At2g44980). Regulators of germination were predicted with an accuracy of 22% using genes within the statistically robust SAM NG list and 50% when examining hub genes within region 1 of the network. Uncharacterized hub genes in region 1 of SeedNet represent high-confidence candidate regulators awaiting further examination.

The seed-specific (14, 33) MADS-box transcription factor AGL67 is both a highly ranked SAM NG gene and within the top 2% of node degree in the network with 649 connections (Table S4).
This gene has both a tightly developmentally regulated transcript and an abundance of coexpressed gene partners. Mutant seeds lacking this gene have decreased seed dormancy (Fig. 3F), indicating it acts as a repressor of seed germination. Given that it is both highly coordinated with many other genes and has a developmentally regulated transcript that decreases unconditionally before germination, this gene may represent a central repressor analogous to the MADS-Box gene FLC in the regulation of flowering time (34). SCL14 acts to inhibit ABA repression of germination, and is connected in SeedNet to ABI3 and ABI5 (Fig. 3B). Both of these proteins remain stable in the presence of ABA in scl14-1 mutant seeds (Fig. 3G). Similarly, the transducin gene ASG2 is connected to ABI5, and ABI5 protein shows enhanced stability in the asg2-1 mutant in the presence of ABA (Fig. 3H). These regulators affect seed germination by modulating the stability of key regulatory proteins with which they are connected within SeedNet, which therefore is capable of correctly predicting the flux of information through these key regulatory proteins. SeedNet therefore represents a high-confidence template for hypothesis generation by using combinatorial mutants to deduce the concerted action by which multiple genetic factors regulate this complex trait.

Analysis of the core germination regulatory network (Fig. 3B) in the condition-independent network AraNet revealed only two shared interactions, between ABI5 and KEG, and between PHYD and PHYE (Fig. 3I). Despite the use of abundant and diverse data types to establish associations between genes, AraNet does not capture the majority of key interactions regulating seed germination. Although the use of more datasets increases statistical power, data from diverse tissue types decrease the likelihood of capturing developmental states or transitions that are not shared by all tissues sampled, and of identifying key interactions unique to these processes, highlighting the importance of the condition-dependent approach taken here.

Transition Region of the Network Captures Diverse Phase Transitions.

Transcripts of key regulators of germination associated with region 1 are rapidly down-regulated during germination (Fig. 4A). Region 2 of the network represents a discrete cluster of interactions (Fig. 1B) and connects regions 1 and 3 (Fig. 1C), suggesting these interactions mediate the transition between the nongerminating and germinating states. Intriguingly, statistically significant GO categories within this region include the biological process “vegetative to reproductive phase transition” (Q-value = 0.0046) and the molecular function “RNA metabolism” (Q-value = 1.2 × 10−12). Analysis of gene expression during a time course of germination shows that there are successive peaks in gene expression when analyzing genes from region 1, through region 2, to region 3 (Fig. 4 B and C). This suggests that this region of the graph captures transcriptional interactions associated with the developmental transition from the nongerminating state to that of germination. It also suggests that this transition is mediated by the progressive induction of transcripts during the time course of seed germination (Fig. S5). Two modules are specific to this transition region (module 52 containing 10 genes and module 58 containing 15 genes; Fig. 4C). Both contain genes previously shown to regulate phase transitions in vegetative tissues. Module 52 contains SERRATE (SE) (35) and module 58 contains EMBRYONIC FLOWER1 (EMF1) (36) and EARLY FLOWERING IN SHORT DAYS (EFS) (37). We demonstrated that efs-1 plants produce seeds exhibiting a range of Bassel et al.

previously unidentified phenotypes (Fig. 4D), including viviparous germination, indicating a failure to induce seed dormancy, and smaller seeds than WT, many of which were green at maturity. We also demonstrated that se mutant seeds have an extreme hypersensitivity to exogenous ABA during germination (Fig. 4E). As the ability to enter into dormancy is thought to be an adaptive event that occurred evolutionarily late (38), it is tempting to speculate that genes involved in other phase transitions, such as flowering and juvenile-to-adult transitions, were coopted to regulate the transition from dormancy to germination. EARLY BOLTING IN SHORT DAYS (EBS) (39), FLOWERING LOCUS C (34), FLOWERING LOCUS T (40), and ABI3 (41) were all shown to regulate germination and flowering time, demonstrating a potential regulatory links between these phase transitions. Conclusion We present a network model describing global transcriptional interactions mediating the maintenance of, and transition between, the two discrete developmental states of dormancy and germination in seeds. Both states are associated with key agronomic and ecological traits, and understanding their associated regulatory networks will facilitate research aimed to enhance seed performance in agriculture as well as understanding the ecological regulation of germination in seed banks within the soil. SeedNet accurately defines seed germination regulators. The network also predicts interactions between these regulators, and 1. Holdsworth MJ, Bentsink L, Soppe WJ (2008) Molecular networks regulating Arabidopsis seed maturation, after-ripening, dormancy and germination. New Phytol 179:33–54. 2. Nambara E, et al. (2000) The role of ABI3 and FUS3 loci in Arabidopsis thaliana on phase transition from late embryo development to germination. Dev Biol 220: 412–423.

Bassel et al.

Materials and Methods Network Generation. Publicly available microarray data were compiled and annotated with respect to developmental status as previously described (14). SAM (17) was performed using all samples, and WCGNA (20) using the one-step automatic network construction (21). Gene correlations were calculated by using the Pearson correlation coefficient within MatLab. A cutoff threshold of 0.75 was chosen for SeedNet, as this generates a network most closely following a power-law distribution. Significant gene interactions were imported into Cytoscape version 2.6.3 for visualization and additional analyses. Plant Materials. Arabidopsis seeds were obtained from the Nottingham Arabidopsis Stock Centre (NASC), and germination assays carried out as previously described (42). Protrusion of the radicle through the endosperm was used as the criteria for germination following 2 d stratification and 7 d incubation. Microarray hybridizations of dormant and nondormancy wheat embryos (variety Option) were carried out at NASC using the GeneChip wheat genome array (Affymetrix). SI Materials and Methods includes a detailed description of these procedures. ACKNOWLEDGMENTS. The authors are grateful to J. Marquez and D. Scholefield (University of Nottingham) for technical assistance, J. Foong (University of Toronto) for preliminary investigations, and P. McCourt (University of Toronto), M. Bennett, Z. Wilson, and G. Seymour (University of Nottingham) for helpful comments prior to submission. We thank Brynjar Gretarsson (University of California) for helping to adjust the WiGis visualization framework used for the online SeedNet tool. The ABI5 antibody was provided by Rick Vierstra (University of Wisconsin). ABI3 antibody was provided by Kazumi Nakabayashi (Max Planck Institute for Plant Breeding Research) and Eiji Nambara (University of Toronto). 
This work was supported by a Natural Sciences and Engineering Research Council (NSERC) Postdoctoral Fellowship and Marie Curie International Incoming Fellowship (to G.W.B.) and Centre for Plant Integrative Biology Grants BB/ D019613/1 (to M.J.H. and N.K.) and D.J.G. was supported by BBG0105951. M.J.H. was supported by Biotechnology and Biological Sciences Research Council (BBSRC) Grants BBG0105951 and BBG02488X1. T.G. was supported by a BBSRC/Defra LINK Grant BBD0073211, including financial support from RAGT Seeds, Limagrain, KWS UK, Elsoms, and Svalöf Weibull. H.L. and A.J.B. were supported by an NSERC Individual Discovery grant. N.J.P. is supported by an NSERC Discovery grant.

3. Bewley JD (1997) Seed germination and dormancy. Plant Cell 9:1055–1066. 4. Usadel B, et al. (2009) Co-expression tools for plant biology: Opportunities for hypothesis generation and caveats. Plant Cell Environ 32:1633–1651. 5. Lee I, Ambaru B, Thakkar P, Marcotte EM, Rhee SY (2010) Rational association of genes with traits using a genome-scale gene network for Arabidopsis thaliana. Nat Biotechnol 28:149–156.

PNAS | June 7, 2011 | vol. 108 | no. 23 | 9713

PLANT BIOLOGY

Fig. 4. The developmental phase transition between nongermination and germination is captured by SeedNet. (A) Heat map shows the relative transcript abundance of key regulators of dormancy and germination during the time course of seed germination. Hours after imbibition are indicated above the map. (B) Heat map of representative genes within different regions of the network, labeled with black circles in C and a lowercase letter to the right of the heat map. Scale below heat maps in A and B indicates log2-transformed transcript abundance relative to median expression on a gene-by-gene basis. (C) Network with nodes corresponding to region 2-specific modules 52 (red) and 58 (blue) highlighted. (D) Seed dormancy phenotype of efs. Different seeds harvested from efs mutant plants are shown, and the percentage of seeds with an altered phenotype is indicated below each image, relative to WT (Ler). (Scale bar: 100 μm.) (E) Response of serrate (se) mutant seeds to increasing concentrations of ABA relative to WT (Col). Data in E are means from four independent experiments; error bars show SD.
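The heat-map scale in the Fig. 4 legend (log2-transformed transcript abundance relative to median expression on a gene-by-gene basis) can be illustrated with a short Python sketch. The function name is hypothetical, and the code assumes that the median is taken across the time points of each gene after log2 transformation; the legend does not specify the exact order of operations.

```python
import numpy as np


def log2_relative_to_median(abundance):
    """Center each gene's log2 abundance on that gene's own median.

    abundance is a genes x time-points array of positive values.
    Assumption: the median is computed per gene (per row) on log2 values.
    """
    log_expr = np.log2(np.asarray(abundance, dtype=float))
    return log_expr - np.median(log_expr, axis=1, keepdims=True)
```

Under this convention, a value of 1 means a transcript is twofold more abundant than its median level across the time course, and a value of -1 means it is twofold less abundant.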

will provide a high-confidence template for additional hypothesis generation for both new regulators and their functional interactions. SeedNet is publicly available to query (www.vseed.nottingham.ac.uk and www.genemania.org). The accuracy of SeedNet reflects the utility of the condition-dependent approach used, and should serve as a principle for generating more accurate predictive network models in the future (4).

Within the core of region 1 of SeedNet, many previously described regulatory factors interact. These factors have both positive and negative effects on the regulation of seed germination. These concurrent antagonistic activities may represent the transcriptional circuit that modulates the germination-promoting and germination-inhibiting signals received by the seed as it continuously senses the surrounding environment in the dormant state. The decision to remain dormant or to enter germination depends on whether the net activity of germination-promoting signals overcomes that of their inhibiting counterparts. The data presented support this hypothesis, as regulators identified from this region act through known regulatory components to modulate the information flow controlling the entry of gene interactions into region 2, ultimately leading to the germination-associated gene interactions present in region 3.

Finally, this work has demonstrated the possible evolutionary co-option of existing transcriptional pathways by seeds to effect dormancy, an adaptive trait that arose late in evolutionary time (38). Both vegetative stress pathways and cellular-phase transition pathways are significantly overrepresented in distinct regions of the network, and provide a logical explanation for how seed dormancy arose.

6. Hughes TR, et al. (2000) Functional discovery via a compendium of expression profiles. Cell 102:109–126.
7. Aoki K, Ogata Y, Shibata D (2007) Approaches for extracting practical information from gene co-expression networks in plant biology. Plant Cell Physiol 48:381–390.
8. Saito K, Hirai MY, Yonekura-Sakakibara K (2008) Decoding genes with coexpression networks and metabolomics - ‘majority report by precogs’. Trends Plant Sci 13:36–43.
9. Finch-Savage WE, Cadman CS, Toorop PE, Lynn JR, Hilhorst HW (2007) Seed dormancy release in Arabidopsis Cvi by dry after-ripening, low temperature, nitrate and light shows common quantitative patterns of gene expression directed by environmentally specific sensing. Plant J 51:60–78.
10. Penfield S, Li Y, Gilday AD, Graham S, Graham IA (2006) Arabidopsis ABA INSENSITIVE4 regulates lipid mobilization in the embryo and reveals repression of seed germination by the endosperm. Plant Cell 18:1887–1899.
11. Ogawa M, et al. (2003) Gibberellin biosynthesis and response during Arabidopsis seed germination. Plant Cell 15:1591–1604.
12. Nakabayashi K, Okamoto M, Koshiba T, Kamiya Y, Nambara E (2005) Genome-wide profiling of stored mRNA in Arabidopsis thaliana seed germination: epigenetic and genetic regulation of transcription in seed. Plant J 41:697–709.
13. Carrera E, et al. (2008) Seed after-ripening is a discrete developmental pathway associated with specific gene networks in Arabidopsis. Plant J 53:214–224.
14. Bassel GW, et al. (2008) Elucidating the germination transcriptional program using small molecules. Plant Physiol 147:143–155.
15. Yamauchi Y, et al. (2004) Activation of gibberellin biosynthesis and response pathways by low temperature during imbibition of Arabidopsis thaliana seeds. Plant Cell 16:367–378.
16. Cadman CS, Toorop PE, Hilhorst HW, Finch-Savage WE (2006) Gene expression profiles of Arabidopsis Cvi seeds during dormancy cycling indicate a common underlying dormancy control mechanism. Plant J 46:805–822.
17. Tusher VG, Tibshirani R, Chu G (2001) Significance Analysis of Microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA 98:5116–5121.
18. Barabasi AL, Albert R (1999) Emergence of scaling in random networks. Science 286:509–512.
19. Gretarsson B, Bostandjiev S, O’Donovan J, Höllerer T (2009) WiGis: A framework for Web-based interactive graph visualizations. International Symposium on Graph Drawing (Springer-Verlag, Berlin), pp 119–134.
20. Zhang B, Horvath S (2005) A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol 4:Article17.
21. Glaab E, Garibaldi JM, Krasnogor N (2009) ArrayMining: A modular Web-application for microarray analysis combining ensemble and consensus methods with cross-study normalization. BMC Bioinformatics 10:358.
22. Fuller TF, et al. (2007) Weighted gene coexpression network analysis strategies applied to mouse weight. Mamm Genome 18:463–472.
23. Bentsink L, et al. (2010) Natural variation for seed dormancy in Arabidopsis is regulated by additive genetic and molecular pathways. Proc Natl Acad Sci USA 107:4264–4269.


24. Kilian J, et al. (2007) The AtGenExpress global stress expression data set: protocols, evaluation and model data analysis of UV-B light, drought and cold stress responses. Plant J 50:347–363.
25. Kleine T, Kindgren P, Benedict C, Hendrickson L, Strand A (2007) Genome-wide gene expression analysis reveals a critical role for CRYPTOCHROME1 in the response of Arabidopsis to high irradiance. Plant Physiol 144:1391–1406.
26. Mustroph A, et al. (2009) Profiling translatomes of discrete cell populations resolves altered cellular priorities during hypoxia in Arabidopsis. Proc Natl Acad Sci USA 106:18843–18848.
27. Zhou X, Su Z (2007) EasyGO: Gene Ontology-based annotation and functional enrichment analysis tool for agronomical species. BMC Genomics 8:246.
28. Khandelwal A, et al. (2010) Role of ABA and ABI3 in desiccation tolerance. Science 327:546.
29. Bader GD, Hogue CW (2003) An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics 4:2.
30. Yamaguchi S, Kamiya Y (2000) Gibberellin biosynthesis: Its regulation by endogenous and environmental signals. Plant Cell Physiol 41:251–257.
31. Nakamura S, Lynch TJ, Finkelstein RR (2001) Physical interactions between ABA response loci of Arabidopsis. Plant J 26:627–635.
32. Stone SL, Williams LA, Farmer LM, Vierstra RD, Callis J (2006) KEEP ON GOING, a RING E3 ligase essential for Arabidopsis growth and development, is involved in abscisic acid signaling. Plant Cell 18:3415–3428.
33. Toufighi K, Brady SM, Austin R, Ly E, Provart NJ (2005) The Botany Array Resource: eNortherns, Expression Angling, and promoter analyses. Plant J 43:153–163.
34. Chiang GC, Barua D, Kramer EM, Amasino RM, Donohue K (2009) Major flowering time gene, flowering locus C, regulates seed germination in Arabidopsis thaliana. Proc Natl Acad Sci USA 106:11661–11666.
35. Yang L, Liu Z, Lu F, Dong A, Huang H (2006) SERRATE is a novel nuclear regulator in primary microRNA processing in Arabidopsis. Plant J 47:841–850.
36. Moon YH, et al. (2003) EMF genes maintain vegetative development by repressing the flower program in Arabidopsis. Plant Cell 15:681–693.
37. Soppe WJ, Bentsink L, Koornneef M (1999) The early-flowering mutant efs is involved in the autonomous promotion pathway of Arabidopsis thaliana. Development 126:4763–4770.
38. Linkies A, Graeber K, Knight C, Leubner-Metzger G (2010) The evolution of seeds. New Phytol 186:817–831.
39. Gómez-Mena C, et al. (2001) early bolting in short days: An Arabidopsis mutation that causes early flowering and partially suppresses the floral phenotype of leafy. Plant Cell 13:1011–1024.
40. Strasser B, Sánchez-Lamas M, Yanovsky MJ, Casal JJ, Cerdán PD (2010) Arabidopsis thaliana life without phytochromes. Proc Natl Acad Sci USA 107:4776–4781.
41. Kurup S, Jones HD, Holdsworth MJ (2000) Interactions of the developmental regulator ABI3 with proteins identified from developing Arabidopsis seeds. Plant J 21:143–155.
42. Holman TJ, et al. (2009) The N-end rule pathway promotes seed germination and establishment through removal of ABA sensitivity in Arabidopsis. Proc Natl Acad Sci USA 106:4549–4554.
