ARTICLES PUBLISHED: 27 JULY 2015 | ARTICLE NUMBER: 15107 | DOI: 10.1038/NPLANTS.2015.107

Lineage-specific chromatin signatures reveal a regulator of lipid metabolism in microalgae

Chew Yee Ngan1‡, Chee-Hong Wong1‡, Cindy Choi1, Yuko Yoshinaga1, Katherine Louie1,2, Jing Jia3, Cindy Chen1, Benjamin Bowen1,2, Haoyu Cheng1, Lauriebeth Leonelli4, Rita Kuo1, Richard Baran1,2, José G. García-Cerdán4, Abhishek Pratap1†, Mei Wang1, Joanne Lim1, Hope Tice1, Chris Daum1, Jian Xu3, Trent Northen1,2, Axel Visel1,5,6, James Bristow1, Krishna K. Niyogi2,7 and Chia-Lin Wei1*

Alga-derived lipids represent an attractive potential source of biofuels. However, lipid accumulation in algae is a stress response tightly coupled to growth arrest, thereby imposing a major limitation on productivity. To identify transcriptional regulators of lipid accumulation, we performed an integrative chromatin signature and transcriptomic analysis to decipher the regulation of lipid biosynthesis in the alga Chlamydomonas reinhardtii. Genome-wide histone modification profiling revealed remarkable differences in functional chromatin states between the algae and higher eukaryotes and uncovered regulatory components at the core of lipid accumulation pathways. We identified the transcription factor, PSR1, as a pivotal switch that triggers cytosolic lipid accumulation. Dissection of the PSR1-induced lipid profiles corroborates its role in coordinating multiple lipid-inducing stress responses. The comprehensive maps of functional chromatin signatures in a major clade of eukaryotic life and the discovery of a transcriptional regulator of algal lipid metabolism will facilitate targeted engineering strategies to mediate high lipid production in microalgae.

Algae naturally accumulate energy-dense oils that can be converted into transportation fuels, potentially rendering them an attractive system for large-scale biofuel production1. Algae-derived biofuels offer the promise of high areal productivity, minimal competition with food crops, utilization of a wide variety of water sources and CO2 capture from stationary emission sources2,3. However, high-yield lipid accumulation in algae is a stress response inducible through conditions like nutrient deprivation, which limits overall yield and thus commercial viability2,4. Extensive research efforts5–7 have been aimed at improving algal lipid productivity, but have failed to substantially boost intracellular lipid levels8.

The microalga Chlamydomonas reinhardtii is one of the model organisms for studying algal growth and lipid metabolism. Although it is not an oleaginous alga for industrial biofuel production9, C. reinhardtii is known to accumulate triacylglycerol (TAG) during nutrient stress and is amenable to well-established classical molecular and genetic methods10,11. A high-quality and functionally annotated genome sequence is available in public repositories12, and large collections of mutant strains have been produced (http://chlamycollection.org).

In higher plants, many stress-elicited responses are controlled at the level of transcriptional regulation13,14, particularly through the activation of microRNA15 or master transcription factors16,17. Because of substantial variation in transcript stability and degradation rates, transcript levels are an imperfect proxy for the transcriptional status of individual genes. This problem is likely to be exacerbated by the transient expression and potentially low abundance of stress-responding transcription factor transcripts, rendering their identification through transcription profiling alone difficult. Despite growing amounts of transcriptome data13,18,19, the molecular mechanisms that govern algal lipid production have remained elusive and only a single algal transcription factor, NRR1, has been functionally implicated in lipid accumulation. NRR1 is required for lipid accumulation only during nitrogen starvation in C. reinhardtii13 and its role has not been recapitulated in industrial oleaginous algae.

In contrast to transcriptome profiling, distinct patterns of histone modifications can reveal active or repressed chromatin states20 and can be used to infer the transcriptional activity of the associated genes21,22. For instance, alterations in histone modifications have been used to identify central regulatory genes in the Arabidopsis leaf senescence process23. We thus hypothesized that a combination of chromatin state and transcriptome changes induced by lipid-inducing starvation conditions in C. reinhardtii may provide a sensitive and specific readout for detecting key switches controlling the lipid accumulation process.

In this study, we constructed genome-wide maps of chromatin states and their dynamics in C. reinhardtii. Compared with patterns found in metazoans24,25 and land plants26, functional chromatin signatures in microalgae are a combination of both conserved and lineage-specific histone codes. We exploited chromatin signature changes to infer transcriptional regulators of lipid biosynthesis pathways and applied targeted genetic perturbation to confirm one of these transcription factor genes, PSR1, as a switch activating lipid accumulation. Our study provides insights into the regulation of TAG biosynthetic pathways and strategies for their targeted genetic engineering.

1 US Department of Energy Joint Genome Institute, Walnut Creek, California 94598, USA. 2 School of Natural Sciences, University of California, Merced, California 95343, USA. 3 Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA. 4 Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA. 5 Single-Cell Center, CAS Key Laboratory of Biofuels and Shandong Key Laboratory of Energy Genetics, Qingdao Institute of BioEnergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong 266101, China. 6 Howard Hughes Medical Institute, Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA. 7 Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA.
† Present address: Sage Bionetworks, Seattle, Washington 98109, USA.
‡ These authors contributed equally to this work.
* e-mail: [email protected]

NATURE PLANTS | www.nature.com/natureplants


Results

Mapping epigenomic changes in response to lipid-inducing conditions. To characterize chromatin states in C. reinhardtii and profile their changes in response to stress-induced lipid accumulation, we cultured C. reinhardtii cells under two different acute nutrient depletion schemes known to induce TAG accumulation, nitrogen and sulphur starvation13. Slowed cell growth and high lipid levels confirmed that the expected stress responses were achieved (Fig. 1). We used chromatin immunoprecipitation followed by sequencing (ChIP-seq) to profile the genome-wide distribution of RNA polymerase II (RNAPII) and five distinct post-translational modifications of histone H3: trimethylation of lysine residues 4 (H3K4me3), 9 (H3K9me3)27, 27 (H3K27me3) and 36 (H3K36me3), and acetylation of lysine 27 (H3K27ac). Profiles were generated in control cells cultured in tris-acetate-phosphate (TAP) medium and in cells 1 h after the onset of nitrogen or sulphur starvation (Fig. 1). The details of antibody characterization can be found in the Supplementary Information. ChIP-seq reads were mapped to the C. reinhardtii reference genome and used to determine modified regions with high reproducibility across biological replicates (Pearson correlation R > 0.96 in all cases; Supplementary Fig. 1 and Table 1).

To monitor the transcriptional responses associated with chromatin changes at high temporal resolution, we also performed deep RNA-seq analysis throughout the course of nutrient depletion up to 48 h after starvation, when lipid accumulation is pronounced. Comprehensive expression changes in both early (0–8 h, within one cell cycle) and late (24–48 h) phases were captured (Fig. 1). Similar to the epigenomic data, high correlations between biological replicates were observed (Pearson correlation ≥0.99, Supplementary Fig. 2a).
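As an illustration of the replicate reproducibility measure quoted above, the following minimal sketch computes a Pearson correlation between two replicates. The per-bin read counts, the variable names and the `pearson` helper are hypothetical and are not taken from the study, which reports correlations over genome-wide data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical read counts per genomic bin for two biological replicates.
replicate_1 = [12, 0, 3, 45, 7, 0, 22, 5]
replicate_2 = [10, 1, 4, 50, 6, 0, 20, 7]

r = pearson(replicate_1, replicate_2)
print(f"replicate correlation R = {r:.3f}")
```

A value close to 1 indicates that the two replicates rank and scale the bins in nearly the same way; how the signal is binned and normalized before this step is an analysis choice that is not specified here.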
To ensure that all transcripts specifically expressed in response to nitrogen and sulphur starvation were included in our analysis, we performed a reference-guided transcript assembly from these deep RNA-seq data sets, which revealed 298 previously unannotated transcripts (Supplementary Fig. 2b). Across all 22,209 transcripts assembled, approximately half are differentially expressed (more than twofold, P < 0.01) in at least one time point along the course of nitrogen or sulphur starvation (Supplementary Tables 2 and 3), consistent with previous studies showing extensive transcriptional changes associated with nutrient starvation and lipid induction18,28,29.

Plant- and alga-specific histone signatures. Similar to both land plants and metazoans, histone modifications in C. reinhardtii largely exhibit punctate patterns across the genome and are primarily clustered within 1 kb of the transcription start sites (TSSs) of annotated genes (Supplementary Fig. 3a,b). Examination of individual histone marks revealed similarities, but also marked differences, compared with the well-characterized histone code of animals and land plants. For example, H3K9me3 is associated with repressed heterochromatin in vertebrates30, but nearly universally (96% of H3K9me3 regions) co-localizes with the active marks H3K4me3 or H3K27ac in C. reinhardtii (Fig. 2a and Supplementary Fig. 3c). Although a general transcription-associated and promoter-centric distribution of H3K9me3 was observed in Arabidopsis26, the co-occurrence of H3K9me3 with active marks and its mutual exclusion with the repressive mark H3K27me3 observed in C. reinhardtii may be restricted to algae or could represent a previously unappreciated general plant-specific histone signature. The distribution of H3K36me3 on the C. reinhardtii genome diverges from that in both vertebrates31 and land plants26, where H3K36me3 spans broad regions along actively transcribed genes.
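The co-localization percentages above follow the overlap definition given in the Fig. 2a caption: the number of row-mark regions co-occupied by the column mark, divided by the number of row-mark regions. A minimal sketch of that ratio is shown below, assuming half-open (chromosome, start, end) intervals. The peak coordinates and function names are hypothetical, and the simple all-against-all scan is chosen for clarity rather than speed.

```python
def overlaps(a, b):
    """True if half-open intervals (chrom, start, end) share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def overlap_fraction(row_regions, column_regions):
    """Fraction of row-mark regions co-occupied by at least one column-mark region."""
    if not row_regions:
        raise ValueError("row mark has no regions")
    shared = sum(
        1
        for region in row_regions
        if any(overlaps(region, other) for other in column_regions)
    )
    return shared / len(row_regions)


# Hypothetical peak calls, for illustration only.
h3k9me3 = [("chr_1", 100, 600), ("chr_1", 2000, 2400),
           ("chr_2", 50, 300), ("chr_2", 900, 1200)]
h3k4me3 = [("chr_1", 150, 700), ("chr_1", 1900, 2100), ("chr_2", 0, 100)]

fraction = overlap_fraction(h3k9me3, h3k4me3)
print(f"H3K9me3 regions overlapping H3K4me3: {fraction:.0%}")
print(f"H3K4me3 regions overlapping H3K9me3: {overlap_fraction(h3k4me3, h3k9me3):.0%}")
```

Because the denominator is the row mark's region count, the measure is asymmetric, which is why a pairwise matrix of this kind need not be symmetric.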
Distinct from the pattern found in Arabidopsis, H3K36me3 is largely (90%; 7,978 out of 8,873 regions) confined to active promoters in C. reinhardtii (Supplementary Fig. 3d). The distinct peak shapes across species could affect their utility for the functional interpretation of chromatin states, with the narrow, sharp, promoter-centric peaks in C. reinhardtii potentially being a key feature for characterizing promoter activity (see below). Hence, although the general existence of these histone modifications is highly conserved, their functions appear to have diverged across the different eukaryotic clades.

Because the individual histone patterns were found to be unique in C. reinhardtii, we adopted an unsupervised approach to systematically analyse combinatorial patterns from these histone modifications and RNAPII occupancy32, which led to the identification of 16 distinct chromatin states (CS). Most of the genomic regions (87%) are devoid of any modification (CS 16). The remaining 15 states contain one or more marks in different combinations and are associated with different genomic locations (Fig. 2b and Supplementary Fig. 4a). CS 1–5 are mainly defined by H3K27me3, H3K36me3 and RNAPII and distributed among non-promoter regions, whereas CS 6–15 are mainly defined by H3K27ac, H3K4me3 and H3K9me3 and found around promoter regions (Supplementary Fig. 4b). Two states, CS 2 and CS 15, are of particular interest in comparison to known animal and plant chromatin signatures. CS 2 represents co-modified domains of an active mark (H3K36me3) and a repressive mark (H3K27me3) (Fig. 2c). Although such active and repressive co-modified regions have been observed in animal cells33, they have not been observed in land plants like A. thaliana. Consistent with this co-modified pattern, C. reinhardtii transcripts associated with H3K36me3/H3K27me3 are expressed at substantially lower levels than those without co-modifications (P < 2.2 × 10−16, Fig. 2c, right panel).
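To make the idea of combinatorial mark patterns concrete, the sketch below tabulates presence/absence combinations of the six profiled features across genomic bins. It is a deliberate simplification: the study used an unsupervised approach32 to learn chromatin states, whereas this example only counts observed combinations. The bins, their contents and the helper names are hypothetical.

```python
from collections import Counter

# The six features profiled in the study, in a fixed (arbitrary) order.
MARKS = ("H3K27ac", "H3K4me3", "H3K9me3", "H3K36me3", "RNAPII", "H3K27me3")


def pattern(present):
    """Return a tuple of 0/1 flags, one per entry of MARKS, for a set of marks."""
    unknown = set(present) - set(MARKS)
    if unknown:
        raise ValueError(f"unknown marks: {sorted(unknown)}")
    return tuple(int(mark in present) for mark in MARKS)


def tally_patterns(bins):
    """Count how many bins carry each presence/absence combination."""
    return Counter(pattern(marks) for marks in bins)


# Hypothetical genomic bins, each described by the set of marks called in it.
bins = [
    set(),                                # unmarked
    set(),
    set(),
    {"H3K36me3", "H3K27me3"},             # active plus repressive mark together
    {"H3K27ac"},                          # H3K27ac alone
    {"H3K27ac", "H3K4me3", "H3K9me3"},    # promoter-like combination
    {"H3K27ac", "H3K4me3", "H3K9me3"},
]

counts = tally_patterns(bins)
unmarked_share = counts[pattern(set())] / len(bins)
for flags, n in counts.most_common():
    marks = [m for m, f in zip(MARKS, flags) if f] or ["none"]
    print(f"{'+'.join(marks)}: {n} bin(s)")
```

A learned state model can merge several raw combinations into one state and assign probabilities rather than hard calls, so the number of distinct patterns counted this way need not equal the number of chromatin states reported.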
In contrast, CS 15 is defined mainly through H3K27ac and, on the basis of what has been observed in metazoan genomes, this signature is generally characteristic of distal transcriptional enhancers34. To evaluate its functional conservation, we profiled H3K4me2, a known enhancer mark co-localized with H3K27ac in mammalian cells35, in log-phase C. reinhardtii cells and found that CS 15 is enriched for H3K4me2 modification (Fig. 2d). Experimental validation of individual sequences identified by this signature confirmed their enhancer activity in 3 out of 11 cases tested (one-tailed Mann–Whitney test with P < 0.05) in a heterologous tobacco enhancer reporter assay, even though tobacco is only distantly related to algae (Supplementary Fig. 5). These data indicate the presence in algae of potential distant-acting regulatory elements similar to those extensively characterized in vertebrate genomes36 and suggest that these elements may play an important role in coordinating gene expression in algae and possibly other plants.

Promoter histone modification patterns reflect the transcriptional status of genes. Histone modification enrichment profiles in gene promoter regions reflect transcription activities (Fig. 3a). Among all 16 different chromatin states defined, five types of histone modification patterns were associated with nearly all (20,843; 94%) transcript promoters in C. reinhardtii (Fig. 3b). These five types differ mainly by progressive addition of modifications, ranging from the Type I promoter, depleted of any mark, to the Type V promoter, with all four marks (H3K4me3, H3K27ac, H3K9me3 and H3K36me3) and the presence of RNAPII. Such a progressive addition pattern is distinct from the combinatorial pattern observed in Arabidopsis26 (Supplementary Fig. 6). In C. reinhardtii, the majority of promoters carry all three active modifications (H3K4me3, H3K9me3 and H3K36me3), whereas A. thaliana promoters are found equally often with either none or all three marks.
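The promoter totals quoted above can be cross-checked against the per-type transcript counts listed in Fig. 3b (Type I, 3,811; Type II, 1,062; Type III, 3,166; Type IV, 9,710; Type V, 3,094). The short sketch below does that arithmetic. Taking the 22,209 assembled transcripts as the denominator for the 94% figure is an assumption made here, and the variable names are illustrative.

```python
# Transcript counts per promoter type, as listed in Fig. 3b.
promoter_counts = {
    "Type I": 3811,
    "Type II": 1062,
    "Type III": 3166,
    "Type IV": 9710,
    "Type V": 3094,
}

# Assumed denominator: all transcripts assembled in the study.
TOTAL_TRANSCRIPTS = 22209

typed = sum(promoter_counts.values())
share = typed / TOTAL_TRANSCRIPTS

for name, n in promoter_counts.items():
    print(f"{name}: {n:,} ({n / TOTAL_TRANSCRIPTS:.0%})")
print(f"All five types: {typed:,} ({share:.0%})")
```

Under that assumption the five types sum to the 20,843 promoters stated in the text, and the share rounds to the quoted 94%.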
This distinction between the two species suggests fundamental differences in the mechanism of interactions between histone-modifying enzymes and their target chromatin regions.

[Figure 1 graphic. a, Cell density (×10^6 ml−1) and lipid level (RFU per cell) from 0 h to 48 h for TAP (control), N-starvation and S-starvation cultures, starting from C. reinhardtii in log phase in TAP; sampling time points at 0 h, 10 min, 30 min, 1 h, 2 h, 6 h, 8 h, 24 h and 48 h, labelled as RNA or RNA + chromatin. b, Chromatin state tracks (H3K27ac, H3K4me3, H3K9me3, H3K36me3, RNAPII) around the TSS under TAP, N− and S− conditions, and gene expression (FPKM) against time (h) under N- and S-starvation, for an inactive → active state change (for example LCR1) and an active → inactive state change (for example Cre02.g110500).]

Figure 1 | An integrative epigenetic and transcriptomic strategy to identify lipid regulators in C. reinhardtii. a, C. reinhardtii cells in log phase were subjected to acute nitrogen (N−) and sulphur (S−) depletion for 48 h. Cell growth and lipid accumulation were measured to confirm the effect of nutrient starvation (n = 3, error bars indicate s.d.). RNA expression, histone modifications and RNAPII occupancy were profiled. Cells were harvested from two independent cultures at each time point as biological replicates. b, Genes whose TSSs display inactive (left) and active (right) chromatin state changes were selected to evaluate their temporal RNA expression patterns. FPKM, fragments per kilobase of transcript per million mapped reads.

Transcript abundance levels are highly correlated with the chromatin state of their respective promoters. Each consecutive class is associated with a significant increase in expression (P < 1.7 × 10−9 in all cases, Wilcoxon rank sum test). Quantitative increases are most pronounced in Type IV vs. I–III (8.6-fold, P < 2.2 × 10−16) and V vs. IV (2.4-fold, P < 2.2 × 10−16), characterized by the addition of H3K36me3 and RNAPII, respectively. For all following analyses, Types IV and V were considered transcriptionally active in C. reinhardtii cells (Fig. 3b).

[Figure 2 graphic. a, Pairwise co-occurrence matrix (% overlap) for H3K4me3, H3K27ac, H3K9me3, H3K36me3, RNAPII and H3K27me3. b, Table of chromatin states CS 1 to CS 16 with columns for the marks present (H3K27ac, H3K4me3, H3K9me3, H3K36me3, RNAPII, H3K27me3), number of transcripts, genomic location (intergenic, intragenic, 3′ gene, promoter, promoter 5′ gene), proposed function in C. reinhardtii (for example repressed, bivalent, transcribed regions, closed chromatin) and vertebrate comparison (conserved or algal). c, Chromatin state 2, co-modified domain: H3K4me3, H3K27me3 and H3K36me3 tracks with gene models for C. reinhardtii and H. sapiens (Bmp2), and RNA expression (FPKM+1, log10) of co-modified versus non-co-modified genes (P < 2.2 × 10−16). d, Chromatin state 15, putative enhancer: normalized ChIP signal for H3K4me2, H3K27ac, H3K4me3 and RNAPII against distance from the centre (kb) of state 15 regions, and a genome browser view with H3K27ac, H3K4me2 and input tracks over gene models g918.t1, g919.t1, g919.t2 and g920.t1 (amino acid permease).]

Figure 2 | Chromatin state analysis reveals unique signatures in C. reinhardtii. a, Pairwise mark co-occurrence in C. reinhardtii. Overlap is defined as the ratio of co-occupied regions between the row and column marks over the number of the row mark's regions. b, Characterization of predicted chromatin states. c, Co-modified domains in C. reinhardtii and Homo sapiens (ENCODE data) are shown. Co-modified chromatin state-associated transcripts are expressed at a significantly (Wilcoxon rank-sum test) lower level (right). d, Putative enhancer state. H3K4me2 enrichment at distal H3K27ac-marked regions (left). A putative enhancer region (chr_1:5,750,235-5,751,235) is shown (right).

Despite these overall highly significant correlations, within each promoter class a wide range of transcript levels was observed. This variation may result from a combination of inherently different transcription rates from individual promoters ('weak' vs. 'strong' promoters) and from differences in post-transcriptional RNA stability, highlighting the limitations of transcriptome data alone for inferring the transcriptional status of individual genes. Taken together, this analysis of the chromatin landscape of C. reinhardtii suggests that these promoter chromatin state assignments may provide a substrate for sensitive and accurate identification of regulatory genes such as transcription factors responding to changes in lipid metabolism.
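The comparison of expression between promoter classes described in this section can be sketched as follows. The FPKM values are hypothetical, the fold change is taken here as a ratio of medians (the text does not state how the reported fold changes were computed), and the rank-sum p-value uses a one-sided normal approximation without tie or continuity correction, which may differ from the test settings used in the study.

```python
import math
from statistics import median


def median_fold_change(higher, lower):
    """Ratio of medians between two expression samples (assumed summary)."""
    return median(higher) / median(lower)


def rank_sum_p_greater(sample_a, sample_b):
    """One-sided rank-sum p-value that sample_a tends to exceed sample_b.

    Uses the Mann-Whitney U statistic with a normal approximation and no
    tie or continuity correction, so it is only a rough guide for tiny samples.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # U counts pairs where a > b, with ties contributing one half.
    u = sum((a > b) + 0.5 * (a == b) for a in sample_a for b in sample_b)
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    # Upper-tail probability of a standard normal variable.
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical FPKM values for two consecutive promoter classes.
type_iii = [0.2, 0.5, 1.0, 1.5, 2.0]
type_iv = [4.0, 6.0, 8.0, 12.0, 20.0]

fold = median_fold_change(type_iv, type_iii)
p_value = rank_sum_p_greater(type_iv, type_iii)
print(f"median fold change = {fold:.1f}, one-sided p = {p_value:.4f}")
```

A rank-based test suits this comparison because expression values within each class span a wide range, as noted above, so a few highly expressed transcripts would otherwise dominate a comparison of means.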

[Figure 3 graphic. a, Heat maps of H3K27ac, H3K4me3, H3K9me3, H3K36me3, RNAPII and H3K27me3 signal across gene bodies (5′ to 3′) with 2 kb flanking regions, shown alongside expression (log2 FPKM). b, Promoter types by chromatin state and number of transcripts: Type I (CS 16), 3,811; Type II (CS 14), 1,062; Type III (CS 13), 3,166; Type IV (CS 8+9), 9,710; Type V (CS 7+11), 3,094. Rows indicate the presence of H3K27ac, H3K4me3, H3K9me3, H3K36me3 and RNAPII for each type, with P-value annotations for expression differences between types.]