Mapping RNA exit channel on transcribing RNA polymerase II by ...

Mapping RNA exit channel on transcribing RNA polymerase II by FRET analysis

Chin-Yu Chen^a,1, Chia-Chi Chang^a,1, Chi-Fu Yen^a,1, Michael T.-K. Chiu^a, and Wei-Hau Chang^a,b,2

^a Institute of Chemistry and ^b Genomic Research Center, Academia Sinica, Taipei 115, Taiwan

Communicated by Roger D. Kornberg, Stanford University School of Medicine, Stanford, CA, November 18, 2008 (received for review March 26, 2008)

nanometry | structure | transcription | in-gel | single-molecule fluorescence

RNA polymerase II (pol II), a protein complex containing 12 subunits, Rpb1–Rpb12, with a total mass of ≈500 kDa and a size of ≈100–140 Å, is the enzyme machinery synthesizing mRNA in all eukaryotes (1). X-ray studies of pol II complexes (2–4) led to an atomic model containing structural elements with functional implications (Fig. 1A). In a transcribing pol II, between the “clamp” and “jaw” domains, lies a cleft (4) that harbors the active center, a straight duplex DNA, and an RNA–DNA hybrid (positions +1 to −8, +1 denoting the nucleotide addition site). The strand separation of RNA from the DNA template occurs upstream of the hybrid at positions −9 and −10, facilitated by a set of protein loops including the “lid” domain as a driving wedge. Nascent RNA moves through an exit pore from the active center, crossing a saddle-like surface, beneath an “arch” bridging the clamp and wall (5). How does pol II instruct the nascent RNA to exit beyond the saddle? Is there a unique path on pol II connecting the active center to its exterior that nascent RNA may follow?

To date, insights into the RNA exit have come from analysis of pol II surface charge distribution: two positively charged grooves, on either side of the “dock domain” (Fig. 1A), can accommodate ssRNA (5). One groove, putatively referred to as “exit channel 1,” runs around the base of the clamp, leading toward the stalk of subcomplex Rpb4–Rpb7, which can bind RNA via its ribonucleoprotein fold (6, 7). The other groove, termed “exit channel 2,” runs down the back side of pol II, through Rpb3 and Rpb11, leading toward Rpb8, a subunit equally competent in RNA binding by its single-strand nucleic acid-binding motif. Intriguingly, exit channel 1 would cause the RNA to bend sharply, implying that channel 2 is energetically favored for RNA binding. Yet, evidence in support of the channel 1 hypothesis has come from observations of the nascent RNA cross-linking to the Rpb7 subunit of pol II (8).
The channel 1 hypothesis is tantalizing, given that the Rpb4–Rpb7 subcomplex is dispensable for RNA synthesis in yeast (9). Attempts to identify the exit channel by X-ray studies of the pol

www.pnas.org/cgi/doi/10.1073/pnas.0811689106

Fig. 1. Pol II elongation complex. (A) Surface representation of a pol II elongation complex (PDB ID code 1Y1W) with structural elements highlighted: clamp in wheat, wall in violet, lid in green, rudder in yellow, fork loop 1 in orange, jaw in purple, and dock domain in aquamarine. Two putative RNA exit channels are indicated by red dashes, labeled 1 and 2. (B) The red star denotes a Cy5 on 10-nt RNA (GE2), situated next to the saddle. Green beads denote the Cy3 near the C terminus of Rpb4 (light pink) or that of Rpb3 (cyan). The Rpb4–GE2 distance is 82 Å and the Rpb3–GE2 distance is 65 Å. (Inset) Cy3-CaM (orange) bound to CBP (light gray) extended from the C terminus of Rpb3 or Rpb4 in the presence of Ca2+ ions (yellow). (C) A design of the triplet used for in-gel FRET. (Upper) Mixture of labeled and unlabeled pol II elongation complexes. The Rpb3 or Rpb4 subunit is highlighted in pink; CaM is an orange dumbbell; Cy3 is green, and Cy5 is red. (Left) Pol II elongation complex with RNA unlabeled, a mixture of Cy3-CaM bound or unbound in the upper band, free Cy3-CaM in the lower band. (Middle) Pol II elongation complexes labeled with Cy5-RNA and Cy3-CaM. (Right) Same as Middle except that unlabeled CaM is used. (D) Immobilized single molecules of pol II elongation complexes on a coated slide. (Left) Donor only, labeled with Cy3-CaM. (Middle) Donor and acceptor, with Cy3-CaM and Cy5-RNA. (Right) Acceptor only, with Cy5-RNA. A and B were prepared by PyMOL (50) (www.pymol.org).

II elongation complex failed to detect RNA longer than 10 nt (10). An alternative approach is thus required to address this issue. Fluorescence resonance energy transfer (FRET) is a spectral ruler (11) to gauge distance between 1 and 10 nm (12),

Author contributions: W.-H.C. designed research; C.-Y.C., C.-C.C., and C.-F.Y. performed research; C.-Y.C., C.-C.C., C.-F.Y., M.T.-K.C., and W.-H.C. analyzed data; and C.-Y.C. and W.-H.C. wrote the paper.

The authors declare no conflict of interest.

Freely available online through the PNAS open access option.

^1 C.-Y.C., C.-C.C., and C.-F.Y. contributed equally to this work.

^2 To whom correspondence should be addressed at: Institute of Chemistry, Academia Sinica, 128 Academia Road, Section 2, Nankang, Taipei 115, Taiwan. E-mail: weihau@chem.sinica.edu.tw or [email protected].

This article contains supporting information online at www.pnas.org/cgi/content/full/0811689106/DCSupplemental.

© 2008 by The National Academy of Sciences of the USA

PNAS | January 6, 2009 | vol. 106 | no. 1 | 127–132

BIOPHYSICS

A simple genetic tag-based labeling method that permits specific attachment of a fluorescence probe near the C terminus of virtually any subunit of a protein complex is implemented. Its immediate application to yeast RNA polymerase II (pol II) enables us to test various hypotheses of RNA exit channel by using fluorescence resonance energy transfer (FRET) analysis. The donor dye is labeled on a site near subunit Rpb3 or Rpb4, and the acceptor dye is attached to the 5′ end of RNA transcript in the pol II elongation complex. Both in-gel and single-molecule FRET analysis show that the growing RNA is leading toward Rpb4, not Rpb3, supporting the notion that RNA exits through the proposed channel 1. Distance constraints derived from our FRET results, in conjunction with triangulation, reveal the exit track of RNA transcript on core pol II by identifying amino acids in the vicinity of the 5′ end of RNA and show that the extending RNA forms contacts with the Rpb7 subunit. The significance of RNA exit route in promoter escape and that in cotranscriptional mRNA processing is discussed.

ideal for mapping pairs of probes on a complex (13, 14) as large as pol II. Indeed, FRET analysis of Escherichia coli RNA polymerase, a counterpart of pol II in bacteria, pioneered by Ebright and coworkers (15–18), revealed the spatial organization of the promoter complex, retention of σ70, and a DNA-scrunching mechanism at initiation. Those FRET studies of E. coli RNA polymerase were facilitated by assembling the enzyme complex from its individual subunits, which could be specifically dye-labeled before reconstitution. A similar FRET approach to pol II has been impeded by the lack of a reconstitution system, except that the dissociation of Rpb4–Rpb7 from core pol II can be exploited (9).

Here, we introduce a simple scheme for specifically labeling virtually any subunit in a TAP-tagged (tandem affinity purification) protein complex (19, 20). Briefly, Cy3-conjugated calmodulin (CaM) is used to position a Cy3 dye near the C terminus of a TAP-tagged pol II subunit by its binding to the CaM-binding peptide (CBP) on the subunit (Fig. 1B). With a Cy5 dye attached to the 5′ end of RNA, our scheme allows us to test the hypothesis about the RNA exit channel on pol II by FRET analysis. If channel 1 is preferred, we would expect an increase in FRET efficiency between Cy3 near the C terminus of Rpb4 and Cy5 on the 5′ end of RNA as the RNA extends. Conversely, should channel 2 be preferred, there would be an increase in FRET efficiency between Cy3 near the C terminus of Rpb3 and Cy5 on the 5′ end of RNA with extension of the RNA.

In the present work, two independent FRET measurements are performed: “in-gel FRET” (Fig. 1C) (13, 14) and “single-molecule FRET” (Fig. 1D) (21, 22). The former is a bulk measurement, facilitated by separation of pol II from unbound Cy3-CaM or Cy5-RNA in a native gel, whereas the latter allows real-time recording of the “double-labeled” complex to reveal dynamics and distributions.
For simplicity, we employ the following notation to denote the sites of labeling and the corresponding FRET efficiency measurement in the subsequent text. For instance, “FRET efficiency between Cy3 near the C terminus of Rpb4 subunit of pol II and Cy5 attached at the 5′ end of GE2 RNA (10 nt)” is referred to as “FRET of Rpb4–GE2.”

Results

Activity of Labeled Pol II Elongation Complexes. In this work, pol II elongation complexes are obtained by assembling 12-subunit pol II [supporting information (SI) Fig. S1A] with a nucleic acid scaffold, to mimic complexes at discrete points along the trajectory of elongation. The hybridizing region of RNA with the template DNA is kept to 8 bp (10), whereas the 5′ nonhybridized growing end is 2, 9, or 18 nt long, termed GE2, GE9, and GE18, respectively (GE denotes “growing end”). RNA, together with template DNA, forms a stable complex with pol II (5, 10). The maximum number of nucleotides in the upstream nonhybridizing RNA was chosen such that it may fully span the channel. [Note that each candidate channel measures ≈45 Å from the saddle to its outlet on core pol II (10 subunits), and hence both channels can accommodate 13–16 nt (5).] An in vitro transcription assay was conducted to ensure that pol II elongation complexes retain their ability to extend RNA, either with Cy5 attached to the 5′ end of RNA (Fig. S1B) or in the presence of a few millimolar calcium ions (Fig. S1C). It was important to ensure that calcium ions, required for CaM–CBP interactions, did not interfere with pol II elongation because such ions abolish transcription activity in a related system (23). Based on the unaltered function of the pol II elongation complexes, we believe that the labeling used in our experiments does not cause significant structural perturbations of the complexes.

In-Gel FRET Analysis. Bulk FRET measurements in solution have met practical obstacles. The sensitivity of fluorescence spectrometers is typically in the ≈micromolar concentration range, implying that ≈milligrams of fluorescence-labeled pol II are

Fig. 2. In-gel FRET efficiencies as a function of the length of RNA. (A) Gel image of triplets scanned in the Cy3 channel. (Upper) From Cy3-CaM on the Rpb3 subunit in the pol II elongation complex. (Lower) From unbound Cy3-CaM. Lanes 1–3, 4–6, and 7–9 represent three triplets of elongation complexes containing GE2 (10 nt), GE9 (17 nt), and GE18 (26 nt), respectively. (B) X–Y plot of FRET efficiencies between Cy3 on Rpb3 and Cy5 on RNA, extracted from replica images of A, as a function of the length of RNA. ■, Rpb3_1: Cy3-CaM labeling on pol II is 10% and RNA binding (fB) is 40%. ●, Rpb3_2: Cy3 labeling is 30% and RNA binding (fB) is 40%. (C) Same as in A except that Cy3-CaM is on Rpb4. (D) Extracted from replicas of C. ■, Rpb4_1: Cy3 labeling was 5% and RNA binding (fB) was 55%. ●, Rpb4_2: Cy3 labeling was 12% and RNA binding (fB) was 40%.
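The in-gel efficiencies plotted in Fig. 2 derive from band intensities through Eqs. 1–4 of Materials and Methods. The following is a minimal Python sketch of that data reduction, not the authors' software. The function names, the band intensities, and the value f_OA = 0.9 are hypothetical and chosen only for illustration; the formulas themselves are those stated in the text.

```python
def raw_efficiency(i_donor_only, i_donor_acceptor):
    """Eq. 1: raw transfer efficiency from donor quenching, E_R = (I_D - I_DA) / I_D."""
    if i_donor_only <= 0:
        raise ValueError("donor-only intensity must be positive")
    return (i_donor_only - i_donor_acceptor) / i_donor_only


def rna_binding_fraction(cy5_upper, cy5_lower):
    """Eq. 3: fraction of pol II carrying RNA, f_B = I_upper / (I_upper + I_lower)."""
    return cy5_upper / (cy5_upper + cy5_lower)


def authentic_efficiency(e_raw, f_b, f_oa):
    """Eqs. 2 and 4: E_A = E_R / f, with f = f_B * f_OA."""
    return e_raw / (f_b * f_oa)


# Hypothetical band intensities (arbitrary units), for illustration only.
e_r = raw_efficiency(i_donor_only=1000.0, i_donor_acceptor=900.0)
f_b = rna_binding_fraction(cy5_upper=400.0, cy5_lower=600.0)
e_a = authentic_efficiency(e_r, f_b, f_oa=0.9)  # f_OA = 0.9 is an assumed value
print(f"E_R = {e_r:.3f}, f_B = {f_b:.2f}, E_A = {e_a:.3f}")
```

Because only a fraction f of the donor-labeled complexes carries an optically active acceptor, the raw quenching underestimates the transfer efficiency, which is why E_A is always at least as large as E_R.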

required just to generate a data point. We therefore resorted to an alternative bulk method by using a native gel, which requires less material. In a native gel, later referred to as in-gel, pol II complexes can be separated from the unbound Cy3-CaM (≈20 kDa) in the low-molecular mass region (Fig. 2 A and C). The upper band corresponds to a mixture of labeled and unlabeled pol II complexes. [This upper band appeared as a singlet or a doublet, depending on the phosphorylation state of the CTD of Rpb1 in pol II (24–26).] fB and fOA were measured and summarized (see Gel Quantitation in SI Materials and Methods and Tables S1 and S2).

In-gel FRET efficiencies between a pol II subunit and RNA of various lengths, GE2, GE9, and GE18, were compared. Raw FRET efficiencies were measured as described in Materials and Methods (also in SI Materials and Methods). As the RNA extended from GE2 to GE18, an increase in raw FRET between Rpb4 and RNA was observed repeatedly (Fig. 2D), immediately suggesting that nascent RNA is leading toward Rpb4–Rpb7. By contrast, no gain in raw FRET was observed between Rpb3 and RNA as the length of the RNA was extended (Fig. 2B). Average authentic in-gel FRET efficiencies were calculated by using measured fB and fOA and converted into distances (Table 1). To test whether those distances were plausible, in-gel FRET efficiencies between Cy3-DNA and Cy5-RNA of various lengths were measured (Fig. S2) and found to be consistent with the triangulation results based on Rpb3–RNA and Rpb4–RNA distances (Table 1 and Table S2).

Single-Molecule FRET Analysis. Even though in-gel FRET provided reliable distances for tracking nascent RNA in the pol II

Table 1. Results from average in-gel FRET and single-molecule FRET (smFRET) measurements

Cy3 site | RNA | X-ray model (1Y1W) | In-gel FRET | Distance in-gel FRET, Å | Single-molecule FRET | Distance smFRET, Å
Rpb3 | GE2 | 64 Å* | 0.34, 1.11 R0 | 64 | 0.40 (28%), 1.07 R0; 0.49 (72%), 1.01 R0 | 64; 60
Rpb3 | GE9 | NA | 0.28, 1.17 R0 | 68 | 0.32 (83%), 1.13 R0; 0.42 (17%), 1.06 R0 | 68; 63
Rpb3 | GE18 | NA | 0.16, 1.32 R0 | 77 | 0.18 (67%), 1.29 R0; 0.51 (33%), 0.99 R0 | 77; 59
Rpb4 | GE2 | 82 Å* | 0.18, 1.29 R0 | 80 | 0.17 (>95%), 1.3 R0 | 78
Rpb4 | GE9 | NA | 0.39, 1.08 R0 | 67 | 0.18 (9%), 1.29 R0; 0.30 (91%), 1.15 R0 | 77; 69
Rpb4 | GE18 | NA | 0.78, 0.81 R0 | 50 | 0.29 (13%), 1.16 R0; 0.63 (87%), 0.92 R0 | 70; 55
DNA | GE2 | 50 Å* | 0.51, 0.99 R0 | 50 | 0.44, 1.05 R0 | 50
DNA | GE9 | NA | 0.38, 1.08 R0 | 55 (60 ± 5)† | 0.34, 1.13 R0 | 54 (60 ± 5)†
DNA | GE18 | NA | 0.16, 1.32 R0 | 67 (65 ± 5)‡ | Close to noise | NA (65 ± 5)‡
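The distances in Table 1 follow from the FRET efficiencies through Eq. 6 of Materials and Methods, R = R0[(1 − E)/E]^(1/6). The sketch below, in Python, applies that relation to some of the major single-molecule peaks, using the R0 of 60 Å that the paper states for Rpb3 and Rpb4 in single-molecule FRET. The function name and the dictionary are illustrative assumptions, not part of the original analysis.

```python
def fret_distance(efficiency, r0):
    """Eq. 6: donor-acceptor distance R = R0 * ((1 - E) / E) ** (1/6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * ((1.0 - efficiency) / efficiency) ** (1.0 / 6.0)


R0_SM = 60.0  # Angstrom; smFRET Forster distance quoted for Rpb3 and Rpb4

# Major-population peak efficiencies taken from Table 1.
major_peaks = {
    "Rpb3-GE2": 0.49,
    "Rpb4-GE2": 0.17,
    "Rpb3-GE9": 0.32,
    "Rpb4-GE9": 0.30,
}

for pair, e in major_peaks.items():
    print(f"{pair}: E = {e:.2f} -> R = {fret_distance(e, R0_SM):.1f} A")
```

The computed values land within about 1 Å of the tabulated distances. The sixth-root dependence also means that the relation is most sensitive near E = 0.5, where R equals R0, and that very low efficiencies (such as DNA–GE18) translate into poorly constrained distances.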

elongation complex, replica FRET experiments were performed with the single-molecule method for cross-validation. In the immobilized single-molecule scheme, dual-labeled complexes are selected; thus, the information of dye-labeling efficiency, a bulk quantity, is dispensable. Single-molecule time traces of raw fluorescence intensities and calculated FRET efficiencies are shown in Fig. S3, and constructed histograms are in Fig. 3. For Rpb3–GE2, the FRET histogram can be fitted to two Gaussian distributions, centering at 0.49 (Fig. 3A P5) and 0.40 (Fig. 3A P4), respectively. As the RNA extends to GE9, the FRET histogram shifts toward the low-FRET regime with the major distribution centering at 0.32 (Fig. 3A P2), indicating that the distance between Rpb3 and GE9 is longer than that between Rpb3 and GE2. For Rpb4–GE2, the FRET histogram shows a major distribution centering at 0.17 (Fig. 3B P1). As the RNA extends to GE9, the FRET histogram shifts toward the high-FRET regime, and it can be fitted to two Gaussian distributions with the major one centering at 0.3 (Fig. 3B P3), indicating that Rpb4 is closer to GE9 than to GE2. As the RNA extends further to GE18, changes of FRET values follow the same trend, while broadening in the distributions is observed, and minor populations of “anomalous” FRET emerge: high for Rpb3 (Fig. 3A P6) and low for Rpb4 (Fig. 3B P4). The peak FRET values of the major single-molecule populations are summarized in Table 1 and agree well with the corresponding in-gel FRET efficiencies: virtually indistinguishable distances, within 5-Å errors, are generated from either single-molecule or in-gel data (Table 1). Thus, single-molecule FRET data also support that the majority of nascent RNA molecules, if not all, exit through channel 1 on pol II.

Structural Mapping of RNA Exit Based on Single-Molecule FRET. RNA GE2 (10 Nucleotides). By using single-molecule FRET efficiencies,

0.49 for Rpb3–GE2 (Fig. 3A P5) and 0.17 for Rpb4–GE2 (Fig. 3B P1) and a Förster distance R0 ≈ 60 Å for Cy3–Cy5 (22, 27–29), distances of 61 Å (1.01 R0) and 78 Å (1.30 R0) are obtained for Rpb3–GE2 and Rpb4–GE2, respectively (Table 1), agreeing well with 64 Å and 82 Å, the corresponding distances in the crystal structure (10). An additional single-molecule FRET between Cy5–GE2 and Cy3–DNA, termed DNA-GE2, is measured, and a shorter Förster distance R0 ≈ 48 Å for Cy3–Cy5 is required to fit with the corresponding distance (≈50 Å) in the crystal structure (Fig. S4). Interestingly, such a reduction of Förster distance has also been observed in in-gel data of DNA-GE2 (Table 1).

Localization of the 5′ End of the RNA GE9 (17 Nucleotides). Assuming that a Förster distance R0 ≈ 60 Å is applicable, distances of 68 Å and 69 Å are obtained for Rpb3–GE9 and Rpb4–GE9, respectively, and are summarized in Table 1. By allowing a ±5 Å error, a unique site defined by a set of amino acids residing in the presumed exit channel 1 on core pol II is found by triangulation (Table S3, and salmon sphere in Fig. 4). The position of GE9, predicted based on the triangulation of Rpb3–GE9 and Rpb4–GE9, spans a distance of 60 ± 5 Å to the Cy3 site of DNA (Cy3 dye attached between G11 and T12), consistent with the distance calculated from the single-molecule FRET data of DNA-GE9 with Förster distance R0 ≈ 48 Å (Table 1 and Fig. S4). The distance from the 5′ end of GE2 (10 nt) to that of GE9 (17 nt) is determined to be 25–30 Å. As expected, this span is capable of accommodating 7–11 nt. Henceforth, we refer to exit channel 1 as the exit channel.

GE18 (26 Nucleotides) and RNA Dynamics. By using single-molecule FRET values, 0.18 for Rpb3–GE18 (Fig. 3A P1) and 0.62 for Rpb4–GE18 (Fig. 3B P5), distances of 77 Å and 55 Å are obtained, respectively (Table 1). Triangulation with these distances identifies a site on the ribonucleoprotein-binding domain of Rpb7 (Table S3) (6, 10), shown as an orange sphere (Fig. 4). The distance between the 5′ end of GE18 (26 nt) and the Cy3 site of DNA (Cy3 dye attached between G11 and T12) is predicted to be 65 ± 5 Å, resulting in low FRET efficiencies that are challenging to detect with our single-molecule instrument (Table 1). The finding that GE18 (26 nt) contacts Rpb7 lines up with the previous study of the 5′ end of nascent RNA of 23–29 nt cross-linking to Rpb7 (8). The trajectory from the exit pore to the Rpb7 site deviates slightly from that of the exit channel, which would produce an energy penalty that could be compensated by RNA interacting with the ribonucleoprotein-binding domain. Interestingly, as the RNA extends to GE18 (26 nt), the distribution in the FRET histogram exhibits a broadening (Fig. 3 A and B), regardless of the subunit on which Cy3 is placed. Such broadening, originating from fluctuations in the time traces (Fig. S3 C and G), can be a signature of RNA flexibility because of its dislodging from pol II.

Fig. 3. Single-molecule FRET histograms. (A) Reconstructed from many leakage-QE-corrected time traces of FRET efficiencies, between Cy3-Rpb3 and Cy5 attached to the 5′ end of RNA of various lengths: GE2 (10 nt), GE9 (17 nt), and GE18 (26 nt). (B) Same as A except that Cy3-CaM is on Rpb4.

Table 1 notes: R0 of in-gel FRET: 58 Å (Rpb3); 62 Å (Rpb4); 50 Å (DNA). R0 of smFRET: 60 Å (Rpb3, Rpb4); 48 Å (DNA). *The distance is measured from the X-ray model (1Y1W). †Predicted by triangulation based on Rpb3–GE9 and Rpb4–GE9. ‡Predicted by triangulation based on Rpb3–GE18 and Rpb4–GE18.

Fig. 4. Locations of the 5′ end of RNA. The 5′ end of GE9 (17 nt) is in salmon, next to the dock domain in aquamarine; the 5′ end of GE18 (26 nt) is in orange, on the Rpb7 ribonucleoprotein-binding domain in pink; also the 5′ end of GE2 (10 nt) is in red, the C terminus of Rpb3 in cyan, and that of Rpb4 in violet. The figure was generated by PyMOL (50) (www.pymol.org) and the program O (51).

Discussion

Principal Findings. The RNA exit channel has been a hypothetical

entity in the pol II elongation complex structure, on which the existence of two charged grooves leads to different models of the RNA exit pathway. In this work, by introducing a simple method to label the pol II subunit, we test the models by measuring the FRET efficiencies between a donor on a subunit and an acceptor on the 5′ end of RNA. Observations of markedly different trends of change of FRET efficiencies vs. RNA length in a native gel prove that the RNA exit channel leads toward Rpb4–Rpb7, not Rpb3–Rpb11. Quantitative FRET analysis shows that in-gel data are both self-consistent and in agreement with single-molecule data. By identifying amino acids in the vicinity of the 5′ end of RNA (Table S3), using distances calculated from single-molecule data and triangulation, we map the track of nascent RNA on pol II, which bends ≈90° from the direction of the DNA template. Such a bend presents a remarkable structural feature that can prevent nascent RNA from meeting with the DNA template. The location of the 17-nt RNA enables us to predict which nucleotide would be the last before nascent RNA extrudes from the body of core pol II. The distance from the 17-nt RNA to the outlet of the channel is measured to be ≈15–20 Å, capable of accommodating 4–6 nt (5, 10), suggesting that the last nucleotide must lie between 21 and 23 nt, and RNA longer than 23 nt must extrude into the exterior of core pol II. Such a picture is remarkably consistent with a previous observation that the 5′ end of nascent RNA could cross-link to Rpb1 when RNA was shorter than 21 nt (8). As RNA extends to 26 nt, its 5′ end could contact a site within the ribonucleoprotein-binding domain in Rpb7 (Fig. 4), which also confirms the cross-linking study (8).

Remarks on FRET-Based Structural Biology. Our study represents a case where a judicious choice of sites for a FRET pair is crucial for generating interpretable FRET data for “molecular nanometry,” to complement the high-resolution study of protein complexes. In the literature, single-molecule FRET has been mostly restricted to studies revealing molecular heterogeneity and/or dynamics. In this regard, our work serves as a milestone in the application of single-molecule FRET to structural biology of protein complexes. In this study, the awesome power of the C terminus labeling scheme has not been fully harnessed. In principle, generation of a dozen FRET distances to a site on pol II, because there are a dozen subunits in pol II, can help solve the structure problem in an overdetermined fashion. Although both in-gel FRET and single-molecule FRET provide equally good information in our case, the single-molecule approach is preferred for practical reasons. First, the single-molecule method requires much less material, so it may be applicable to scarce eukaryotic complexes. Second, the conversion of raw in-gel FRET data to distances is laborious because it requires accurate characterization of binding efficiencies and optical properties of the reagents (Table S2). Nevertheless, one caution about our single-molecule experiments is that Cy3-CaM may fall off pol II, because the measurements were carried out at picomolar concentration whereas the affinity between CaM and CBP is ≈nanomolar.

Biological Significance. In the RNA exit channel, the interactions between nascent RNA and pol II in the region between 10 and 17 nt are expected to be very strong (30), for it is known that such interactions contribute to the stability of an elongation complex in prokaryotic RNA polymerase. Recent structural studies on the complexes formed by TFIIB with pol II have shown that the

Reconciliation. While our article was in revision, an independent single-molecule FRET study of a pol II elongation complex was published by Michaelis and coworkers (38), who labeled the dissociable subcomplex Rpb4–Rpb7 and reconstituted 12-subunit pol II. Both studies support that the nascent RNA exits from pol II through channel 1. However, there seems to be a discrepancy as to where 26-nt RNA could contact. Contrary to our observation that the 5′ end of 26-nt RNA contacts Rpb7 in most elongation complexes, Michaelis and coworkers have suggested that the 5′ end of 26-nt RNA can occupy the “dock domain” on pol II (38). Intriguingly, our single-molecule data of 26-nt RNA reveal minor populations of anomalous FRET for Rpb3–GE18 (Fig. 3A P6) and Rpb4–GE18 (Fig. 3B P4) as well. Triangulation based on these anomalous distances predicts that the 5′ end of 26-nt RNA can reside on the dock domain, as suggested by the Michaelis group. By Boltzmann statistics, we estimate that there is an energy gap, ≈1–2 kBT per complex, between the Rpb7-contacting complex and the dock domain-contacting complex. Such a gap could be partially accounted for by the bending of nucleic acids. As shown in the model of RNA exit (Fig. 4), RNA bent toward the dock domain-contacting position requires larger angles than RNA bent toward Rpb7 and thus is energetically unfavorable, because the physics of bending RNA demands an energy of kBT(Δθ)²(ξ/2L), where ξ is the persistence length of ssRNA, ≈1–1.4 nm, and L is the contour length of the ssRNA (39, 40). Of course, interactions of RNA–Rpb7 and those between RNA and dock domain must be taken into account to complete the analysis. We speculate that systematic variations in experimental conditions may influence the behavior of RNA in the pol II elongation complex. For instance, certain cations (ammonium, zinc, magnesium, and calcium) are either present or absent in the two studies. Some of these ions are known to play roles in fine-tuning the conformation of nucleic acids and/or proteins.
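The Boltzmann estimate and the bending-energy expression quoted in this section can be illustrated numerically. The Python sketch below assumes that the major and minor populations listed in Table 1 for GE18 reflect equilibrium occupancies of two states, which is an assumption made here for illustration and not a claim of the paper. The function names, the 90° bend angle, and the 5-nm contour length are hypothetical example inputs.

```python
import math


def energy_gap_kbt(p_major, p_minor):
    """Two-state Boltzmann gap in units of kBT: dE / kBT = ln(p_major / p_minor)."""
    return math.log(p_major / p_minor)


def bending_energy_kbt(delta_theta_rad, persistence_nm, contour_nm):
    """Bending energy in units of kBT: (delta_theta)^2 * xi / (2 L)."""
    return delta_theta_rad ** 2 * persistence_nm / (2.0 * contour_nm)


# Populations of the GE18 histograms from Table 1 (major vs. anomalous peak).
gap_rpb4 = energy_gap_kbt(0.87, 0.13)
gap_rpb3 = energy_gap_kbt(0.67, 0.33)
print(f"Rpb4-GE18 populations: {gap_rpb4:.2f} kBT")
print(f"Rpb3-GE18 populations: {gap_rpb3:.2f} kBT")

# Hypothetical example: a 90-degree bend, xi = 1.2 nm, L = 5 nm (illustrative only).
print(f"bending cost: {bending_energy_kbt(math.pi / 2, 1.2, 5.0):.2f} kBT")
```

Under these assumptions the population ratios correspond to gaps of roughly 0.7 to 1.9 kBT, the same order as the ≈1–2 kBT quoted in the text, and the quadratic dependence on Δθ shows why a larger bend toward the dock domain costs more energy than a smaller bend toward Rpb7.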
Indeed, some can alter the elongation activity of pol II (41), and some can modulate the folding capacity of nucleic acids (42). The subtle effects of ions in the context of transcription merit further investigation, best pursued by single-molecule experiments to reveal the energy landscape.

Materials and Methods

Construction of Yeast Strains and Protein Purification. RNA polymerase II. TAP-tagged yeast strains (Saccharomyces cerevisiae) were generated according to standard procedures (6, 43). Yeast cells expressing TAP-tagged Rpb3 or Rpb4 were grown and fractionated as described (6, 43), except that a washing step with a high concentration of potassium chloride (6, 44) was included to deplete TFIIF from pol II before the elution by Tobacco Etch Virus enzyme cleavage.


CaM. To make a dye-CaM, a human CaM II (hCaM) was cloned into a pET22b vector (Novagen), in which the C terminus was His6-tagged, and the 3rd amino acid, aspartic acid, was mutated to a cysteine (45, 46) for maleimide-Cy3 conjugation. The cysteine mutant of the hCaM was overexpressed in E. coli strain BL21 (DE3), purified with a Ni–nitrilotriacetic acid affinity column (Qiagen), and labeled with maleimide-Cy3 according to a standard protocol (GE Healthcare).

In-gel FRET Measurements and Data Reduction. Pol II labeled with Cy3-CaM was separated from free Cy3-CaM based on molecular mass in a native gel made with gradient polyacrylamide (4–20%; Invitrogen). Likewise, DNA or RNA bound to pol II was separated from free DNA or RNA, which ran in the low-molecular mass band. Electrophoresis was performed in TBE buffer containing 3 mM Ca2+ at 120 V for 1.5 h. The wet gels were immediately scanned in a Typhoon 9400 scanner (GE Healthcare) equipped with a 532-nm laser for Cy3 excitation and a 580-nm emission filter to collect Cy3 fluorescence. The fluorescence intensities of the complex bands were integrated, and background was subtracted by using ImageQuant TL software. The accuracy of pipetting and loading among three lanes in the “triplet” (see SI Materials and Methods) was critical and was monitored by the fluorescence intensities of the unbound Cy3-CaM band in the low-molecular mass region: any gel containing a triplet in which the free Cy3-CaM in the left lane (donor only) and that in the middle lane (donor and acceptor) differed by >4% was discarded. The efficiency of energy transfer, E, was determined from the extent of Cy3 quenching by Cy5 in the double-labeled complexes compared with donor-only complexes. We determined the “raw” energy transfer efficiency ER in-gel (47) according to Eq. 1,

\[ E_R = \frac{I_D - I_{DA}}{I_D} \tag{1} \]

where ID and IDA were the intensities of the donor-only (no energy transfer) and the donor and acceptor complexes, respectively. ER, the raw energy transfer efficiency, was converted to EA, the ‘‘authentic’’ energy transfer efficiency, according to Eq. 2,

\[ E_A = \frac{E_R}{f} \tag{2} \]

where f is the effective acceptor-labeling efficiency (see Table S3). fB, the fraction of pol II that contains RNA, namely binding efficiency of RNA to pol II, was determined according to Cy5 intensities in the upper and the lower bands according to Eq. 3,

\[ f_B = \frac{I^{u}_{\mathrm{Cy5}}}{I^{u}_{\mathrm{Cy5}} + I^{l}_{\mathrm{Cy5}}} \tag{3} \]

where $I^{u}_{\mathrm{Cy5}}$ is the Cy5 signal appearing in the upper band, and $I^{l}_{\mathrm{Cy5}}$ is the Cy5 signal appearing in the lower band. The effective acceptor labeling efficiency f could be obtained according to Eq. 4,

\[ f = f_B \cdot f_{OA} \tag{4} \]

where fOA was the optically active fraction of RNA.

Single-Molecule FRET Measurements. A total internal reflection fluorescence microscope was built for single-molecule imaging (SI Materials and Methods) (46, 48) and quantum efficiency-calibrated (Fig. S5). Preformed elongation complexes consisting of Cy5-RNA, incubated with a 5-fold excess of Cy3-CaM and diluted to 10 pM in transcription reaction buffer containing an imaging mixture, were immobilized on the functionalized slide surface through the biotinylated DNA template. The imaging mixture contained 10 mM Tris acetate (pH 7.5), 0.4% glucose, 2 mM Trolox (Fluka), 0.1 g/mL glucose oxidase (Sigma), and 0.02 mg/mL catalase, to remove dissolved oxygen and reduce blinking (49). Data were obtained with an alternating excitation sequence: 633 nm on (5 s) off, 532 nm on (400 s) off, with 0.3- to 0.5-s exposure per frame. Measurements were performed at 25 °C. Candidate molecules that were Cy3–Cy5 dual-labeled were selected based on the Cy3 spot centroid colocalizing with Cy5 found through preexcitation in the first 10 frames. The simultaneous time trajectories were extracted from the same set of centroids and screened by checking for a concomitant increase in Cy3 signal with Cy5 decrease or a concomitant Cy5 signal decrease with Cy3 photobleaching. Jitter of the spot positions between frames was corrected by extracting intensities from new centroids in the vicinity of the centroids in the previous frame. Donor and acceptor signals were converted to FRET efficiency according to Eq. 5, where


N-terminal segment of TFIIB, termed the “B finger,” reaches into the catalytic center of pol II exactly through a groove (31–34) that was once presumed to be the RNA exit channel and is shown to be so in this study. It is thus evident that the B finger of TFIIB will run into the advancing RNA. If TFIIB overcomes the RNA, initiation will be aborted; if the RNA continues to advance, pol II will escape from the promoter. Comparison with E. coli transcription reveals a conserved strategy: the nascent RNA pushes away a protein linker between domains 3 and 4 of the σ factor, which preoccupies the RNA exit channel on RNA polymerase, so that the transition from initiation to elongation may occur (35, 36). As the RNA extends to 26 nt, it becomes more flexible yet continues to reach out for Rpb7. Why would RNA take the route via the subcomplex Rpb4–Rpb7? Perhaps Rpb4–Rpb7 serves as a scaffold to arrange a meeting between the nascent mRNA of ≈30 nt and its 5′ end-capping machinery (37). The latter is known to be recruited by the phosphorylated form of the CTD of the Rpb1 subunit, a domain residing underneath Rpb4–Rpb7, so that transcription and mRNA processing can be efficiently coupled.

IA and ID denoted the background-subtracted fluorescence intensities of the acceptor and the donor, respectively (β is the leakage of Cy3 signal into the Cy5 channel; γ is the ratio of quantum efficiencies of the two channels). Distances were calculated from FRET efficiencies according to Eq. 6. Data collection was conducted with Andor software, and subsequent processing was conducted with custom-written IDL programs (IDL 6.3; ITT).

$$E = \frac{I_A - \beta I_D}{I_A + \gamma I_D} \qquad [5]$$

$$R = R_0 \left( \frac{1 - E}{E} \right)^{1/6} \qquad [6]$$

Search of the Site in the Crystal Structure by Triangulation. Averaged authentic in-gel FRET or single-molecule FRET efficiencies at the peaks in the major distribution were used to calculate distances. The distance of Rpb3–GE2 (Rpb4–GE2) in the pol II elongation complex crystal structure [Protein Data Bank (PDB) ID code 1Y1W] was used to determine the Förster distance, from which the distances of Rpb3–GE9 (Rpb4–GE9) and Rpb3–GE18 (Rpb4–GE18) were derived. Triangulation was performed by searching for atoms in the PDB entry (1Y1W) (10) that satisfied the distances within a given error. Such a “closure-error triangulation” scheme selected atoms within the intersection of two shells, each defined by a vertex, the distance from that vertex, and the error in that distance. In the present work, vertex 1 was the last atom in the C terminus of Rpb3, and vertex 2 that of Rpb4.
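The shell-intersection search in the closure-error triangulation scheme can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, the coordinates are made up rather than taken from PDB entry 1Y1W, and reading atom coordinates from a PDB file is left out.

```python
import math


def in_shell(point, vertex, distance, error):
    """True if `point` lies within `error` of a sphere of radius `distance` around `vertex`."""
    return abs(math.dist(point, vertex) - distance) <= error


def closure_error_triangulation(atoms, vertex1, d1, err1, vertex2, d2, err2):
    """Return indices of atoms lying in the intersection of the two shells.

    atoms          : sequence of (x, y, z) coordinates
    vertex1/2      : (x, y, z) reference positions (here, stand-ins for the
                     C-terminal atoms of Rpb3 and Rpb4)
    d1/d2, err1/2  : FRET-derived distances and their tolerated errors
    """
    return [
        index
        for index, xyz in enumerate(atoms)
        if in_shell(xyz, vertex1, d1, err1) and in_shell(xyz, vertex2, d2, err2)
    ]


if __name__ == "__main__":
    # Made-up coordinates (arbitrary units) chosen so that only the first atom qualifies.
    atoms = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    hits = closure_error_triangulation(
        atoms,
        vertex1=(0.0, 0.0, 0.0), d1=5.0, err1=0.5,
        vertex2=(6.0, 0.0, 0.0), d2=5.0, err2=0.5,
    )
    print(hits)
```

With only two vertices the intersection of the shells is a ring-shaped region, so the search returns a set of candidate atoms and not a single position; widening the error terms enlarges that set.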

ACKNOWLEDGMENTS. We thank M.-J. Wang and Dr. Joan Chen [Institute of Biomedical Sciences, Academia Sinica (AS)] for providing the human CaM II clone; Dr. Yu-Ju Chen (Institute of Chemistry, AS) for mass spectrometry; and Tommy Setiawan and Y.-P. Weng in the Chang laboratory for CaM expression and purification. We are also grateful for critical discussions with Dr. Sunney Chan (AS and Caltech), Dr. David Bushnell (Stanford), and Prof. Averell Gnatt (University of Maryland, Baltimore). Dr. Chin-Yu Chen has been supported by a National Science Council postdoctoral fellowship and is currently supported by an AS postdoctoral fellowship. This work was supported by AS Grants AS95IC1 and AS-95-TPB06 and National Science Council of Taiwan Grants NSC94-2113-M-001-015, NSC95-2113-M-001-031, and NSC94-2627-B-001-003 (all to W.-H.C.).

1. Kornberg RD (2007) The molecular basis of eukaryotic transcription. Proc Natl Acad Sci USA 104:12955–12961.
2. Cramer P, et al. (2000) Architecture of RNA polymerase II and implications for the transcription mechanism. Science 288:640–649.
3. Cramer P, Bushnell DA, Kornberg RD (2001) Structural basis of transcription: RNA polymerase II at 2.8 Å resolution. Science 292:1863–1876.
4. Gnatt AL, Cramer P, Fu J, Bushnell DA, Kornberg RD (2001) Structural basis of transcription: An RNA polymerase II elongation complex at 3.3 Å resolution. Science 292:1876–1882.
5. Westover KD, Bushnell DA, Kornberg RD (2004) Structural basis of transcription: Separation of RNA from DNA by RNA polymerase II. Science 303:1014–1016.
6. Bushnell DA, Kornberg RD (2003) Complete, 12-subunit RNA polymerase II at 4.1-Å resolution: Implications for the initiation of transcription. Proc Natl Acad Sci USA 100:6969–6973.
7. Craighead JL, Chang WH, Asturias FJ (2002) Structure of yeast RNA polymerase II in solution: Implications for enzyme regulation and interaction with promoter DNA. Structure 10:1117–1125.
8. Ujvari A, Luse DS (2006) RNA emerging from the active site of RNA polymerase II interacts with the Rpb7 subunit. Nat Struct Mol Biol 13:49–54.
9. Edwards AM, Kane CM, Young RA, Kornberg RD (1991) Two dissociable subunits of yeast RNA polymerase II stimulate the initiation of transcription at a promoter in vitro. J Biol Chem 266:71–75.
10. Kettenberger H, Armache KJ, Cramer P (2004) Complete RNA polymerase II elongation complex structure and its interactions with NTP and TFIIS. Mol Cell 16:955–965.
11. Stryer L, Haugland RP (1967) Energy transfer: A spectroscopic ruler. Proc Natl Acad Sci USA 58:719–726.
12. Hillisch A, Lorenz M, Diekmann S (2001) Recent advances in FRET: Distance determination in protein–DNA complexes. Curr Opin Struct Biol 11:201–207.
13. Kerppola TK (2001) The bright future of fluorescence. Methods 25:1–3.
14. Ramirez-Carrozzi V, Kerppola T (2001) Gel-based fluorescence resonance energy transfer (gelFRET) analysis of nucleoprotein complex architecture. Methods 25:31–43.
15. Kapanidis AN, et al. (2006) Initial transcription by RNA polymerase proceeds through a DNA-scrunching mechanism. Science 314:1144–1147.
16. Margeat E, et al. (2006) Direct observation of abortive initiation and promoter escape within single immobilized transcription complexes. Biophys J 90:1419–1431.
17. Kapanidis AN, et al. (2005) Retention of transcription initiation factor σ70 in transcription elongation: Single-molecule analysis. Mol Cell 20:347–356.
18. Naryshkin N, Revyakin A, Kim Y, Mekler V, Ebright RH (2000) Structural organization of the RNA polymerase–promoter open complex. Cell 101:601–611.
19. Rigaut G, et al. (1999) A generic protein purification method for protein complex characterization and proteome exploration. Nat Biotechnol 17:1030–1032.
20. Puig O, et al. (2001) The tandem affinity purification (TAP) method: A general procedure of protein complex purification. Methods 24:218–229.
21. Weiss S (2000) Measuring conformational dynamics of biomolecules by single molecule fluorescence spectroscopy. Nat Struct Biol 7:724–729.
22. Ha T (2001) Single-molecule fluorescence resonance energy transfer. Methods 25:78–86.
23. Lue NF, et al. (2005) Telomerase can act as a template- and RNA-independent terminal transferase. Proc Natl Acad Sci USA 102:9778–9783.
24. Li Y, Kornberg RD (1994) Interplay of positive and negative effectors in function of the C-terminal repeat domain of RNA polymerase II. Proc Natl Acad Sci USA 91:2362–2366.
25. Gileadi O, Feaver WJ, Kornberg RD (1992) Cloning of a subunit of yeast RNA polymerase II transcription factor b and CTD kinase. Science 257:1389–1392.
26. Feaver WJ, Gileadi O, Li Y, Kornberg RD (1991) CTD kinase associated with yeast RNA polymerase II initiation factor b. Cell 67:1223–1230.
27. Norman DG, Grainger RJ, Uhrin D, Lilley DM (2000) Location of cyanine-3 on double-stranded DNA: Importance for fluorescence resonance energy transfer studies. Biochemistry 39:6317–6324.
28. Iqbal A, et al. (2008) Orientation dependence in fluorescent energy transfer between Cy3 and Cy5 terminally attached to double-stranded nucleic acids. Proc Natl Acad Sci USA 105:11176–11181.
29. Hohng S, Joo C, Ha T (2004) Single-molecule three-color FRET. Biophys J 87:1328–1337.
30. Nudler E, Gusarov I, Avetissova E, Kozlov M, Goldfarb A (1998) Spatial organization of transcription elongation complex in Escherichia coli. Science 281:424–428.
31. Bushnell DA, Westover KD, Davis RE, Kornberg RD (2004) Structural basis of transcription: An RNA polymerase II–TFIIB cocrystal at 4.5 Å. Science 303:983–988.
32. Chen HT, Hahn S (2004) Mapping the location of TFIIB within the RNA polymerase II transcription preinitiation complex: A model for the structure of the PIC. Cell 119:169–180.
33. Chen HT, Hahn S (2003) Binding of TFIIB to RNA polymerase II: Mapping the binding site for the TFIIB zinc ribbon domain within the preinitiation complex. Mol Cell 12:437–447.
34. Chen HT, Warfield L, Hahn S (2007) The positions of TFIIF and TFIIE in the RNA polymerase II transcription preinitiation complex. Nat Struct Mol Biol 14:696–703.
35. Murakami KS, Masuda S, Campbell EA, Muzzin O, Darst SA (2002) Structural basis of transcription initiation: An RNA polymerase holoenzyme–DNA complex. Science 296:1285–1290.
36. Vassylyev DG, Vassylyeva MN, Perederina A, Tahirov TH, Artsimovitch I (2007) Structural basis for transcription elongation by bacterial RNA polymerase. Nature 448:157–162.
37. Proudfoot NJ, Furger A, Dye MJ (2002) Integrating mRNA processing with transcription. Cell 108:501–512.
38. Andrecka J, et al. (2008) Single-molecule tracking of mRNA exiting from RNA polymerase II. Proc Natl Acad Sci USA 105:135–140.
39. Liphardt J, Onoa B, Smith SB, Tinoco IJ, Bustamante C (2001) Reversible unfolding of single RNA molecules by mechanical force. Science 292:733–737.
40. Abels JA, Moreno-Herrero F, van der Heijden T, Dekker C, Dekker NH (2005) Single-molecule measurements of the persistence length of double-stranded RNA. Biophys J 88:2737–2744.
41. Gu W, Reines D (1995) Identification of a decay in transcription potential that results in elongation factor dependence of RNA polymerase II. J Biol Chem 270:11238–11244.
42. Kim HD, et al. (2002) Mg2+-dependent conformational change of RNA studied by fluorescence correlation and FRET on immobilized single molecules. Proc Natl Acad Sci USA 99:4284–4289.
43. Chung WH, et al. (2003) RNA polymerase II/TFIIF structure and conserved organization of the initiation complex. Mol Cell 12:1003–1013.
44. Wade PA, et al. (1996) A novel collection of accessory factors associated with yeast RNA polymerase II. Protein Expr Purif 8:85–90.
45. Okten Z, Churchman LS, Rock RS, Spudich JA (2004) Myosin VI walks hand-over-hand along actin. Nat Struct Mol Biol 11:884–887.
46. Churchman LS, Okten Z, Rock RS, Dawson JF, Spudich JA (2005) Single molecule high-resolution colocalization of Cy3 and Cy5 attached to macromolecules measures intramolecular distances through time. Proc Natl Acad Sci USA 102:1419–1423.
47. Radman-Livaja M, Biswas T, Mierke D, Landy A (2005) Architecture of recombination intermediates visualized by in-gel FRET of λ integrase–Holliday junction–arm DNA complexes. Proc Natl Acad Sci USA 102:3913–3920.
48. Ha T, et al. (1999) Ligand-induced conformational changes observed in single RNA molecules. Proc Natl Acad Sci USA 96:9077–9082.
49. Rasnik I, McKinney SA, Ha T (2006) Nonblinking and long-lasting single-molecule fluorescence imaging. Nat Methods 3:891–893.
50. DeLano WL (2002) The PyMOL Molecular Graphics System (DeLano Scientific, Palo Alto, CA).
51. Jones TA, Zou JY, Cowan SW, Kjeldgaard M (1991) Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr A 47:110–119.

Various pol II elongation complexes were examined: Rpb3–GE2, Rpb3–GE9, Rpb3–GE18, Rpb4–GE2, Rpb4–GE9, and Rpb4–GE18. A histogram of corrected FRET efficiencies was constructed from ≈50 to 100 molecules by including all data points of the time trace of each molecule until Cy5 was bleached or Cy3 and Cy5 disappeared together, and was fitted with multiple Gaussian distributions.
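To illustrate how a pooled set of FRET efficiencies can be resolved into Gaussian populations, the sketch below fits a one-dimensional Gaussian mixture by expectation-maximization (EM). This is a stand-in technique chosen so the example needs only the standard library: the procedure described here fitted Gaussians to the histogram itself, and nothing about the authors' fitting routine is implied. The function names, the synthetic data, and the population centers (0.3 and 0.7) are assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, sigma):
    """Normal probability density at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_gaussian_mixture(values, n_components=2, n_iter=100, min_sigma=1e-3):
    """Fit a 1D Gaussian mixture by EM; returns (mean, sigma, weight) sorted by mean."""
    data = sorted(values)
    n = len(data)
    # Initial guesses: means at evenly spaced quantiles, one shared overall spread.
    means = [data[int((k + 0.5) * n / n_components)] for k in range(n_components)]
    grand_mean = sum(data) / n
    spread = math.sqrt(sum((x - grand_mean) ** 2 for x in data) / n)
    sigmas = [max(spread, min_sigma)] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, sigmas)]
            total = sum(p) or 1e-300
            resp.append([v / total for v in p])
        # M-step: update weight, mean, and width of each component.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            sigmas[k] = max(math.sqrt(var), min_sigma)
            weights[k] = nk / n
    return sorted(zip(means, sigmas, weights))


def synthetic_efficiencies(seed=0, n_per_state=200):
    """Hypothetical pooled efficiencies from two states; not experimental data."""
    rng = random.Random(seed)
    low = [rng.gauss(0.3, 0.05) for _ in range(n_per_state)]
    high = [rng.gauss(0.7, 0.05) for _ in range(n_per_state)]
    return low + high


if __name__ == "__main__":
    for mean, sigma, weight in fit_gaussian_mixture(synthetic_efficiencies()):
        print(f"mean={mean:.3f} sigma={sigma:.3f} weight={weight:.2f}")
```

The fitted means play the role of the peak efficiencies that are then converted to distances; in practice the number of components would need to be justified by the data.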

132 | www.pnas.org/cgi/doi/10.1073/pnas.0811689106

Chen et al.