THE JOURNAL OF BIOLOGICAL CHEMISTRY
Vol. 262, No. 2, Issue of January 15, pp. 892-898, 1987
© 1987 by The American Society of Biological Chemists, Inc. Printed in U.S.A.

DNA Functional Groups Required for Formation of Open Complexes between Escherichia coli RNA Polymerase and the λ PR Promoter
IDENTIFICATION VIA BASE ANALOG SUBSTITUTIONS*
(Received for publication, June 20, 1986)

John W. Dubendorff‡, Pieter L. deHaseth§, Mary S. Rosendahl¶, and Marvin H. Caruthers∥
From the Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado 80309-0215

Synthetic 75-base pair promoters bearing base changes and/or base analog substitutions at selected positions were constructed. Using both abortive initiation and run-off transcription assays, the interaction of these altered promoters with Escherichia coli RNA polymerase was studied in order to determine the involvement of DNA functional groups in promoter recognition. Two adjacent thymines in the -35 region were identified whose 5-methyl groups play a crucial role. Additionally, the combined results from several substitution experiments showed that functional groups in the major groove of the strongly conserved T-A base pair at the -7 position are probable sites of direct interaction with RNA polymerase.

The specific interaction of RNA polymerase with promoter DNA is an important step in the expression of bacterial genes. While sequence comparisons among over 100 promoters readily reveal substantial diversity, two regions of homology have emerged as well (1, 2) and are commonly referred to as the -35 (5'-TTGACA-3') and -10 or Pribnow box (5'-TATAAT-3') consensus sequences where these numbers indicate the approximate positions upstream from the start site of transcription. Specific sites of contact between RNA polymerase and promoter DNA have been localized to the DNA at or near these regions of homology (3). In view of the sequence variation even in these regions, a subset of the chemical functionalities specified by these consensus sequences (or the complementary sequences on the anti-sense strand) must be sufficient to enable polymerase to recognize the DNA as a promoter. Extensive mutational analysis has shown that while the strongest promoters seem to have better matches to the consensus sequence, the exclusion of a particular base pair at some sites is more critical (1, 4, 5). This produces a picture of the transcription process in Escherichia coli which is both complex and flexible.

In an attempt to better define transcription at the functional group level, we have introduced specific base modifications into the DNA of the bacteriophage λ PR promoter (Fig. 1). By substituting deoxyuridine for thymidine and deoxyinosine for deoxyguanosine, the 5-methyl group on thymine and the 2-amino group on guanine were removed and thereby tested as contact sites between E. coli polymerase and DNA. This approach, called functional group mutagenesis, has the advantage of testing one functional group without affecting others as would be the case if base pair transversions and transitions were studied. These modified promoters were assembled enzymatically from chemically synthesized DNA fragments using T4 DNA ligase. In order to understand this system and to establish a base line performance in our laboratory, we initially constructed and measured the activity of a synthetic, unmodified λ PR promoter (6). We also showed in a preliminary report (7) that no significant effect on transcription activity was observed with certain analog substituted promoters. We have continued this work on the PR promoter region and report here the identification of two positions in the -35 region where methyl groups on thymine appear critical to the interaction with RNA polymerase. Additionally, the combined results from several analog substitutions showed that functional groups in the major groove of the strongly conserved T·A base pair at the -7 position are probable sites of direct interaction with RNA polymerase.

MATERIALS AND METHODS

Enzymes and Reagents: RNA polymerase was isolated from E. coli (grain processing) in the laboratory of Dr. Carol L. Cech (Department of Chemistry and Biochemistry, University of Colorado, Boulder) by the method of Burgess and Jendrisak (8). The activity of the various preparations used in this study ranged from 50 to 80% of total concentration as measured by the T7 template functional assay method of Chamberlin et al. (9). The RNA polymerase concentrations reported here are nominal values which are not corrected for the fraction of active enzyme. T4 polynucleotide kinase and DNA ligase were from either Bethesda Research Laboratory or New England

* This work was supported by National Institutes of Health Grant GM21120 (to M. H. C.), an Upjohn Graduate Fellowship (to J. W. D.), and National Institutes of Health postdoctoral fellowships (to P. L. deH. and M. S. R.). This is Paper XXII in the series "Studies on Gene Control Regions." The preceding paper is Mandecki, W., Goldman, R. A., Powell, B. S., and Caruthers, M. H. (1985) J. Bacteriol. 164, 1353. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ Current address: Biology Dept., Brookhaven National Laboratory, Upton, Long Island, NY 11973.
§ Current address: Dept. of Biochemistry, School of Medicine, Case Western Reserve University, Cleveland, OH 44106.
¶ Current address: Synthetech Inc., Boulder, CO 80301.
∥ To whom correspondence should be addressed.

Biolabs. Snake venom phosphodiesterase and CpA¹ were from Sigma. Deoxynucleosides including 5-methyldeoxycytidine, deoxyuridine, and deoxyinosine were purchased from Sigma or Pharmacia P-L Biochemicals.

Promoters: Oligodeoxynucleotides were synthesized using the phosphoramidite methodology (10-12) and purified by high performance liquid chromatography (13). The sequences of selected segments were confirmed by the electrophoresis homochromatography mobility shift analysis procedure (14, 15). Analog segments were analyzed by partial snake venom phosphodiesterase digestion and electrophoresis with a similarly treated sample of the unmodified segment as a marker. Oligodeoxynucleotides comprising the 75-base pair (bp) PR promoter were covalently joined using T4 DNA ligase and isolated by standard procedures (6, 16). Synthetic duplexes were stored frozen in 10 mM Tris (pH 8), 50 mM KCl, 1 mM EDTA.

¹ The abbreviations used are: CpA, cytidylyl(3'→5')adenosine; CpApU, cytidylyl(3'→5')adenylyl(3'→5')uridine; bp, base pair; PAGE, polyacrylamide gel electrophoresis.

Abortive Initiation Assays: The formation of functional or open RNA polymerase-promoter complexes was detected through their ability to carry out the reiterative synthesis of the trinucleotide CpApU from CpA and UTP (17). The reactions were carried out as previously described for the wild-type 75-base pair synthetic promoter (6). Briefly, association lag times (18) were obtained by incubating initiating dinucleotide (1 mM), [α-³²P]UTP (0.04 mM; 20 Ci/mmol), and synthetic DNA template (1 nM) in standard reaction buffer (40 mM Tris (pH 8), 100 mM KCl, 10 mM MgCl₂, 1 mM dithiothreitol) for 10 min at 37 °C. RNA polymerase was added to 10 nM at time 0. Aliquots (2.7 µl) were removed at specified times and spotted on paper (Whatman 3MM) prestreaked with 0.1 M EDTA to quench the reaction. The CpApU product was separated from labeled UTP by ascending paper chromatography as described (19). The lag time, τ, was determined using a linear regression analysis which included only points corresponding to times greater than three times the estimated τ value. An anomalous dependence of τ on the concentration of RNA polymerase, ascribed to end-binding of polymerase to the 75-bp synthetic promoter (6, 20), prevented determination of kinetic parameters by generating τ plots as described by McClure (18). A low RNA polymerase concentration (10 nM) was therefore used to minimize binding to DNA termini.

Run-off Transcription Assay: The rate of functional complex formation was measured by the ability of complexes to synthesize RNA corresponding to the properly initiated message which terminated at or near the end of the synthetic promoter. Reactions were initiated by mixing 8 µl of RNA polymerase with 56 µl of binding reaction mixture containing synthetic promoter duplex. Both components were prewarmed at 37 °C before mixing.
The final RNA polymerase and DNA concentrations were 67 nM and 3 to 5 nM, respectively, unless otherwise specified. After selected times, 7.5-µl aliquots were removed and added to 2.5 µl of transcription mixture containing heparin (100 µg/ml final concentration), which binds RNA polymerase not present in functional complexes, and nucleotide triphosphates to allow transcription (open complexes at the λ PR promoter are not sensitive to added heparin (17)). Binding and transcription reactions were performed in 30 mM Tris (pH 7.9), 100 mM KCl, 3 mM MgCl₂, 0.1 mM EDTA, and 0.2 mM dithiothreitol. During the 20-min transcription reactions, ATP and CTP were at 200 µM and α-³²P-labeled GTP and UTP were at 2 µM and 20 Ci/mmol. Reactions were terminated by freezing at -78 °C. Mixtures were dried in a SpeedVac and taken up in 5 µl of 7 M urea, 1.5 mM GTP, 1.5 mM UTP, 0.1% bromphenol blue, and 0.1% xylene cyanol. Samples were electrophoresed on 20% polyacrylamide, 7 M urea gels until the xylene cyanol marker dye had run approximately 12 cm. RNA was visualized


by autoradiography and the λ PR-specific transcripts excised and counted for Cerenkov radiation.

RESULTS

Promoter Constructs: λ PR was synthesized so as to contain base pair analogs (deoxyuridine, deoxyinosine, and 5-methyldeoxycytidine) as well as base pair changes at certain key locations that had been identified in chemical probing experiments to be sites of close contact with RNA polymerase (Fig. 1). These included base pairs in the -10 and -35 regions that show a high degree of sequence conservation among different promoters (1, 2), DNA within and just upstream of the -10 region that is apparently contacted by RNA polymerase (3), and the single-stranded domain as detected in functional RNA polymerase-promoter complexes (21). Some promoters bear substitutions at multiple positions so that several functional groups could be checked simultaneously. The lack of an effect on transcription for such multiple substitutions was strong evidence that none of the groups tested was important in forming a functional complex. Two promoters, both having the same size as the analog promoters, were constructed as positive and negative controls. These were the wild-type unmodified promoter and the transition T·A to C·G at position -7. The synthetic wild-type promoter had previously been shown to yield results identical to those obtained from the same promoter as part of a bacteriophage λ restriction fragment (6). As a control for our ability to detect altered promoter activity, the -7 transition was used primarily because the analogous natural mutation at this highly conserved position has been shown to dramatically reduce promoter function (22-24). Generally promoters were synthesized by enzymatically joining short DNA segments containing 15 to 20 mononucleotides each. The fully assembled promoters were isolated from reactants by polyacrylamide gel electrophoresis (PAGE) under nondenaturing conditions. Homogeneity was then checked by PAGE under denaturing conditions where the individual 75 mononucleotide single strands were easily separated (Fig. 2). We estimate that promoters synthesized and purified by this procedure were at least 95% homogeneous.

FIG. 1. A summary of analog substitutions in the λ PR promoter. Starting with the initiation site for cro RNA as base pair +1, base pairs to the left and right are indicated as - and +, respectively. The -10 and -35 regions are marked by the shaded rectangles below the DNA sequence. Promoter sites substituted with analogs are designated as follows: deoxyinosine for deoxyguanosine, G→I; deoxyuridine for thymidine, T→U. Transitions from thymidine·deoxyadenosine to 5-methyldeoxycytidine·deoxyguanosine base pairs at positions -34 and -35 are designated by T·A→Cᴹ·G. Transitions at position -7 from thymidine·deoxyadenosine to deoxycytidine·deoxyguanosine and deoxycytidine·deoxyinosine are designated T·A→C·G and T·A→C·I, respectively. Analog promoters are defined by numbers located either at the terminus of lines or inserted into lines leading to the analog abbreviation. These analog promoters are therefore abbreviated as P-1 through P-20 in the text, Tables, and Figures. Promoters designated 1, 2, 3, 7, 9, 10, 11, 12a, 12b, 13, 15, 17, 18, 19, and 20 are modified to contain only a single analog or base pair substitution. Promoters designated 4, 5, 6, 8, 12, 14, and 16 contain two or three such substitutions.
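The lag-time analysis described under "Materials and Methods" (a linear regression restricted to time points later than three times the estimated τ) can be illustrated with a short sketch. This is not the authors' program. The function names are hypothetical, the data are synthetic, and the product curve assumes the commonly used single-exponential lag model, in which product accumulates as V·(t − τ·(1 − e^(−t/τ))).

```python
import math


def product_curve(t, rate, tau):
    """Assumed model: product formed by time t when open complexes
    accumulate with a single time constant tau (the lag time)."""
    return rate * (t - tau * (1.0 - math.exp(-t / tau)))


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_lag_time(times, product, tau_guess, max_iter=20, tol=1e-3):
    """Fit only points with t > 3 * tau and read tau off the time-axis
    intercept. The cutoff is updated with each new estimate until the
    estimate stops changing."""
    tau = tau_guess
    for _ in range(max_iter):
        late = [(t, p) for t, p in zip(times, product) if t > 3.0 * tau]
        if len(late) < 3:
            raise ValueError("too few late time points for a regression")
        slope, intercept = linear_fit([t for t, _ in late], [p for _, p in late])
        new_tau = -intercept / slope  # time at which the fitted line crosses zero
        if abs(new_tau - tau) < tol:
            return new_tau
        tau = new_tau
    return tau


# Synthetic, noise-free example data (not measurements from this study).
times = [3.0 * i for i in range(1, 18)]  # 3, 6, ..., 51 min
product = [product_curve(t, rate=1.0, tau=1.7) for t in times]
tau_est = estimate_lag_time(times, product, tau_guess=2.0)
```

Restricting the fit to late points matters because the early part of the curve is still bending upward; including it would pull the fitted line toward the origin and underestimate τ.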


Comparison of Promoters Using Abortive Initiation: We have investigated the kinetics of open complex formation with the 75-bp synthetic duplexes using an abortive initiation assay to detect formation of open complexes. Typical results with unmodified and selected analog promoters are shown in Fig. 3. In most cases analog promoters behaved similarly to the unmodified 75-bp promoter. Results with P-3 and P-10 illustrate this observation. Lag time experiments for variants bearing single (P-12a and 12b) and double (P-12) substitutions of uracil for the two highly conserved thymines in the -35 region are especially interesting. It can be seen that single uracil substitutions reduce the promoter-dependent production of aborted product from the promoters to similar extents, while leading to significant increases in lag times. The doubly uracil-substituted promoter shows negligible activity during the time of our assay.

FIG. 2. PAGE analysis on denaturing gels of purified promoters. Promoters isolated by PAGE under nondenaturing conditions from reaction mixtures were then analyzed by PAGE under denaturing conditions on a 20% acrylamide, 7 M urea gel. ³²P was from internal sites generated by enzymatically joining 5'-³²P-labeled deoxyoligonucleotides to form the final promoters. Lanes 1 to 7 contain P-75 (the synthetic, unmodified duplex), P-19, P-20, P-15, P-16, P-13, and P-11, respectively.

FIG. 3. Time courses of CpApU synthesis from synthetic promoters. Percent of counts incorporated into product are plotted as a function of time. Promoter and RNA polymerase concentrations were 1 and 10 nM, respectively. The numbers 3, 10, 12, 12a, 12b, 17, and 18 refer to promoter analogs or base pair changes as defined in the legend to Fig. 1. Since promoters P-12a, P-12b and also P-17, P-18 show, as pairs, identical abortive initiation results, only one symbol is used in each case to show the data. P-890 is the 890-bp HaeIII restriction fragment of phage λ and contains the entire rightward control region. The 75-bp unmodified promoter is abbreviated as P-75.

TABLE I
A summary of association τ values obtained by the abortive initiation assay on analog and unmodified λ PR promoters

Promoterᵃ    τᵇ (min)      Promoterᵃ    τᵇ (min)
P-890        1.3           P-11         2.8
P-75         1.7           P-12a        9
P-1          1.5           P-12b        9
P-2          2.5           P-12         NAᶜ
P-3          2.9           P-13         1.9
P-4          1.9           P-14         NA
P-5          3.0           P-15         2.1
P-6          1.8           P-16         3.0
P-7          0.6           P-17         NA
P-8          1.6           P-18         NA
P-9          1.4           P-19         1.7
P-10         2.0           P-20         3.7

ᵃ P-890, a nonsynthetic, naturally occurring 890-bp HaeIII derived duplex from phage λ that contains the PR promoter; P-75, a synthetic 75-bp duplex containing λ PR promoter; P-1 to P-20, various modified promoters as defined in the legend to Fig. 1.
ᵇ The association τ values or lag times were calculated from abortive initiation results.
ᶜ NA is an abbreviation for not active. These promoters were inactive in the abortive initiation assay.

The τ values calculated from these abortive initiation assays are summarized in Table I. As shown before, the synthetic wild-type promoter as a 75-bp duplex behaves similarly to a PR promoter carried on an 890-base pair restriction fragment, demonstrating the validity of our approach (6). In most cases, these variant PR promoters, with analog substitutions at a variety of positions, gave lag times of 1 to 3 (±1.0) min based on several determinations. We conclude that within experimental error these promoters show similar behavior. Of particular interest was that substitution of uracil for thymine at all A·T base pairs between -15 and +5 and also substitution of inosine for guanine at all but two G·C base pairs in the same region (sites at -1 and +3 were not tested) did not alter


the rate of transcription initiation. These 20 base pairs include the -10 region, the transcription initiation site, and the DNA region generally considered to be unwound during formation of at least some open promoter complexes (3).

There were, however, several promoters where alterations in sequence or insertion of analogs caused an increase in lag times. Substitution of uracil for thymine at either -34 or -35 changes τ from 1.7 min for the unmodified promoter to 9 min or more for each of the single substituted analogs. This indicates a 4 to 5-fold reduction in promoter strength. Insertion of uracil at both -34 and -35 generated a very weak promoter whose τ value could not be measured. Several additional promoters displayed negligible activity in the assay; all bear substitutions at either the -7 position or the two conserved thymines in the -35 region. These include the variant P-17 (C·G for T·A at -7), confirming our ability to detect naturally occurring severe "down" mutations, and also the substitution of the analog base pair C·I at the same position. In addition, the double substitution of Cᴹ·G for T·A in the -35 region resulted in severe loss of activity.

Comparison of Promoters Using the Run-off Transcription Assay: Promoters were also tested for activity using the productive rate assay of Stefano and Gralla (25). Unlike abortive initiation, this assay uses the production of run-off RNA as a reflection of promoter occupancy through the time course of the binding reaction (see "Materials and Methods"). Our previous characterization of the synthesized promoters showed that transcription is initiated at the correct site and proceeds in the right direction (6). A typical experiment is presented in Fig. 4. At each time point, the number of productive complexes present is related to the quantity of RNA synthesized. In addition to transcripts of the expected size (21 nucleotides), smaller transcripts were also observed. These smaller transcripts (18 to 20 nucleotides) probably result from polymerases which have paused near the end of the promoter duplex. The group of four discrete transcripts representing complete or almost complete elongation to the end of the fragment was excised to determine promoter-specific complex formation. A plot of the radioactivity in the run-off transcript versus time of incubation of RNA polymerase with DNA gives an indication of the rate of promoter saturation (Fig. 5). With the majority of templates, including the wild-type, half-times for promoter saturation of 4.0 ± 0.5 min were found (Table II), which corresponds to a τ value of 5.8 min. This is longer than the time obtained from the lag assays described in the previous section but is consistent with the value expected from the higher RNA polymerase concentration used in the run-off transcription assay (67 versus 10 nM in the abortive initiation assays). For all analogs, the results in the run-off transcription assay paralleled those obtained in the abortive


FIG. 5. Comparison of promoters by initial rate of complex formation with RNA polymerase. For each time point, the four discrete transcripts representing complete or almost complete elongation were excised as gel slices and counted. Complete assay conditions are described under "Materials and Methods" and the legend to Fig. 4. Symbols distinguish the wild-type promoter, P-3, and P-16.

TABLE II
A summary of half-times of promoter saturation as obtained from run-off transcription assays
The promoter abbreviations are defined in Table I and the legend to Fig. 1. Half-times for promoter saturation were measured at 67 nM E. coli RNA polymerase. Values in parentheses were from experiments completed at 120 nM E. coli RNA polymerase.

Promoter ~~

(15) ~

3.6 4.2 4.5

P-9 P-10 P-11 P-12a

;: ~~

fH

rnin 3.8 (5)

P-75 P-1 P-2 P-3 P-4

FIG. 4. Gel analysis of complex formation between RNA polymerase and wild-type promoter. P-75 promoter DNA was at 5 nM. Numbers above each lane indicate the time in minutes after addition of RNA polymerase (final concentration 67 nM). The arrows bracket the location of run-off RNA. Complete assay conditions are under "Materials and Methods." Identical results were obtained when [γ-³²P]ATP was substituted for [α-³²P]triphosphates (data not shown).

~~

~

Promoter

~

tu

~

~

11

rnin

P-12b P-12 P-13 P-14 P-15

;P-1s :; P-19 P-20

(15) NA 3.8 NA 4.1 4.0 NA NA 3.4 3.4

NA is an abbreviation for not active. These promoters had less than 10% of the wild-type promoter activity (P-75).
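The correspondence quoted in the text between a half-time of 4.0 min and a τ value of 5.8 min is what one obtains if promoter saturation is treated as a single-exponential process, for which τ = t½ / ln 2. The sketch below checks that arithmetic and the fold change in τ for the single uracil substitutions. The function names are illustrative only, and the single-exponential assumption is stated here rather than taken from the paper.

```python
import math


def tau_from_half_time(t_half):
    """Time constant of a single-exponential process with the given half-time."""
    return t_half / math.log(2)


def fold_change(tau_variant, tau_reference):
    """Ratio of lag times, used here as a rough measure of reduced promoter strength."""
    return tau_variant / tau_reference


# Half-time of 4.0 min reported for most templates in the run-off assay.
tau_runoff = tau_from_half_time(4.0)
print(round(tau_runoff, 1))  # 5.8

# Abortive initiation: tau of 9 min (P-12a, P-12b) versus 1.7 min (unmodified).
print(round(fold_change(9.0, 1.7), 1))  # 5.3
```

The ratio of about 5.3 sits at the upper end of the 4 to 5-fold reduction quoted in the text, which is reasonable given that the 9 min value is described as "9 min or more" and each τ carries an uncertainty of roughly ±1 min.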


initiation assay. Thus certain promoters which showed less than 10% of wild-type activity (P-12, -14, -17, and -18) by this assay were also inactive in the abortive initiation assay. Similarly the same templates that, like the wild-type promoter, gave lag times of 1 to 3 min showed kinetic behavior similar to that of the wild-type promoter in this assay. Due to their reduced activity, the promoters with the single T to U substitutions were assayed at 120 nM RNA polymerase concentration in an attempt to increase the measurable signal. From Figs. 6 and 7 and Table II it can be seen that these promoters (P-12a and P-12b) show similar kinetics of open complex formation, with both significantly slower than the wild-type promoter assayed under the same conditions. The relative effect of the single uridine substitutions as judged by this assay is similar to that seen using the abortive initiation reaction.

Transcription from End-labeled Promoters: Promoters P-12, P-14, P-17, and P-18 were inactive in both assays. In order to confirm that these promoters were indeed inactive and that the observed lack of transcription was not due to promoter loss or breakdown during purification or storage, run-off transcription assays were carried out with 5' end-labeled promoters (26). After isolation by denaturing gel electrophoresis and precipitation, they were incubated with 40 nM RNA polymerase for 30 min. Transcription was then carried out for 30 min as described above. The promoters were resolved from their transcription products by electrophoresis on 20% acrylamide, 7 M urea gels. Autoradiography allowed visualization of the labeled promoter DNA and the transcribed RNA. The bands were excised and counted, allowing a comparison of the activity of various promoters from the ratios of counts in the RNA and the DNA. Using this protocol, promoter P-10 was found to direct the synthesis of greater than 10 times the number of RNA chains as did P-12, P-14, P-17, or P-18 (on a mole basis), confirming the observation that the latter four promoters have greatly reduced activity (data not shown).

FIG. 6. Comparison of wild-type, P-12a, and P-12b by initial rate of complex formation with RNA polymerase. DNA and RNA polymerase concentrations were 15 and 120 nM, respectively. Complete assay conditions are under "Materials and Methods." Gel patterns of the time course (0.5 to 60 min) of complex formation using wild-type, 12a, and 12b promoters are shown. Arrows bracket the location of the run-off RNA. Panels A, B, and C show the reactions using P-75, P-12a, and P-12b, respectively. Gels were analyzed as described in the legend to Fig. 4.

FIG. 7. Time course of complex formation between RNA polymerase and P-75, P-12a, or P-12b. Gels shown in Fig. 6 were analyzed as outlined under "Materials and Methods." Symbols distinguish P-75, P-12a, and P-12b. Curves corresponding to results with P-75 (solid line) and also P-12a plus P-12b (dashed line) are shown.

DISCUSSION

We have described experiments which, for the first time, identify promoter functional groups that affect transcription by RNA polymerase. Two methyl groups on adjacent thymines at -34 and -35 were found to be crucial to promoter recognition. Removal of either methyl group via substitution of uracil for thymine led to a 4- to 5-fold reduction in the rate of formation of a transcriptionally competent complex whereas substitution of uracil at both sites generated an inactive promoter. These results are quite surprising since λ PR is classified as a strong promoter (27). However, previous research has shown that substitution of uracil for thymine at a single site in the lac operator (i.e. removal of a methyl group) can increase the free energy of binding to lac repressor by 1.5 kcal/mol (28). If removal of methyl groups at -34 and -35 of PR has an equivalent effect on RNA polymerase binding, then the 3 kcal/mol binding free energy would indeed be significant and translate into approximately a 100-fold reduction in affinity of E. coli RNA polymerase for the modified PR promoter (P-12). Unfortunately, we have not been able to determine the initial binding constant (K_B) and isomerization rate constant (k_f) for the various 75-base pair synthetic promoters (see "Materials and Methods" and also Ref. 6). As a result, it is unknown which step of open complex formation is affected by the modifications described here. It is conceivable that the methyl groups at -34 and -35 interact specifically with hydrophobic amino acid side chain(s) in such a way that the helix is destabilized to yield a lowered melting temperature. In this way, the thymine methyl groups may contribute to the formation of both open and closed promoter complexes. Alternatively, the methyl groups may serve as a recognition element only during closed complex formation and facilitate the further interaction of RNA polymerase with the promoter through additional contacts which lead to strand

separation and open complex formation (29, 30). Our results do not distinguish between these possibilities. We do, however, propose that RNA polymerase through hydrophobic interactions recognizes specifically the thymine 5-methyl groups at -34 and -35 irrespective of how these recognition events contribute to the steps leading to transcription. Analogous experiments with lac operators substituted with 5-bromodeoxyuridine, 5-bromodeoxycytidine, and 5-methyldeoxycytidine have shown that lac repressor can recognize substituents at the 5-position of pyrimidines (31, 32).

The alternative explanation is that insertion of uracil would distort the promoter conformation in the -35 region and thereby negatively affect the formation of a functional polymerase-promoter complex. This explanation seems unlikely since recent experiments with defined sequence deoxyoligonucleotides have shown that substitution of uracil for thymine did not affect the global structure of DNA (33). The only structural variation detected was a small change (estimated at 10° to 20°) in the base orientations around the N-glycosidic bonds (χ angles). If these results can be applied directly to promoters, such small perturbations in the absence of any other detectable conformation distortions would seem to be incapable of leading to the large alterations in promoter activity observed at -34 and -35. Moreover, substitutions of uracil for thymine elsewhere, including multiple substitutions in P-4 and P-16, had a negligible effect on transcription. If uracil substitution grossly distorted the promoter DNA conformation, then at least some of these other uracil analogs should also have exhibited reduced activity. This was not the case.

The significance of thymine 5-methyl groups as polymerase contact sites in the -35 region is consistent with current data. As summarized in Fig. 8, chemical modification and photocross-linking have helped to define the sites of close approach between E. coli RNA polymerase and several of its promoters (3, 34). These earlier observations suggested that in the -35 region, RNA polymerase recognizes an E. coli promoter through contacts in the major groove of DNA. Methyl contact sites as suggested by the experiments described here are superimposed on these previously published results. As can be seen by inspection of these data, these methyl groups are positioned precisely in the region where RNA polymerase has been shown to contact DNA. Moreover, through substitution


of these methyl groups with bromine, photochemical cross-linking experiments have shown that these sites are in close proximity to the polymerase molecule (35) and further support our conclusion that RNA polymerase contacts the promoter through direct hydrophobic interactions at positions -34 and -35. Of particular note is the observation that removal of either methyl group by introduction of a single uridine results in a similar reduction of the rate of complex formation. Viewed from another perspective, this implies that the presence of a single methyl group at either site allows complex formation to proceed, albeit at approximately one-fourth to one-fifth the normal rate. Inspection of over 100 E. coli promoter sequences shows a high conservation of thymine at -34 and -35 with nearly all promoters having at least one and the stronger promoters retaining thymine at both positions (1, 36). Among the sequenced promoters (1, 2), those lacking thymine at -34 and -35 are generally considered to be transcriptionally the least active. Evidently a range of hydrophobicity (0, 1, or 2 methyl groups) is discernible during recognition of the promoter by RNA polymerase. Apparently RNA polymerase can tolerate a certain amount of variability in the molecular surface of the promoters to which it binds. This is not surprising in view of the great variation in sequence and wide spectrum of regulatory characteristics of E. coli promoters (36, 37).

Promoters containing 5-methyldeoxycytidine and deoxycytidine at positions -34 and -35 showed little activity. This result indicates that although the methyl groups appear to be recognized by polymerase, some additional characteristics of T·A base pairs in the -35 region are necessary for promoter function. Several possibilities are immediately apparent. It is conceivable that while the methyl groups are important to the kinetic recognition of the DNA as a promoter, some hydrogen bonding pattern recognition (38) is required for functional complex formation. For example, perhaps the exocyclic amino group on adenine and the thymine 4-carbonyl interact through hydrogen bonding with RNA polymerase. Another possibility is that the two additional hydrogen bonds in Cᴹ·G rather than T·A base pairs greatly affect the local melting or unwinding of the promoter during transcription. Alternatively, the introduction of Cᴹ·G base pairs at these positions might disturb the local DNA structure sufficiently so as to prevent productive binding of RNA polymerase to the promoter. Perturbations of DNA structure by cross-chain clashes of guanines on opposite strands have been described (39). Replacement of two T·A base pairs in the -35 region by C·G base pairs introduces the possibility for just such unfavorable contacts. The ambiguity as to the explanation for the observed results with these substitutions might be resolved by synthesis of promoters containing either C·I or Cᴹ·I base pairs at these two positions.
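The estimate made earlier in the Discussion, that a 3 kcal/mol change in binding free energy corresponds to roughly a 100-fold change in affinity, follows from the standard thermodynamic relation K₁/K₂ = exp(ΔΔG/RT). The sketch below works through the number. It assumes a temperature of 37 °C (the temperature of the binding reactions) and uses illustrative function and constant names that are not part of the original work.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol K)


def affinity_fold_change(ddg_kcal_per_mol, temp_celsius=37.0):
    """Ratio of equilibrium binding constants for a given change in
    binding free energy, from K1/K2 = exp(ddG / RT)."""
    temp_kelvin = temp_celsius + 273.15
    return math.exp(ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temp_kelvin))


# One methyl group, taking the 1.5 kcal/mol lac operator value as a guide.
single = affinity_fold_change(1.5)

# Two methyl groups, assuming the two contributions simply add (3 kcal/mol).
double = affinity_fold_change(3.0)

print(round(single), round(double))
```

At 37 °C this gives roughly an 11-fold change for 1.5 kcal/mol and roughly a 130-fold change for 3 kcal/mol, the same order of magnitude as the approximately 100-fold figure in the text. The additivity of the two methyl contributions is the same assumption the text makes and is not established by the data.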

FIG. 8. Planar representation of RNA polymerase contacts with the T7 A3 and lac UV5 promoters (modified from Ref. 3). Polymerase contacts with the T7 A3 and lac UV5 promoters are superimposed on a cylindrical projection of the DNA helix. The distance between the Pribnow box and the -35 region is that of the T7 A3 promoter. Contact regions likely to interact with polymerase are shown as shaded areas. These shaded areas are located primarily on the front face in the -35 region and on both faces in a region from approximately -16 through the Pribnow box. These regions were defined from chemical protection experiments (3). The two filled circles represent the methyl groups whose removal (by introduction of uridine at these sites) affects binding of RNA polymerase. Substitution of a C·I base pair for the normally occurring T·A base pair at the -7 position (filled-in section of the Pribnow box region) indicates that the central major groove is recognized by RNA polymerase at this highly conserved site.

FIG. 9. Possible mechanism for denaturation of promoters. B: His, Tyr, Ser, Thr, Lys, or Arg. The -CO2H group could be either Asp or Glu. This reaction could represent the nucleation of strand separation presented by Buc and McClure (40).

Base analog substitutions in the -10 region that test for recognition of the thymine 5-methyl and guanine 2-amino groups have a negligible effect on transcription. However, other changes in this region have dramatic effects: substitution of a C·G base pair for the T·A base pair at -7 leads to a promoter lacking measurable activity, as does the introduction of a C·I base pair. On a functional group level, the C·I analog base pair presents a minor groove identical to that of the T·A base pair found normally at -7. The differences between the C·I and T·A base pairs exist exclusively in the major groove. Since removal of the methyl group from thymine (P-3) at this position results in no change in the kinetics (Figs. 3 and 5, Table I), the altered locations of the exocyclic amino and carbonyl groups in the major groove (for C·I or C·G relative to T·A) must be responsible for the observed lack of activity. Additionally, the absence of a minor groove amino group on inosine excludes alteration of the local helix structure (39) as an explanation for this result. By elimination, therefore, we arrive at the conclusion that at the -7 position RNA polymerase recognizes some or all of the functional groups as present in the central part of the major groove of a T·A base pair. This is consistent with chemical and physical modification data suggesting interaction with polymerase almost exclusively in the major groove of the Pribnow box region (2, 34).

We therefore propose a mechanism where the critical contacts between RNA polymerase and the promoter at -7 occur through the central part of the major groove and lead to an induced local melting of the promoter (Fig. 9). The denaturation process would be initiated by the interaction of carbonyl and amino groups from the T·A base pair at -7 with appropriate side chains on RNA polymerase. Thus groups on RNA polymerase would shift the hydrogen bonding potential of functional groups on the bases away from the base paired state. If this RNA polymerase-induced local melting were accompanied by a conformational change of the DNA (40) or protein (41), then perhaps even the one hydrogen bond remaining after tautomerization would not form or reformation of the keto-amino forms of the base pairs would be inhibited. This process could not occur if T·A were replaced by transitions such as C·G or C·I, where the orientation of the amino and carbonyl groups is inverted relative to RNA polymerase. The mechanism as shown in Fig. 9 is supported by the pKa of adenosine N1 (4.8), the ability of carboxylic acids with pKa similar to 4 (Asp and Glu) to catalyze enolization of carbonyl groups, the ease of the adenine exocyclic amino groups to form additional hydrogen bonds when part of B form DNA (42), and observations that DNA duplexes can be destabilized by acidic conditions (43). A series of such catalytic events in the -10 region (usually quite rich in A·T base pairs) could therefore be responsible for the DNA melting process.

These results identify for the first time certain DNA base pair functional groups that appear to be involved in transcription of promoters by E. coli RNA polymerase. Moreover, the results further extend previous conclusions (28, 31, 32, 44) that thymine 5-methyl groups are important contact sites between DNA and proteins. Thus the thymine 5-methyl may be a major distinguishing functional group which allows proteins to read DNA sequences. The methyl group is, after all, the only functional group that uniquely defines a base pair by one contact within a protein-DNA complex. Recognition of base pairs or individual bases through other functional groups requires two contacts (45).

Acknowledgments-We thank Jeri Beltman for excellent technical assistance and David Auble for critically reading the manuscript.

REFERENCES

1. Hawley, D. K., and McClure, W. R. (1983) Nucleic Acids Res. 11, 2237-2255
2. Rosenberg, M., and Court, D. (1979) Annu. Rev. Genet. 13, 319-353
3. Siebenlist, U., Simpson, R. B., and Gilbert, W. (1980) Cell 20, 269-281
4. von Hippel, P. H., Bear, D. G., Winter, R. B., and Berg, O. G. (1982) in Promoters: Structure and Function (Rodriguez, R., and Chamberlin, M., eds) pp. 3-33, Praeger Scientific, Praeger Publishers, New York
5. Youderian, P., Bouvier, S., and Susskind, M. M. (1982) Cell 30, 843-853
6. deHaseth, P. L., Goldman, R. A., Cech, C. L., and Caruthers, M. H. (1983) Nucleic Acids Res. 11, 773-787
7. Caruthers, M. H., Beaucage, S. L., Efcavitch, J. W., Fisher, E. F., Goldman, R. A., deHaseth, P. L., Mandecki, W., Matteucci, M. D., Rosendahl, M. S., and Stabinsky, Y. (1983) Cold Spring Harbor Symp. Quant. Biol. 47, 411-418
8. Burgess, R. R., and Jendrisak, J. J. (1975) Biochemistry 14, 4634-4638
9. Chamberlin, M. J., Nierman, W. C., Wiggs, J., and Neff, N. (1979) J. Biol. Chem. 254, 10061-10069
10. Caruthers, M. H. (1982) in Chemical and Enzymatic Synthesis of Gene Fragments, A Laboratory Manual (Gassen, H. G., and Lang, A., eds) pp. 71-79, Verlag-Chemie, Weinheim, Federal Republic of Germany
11. Barone, A. D., Tang, J.-Y., and Caruthers, M. H. (1984) Nucleic Acids Res. 12, 4051-4061
12. Caruthers, M. H., McBride, L. J., Bracco, L. P., and Dubendorff, J. W. (1985) Nucleosides & Nucleotides 4, 95-105
13. Matteucci, M. D., and Caruthers, M. H. (1981) J. Am. Chem. Soc. 103, 3185-3191
14. Bambara, R., Jay, E., and Wu, R. (1974) Nucleic Acids Res. 1, 1503-1520
15. Tu, C., Jay, E., Bahl, C., and Wu, R. (1976) Anal. Biochem. 74, 73-93
16. Yansura, D. G., Goeddel, D. V., and Caruthers, M. H. (1977) Biochemistry 16, 1772-1780
17. Hawley, D. K., and McClure, W. R. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 6381-6385
18. McClure, W. R. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 5634-5638
19. McClure, W. R., Cech, C. L., and Johnston, D. E. (1978) J. Biol. Chem. 253, 8941-8948
20. Melancon, P., Burgess, R. R., and Record, M. T., Jr. (1983) Biochemistry 22, 5169-5176
21. Siebenlist, U. (1979) Nature 279, 651-652
22. Post, L. E., Arfsten, A. E., Reusser, F., and Nomura, M. (1978) Cell 16, 215-229
23. Berman, M. L., and Landy, A. (1979) Proc. Natl. Acad. Sci. U. S. A. 76, 4303-4307
24. Rosenberg, M., Chepelinsky, A. B., and McKenney, K. (1983) Science 222, 734-739
25. Stefano, J. E., and Gralla, J. D. (1980) J. Biol. Chem. 255, 10423-10430
26. Maxam, A., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 560-564
27. Hawley, D. K., Malan, T. P., Mulligan, M. E., and McClure, W. R. (1982) in Promoters: Structure and Function (Rodriguez, R., and Chamberlin, M., eds) pp. 54-68, Praeger Scientific, Praeger Publishers, New York
28. Goeddel, D. V., Yansura, D. G., and Caruthers, M. H. (1977) Nucleic Acids Res. 4, 3039-3054
29. Chamberlin, M. J., Rosenberg, S., and Kadesch, T. (1982) in Promoters: Structure and Function (Rodriguez, R., and Chamberlin, M., eds) pp. 34-53, Praeger Scientific, Praeger Publishers, New York
30. Lu, P., Cheung, S., and Arndt, K. (1983) J. Biomol. Structure Dynam. 1, 509-521
31. Fisher, E. F., and Caruthers, M. H. (1979) Nucleic Acids Res. 13, 401-416
32. Caruthers, M. H. (1980) Acct. Chem. Res. 13, 155-160
33. Delort, A.-M., Neumann, J. M., Molko, D., Hervé, M., Téoule, R., and Dinh, S. T. (1985) Nucleic Acids Res. 13, 3343-3356
34. Simpson, R. B. (1982) in Promoters: Structure and Function (Rodriguez, R., and Chamberlin, M., eds) pp. 164-180, Praeger Scientific, Praeger Publishers, New York
35. Simpson, R. B. (1979) Cell 18, 277-285
36. McClure, W. R. (1985) Annu. Rev. Biochem. 54, 171-204
37. von Hippel, P. H., Bear, D. G., Morgan, W. D., and McSwiggen, J. A. (1984) Annu. Rev. Biochem. 53, 389-446
38. von Hippel, P. H. (1979) in Biological Regulation and Development (Goldberger, R. F., ed) pp. 279-347, Plenum, New York
39. Calladine, C. R. (1982) J. Mol. Biol. 161, 343-352
40. Buc, H., and McClure, W. R. (1985) Biochemistry 24, 2712-2723
41. Roe, J.-H., Burgess, R. R., and Record, M. T., Jr. (1985) J. Mol. Biol. 184, 441-453
42. Ts'o, P. O. P. (1974) in Basic Principles of Nucleic Acid Chemistry (Ts'o, P. O. P., ed) Vol. 1, pp. 453-577, Academic Press, Orlando, FL
43. Bloomfield, V. A., Crothers, D. M., and Tinoco, I., Jr. (1974) Physical Chemistry of Nucleic Acids, pp. 334-336, Harper and Row, New York
44. Yolov, A. A., Vinogradova, M. N., Gromova, E. S., Rosenthal, A., Cech, D., Veiko, V. P., Metelev, V. G., Kosykh, V. G., Buryanov, Ya. I., Bayev, A. A., and Shabarova, Z. A. (1985) Nucleic Acids Res. 13, 8983-8998
45. Seeman, N. C., Rosenberg, J. M., and Rich, A. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 966-970
