The EMBO Journal vol.9 no.10 pp.3085-3092, 1990

Protein-DNA contacts in the structure of a homeodomain-DNA complex determined by nuclear magnetic resonance spectroscopy in solution

Gottfried Otting, Yan Qiu Qian, Martin Billeter, Martin Müller¹, Markus Affolter¹, Walter J.Gehring¹ and Kurt Wüthrich

Institut für Molekularbiologie und Biophysik, ETH-Hönggerberg, CH-8093 Zürich, Switzerland and ¹Biozentrum der Universität Basel, Abt. Zellbiologie, Klingelbergstr. 70, CH-4056 Basel, Switzerland

Communicated by K.Wüthrich

The 1:1 complex of the mutant Antp(C39-S) homeodomain with a 14 bp DNA fragment corresponding to the BS2 binding site was studied by nuclear magnetic resonance (NMR) spectroscopy in aqueous solution. The complex has a molecular weight of 17 800 and its lifetime is long compared with the NMR chemical shift time scale. Investigations of the three-dimensional structure were based on the use of the fully 15N-labelled protein, two-dimensional homonuclear proton NOESY with 15N(ω2) half-filter, and heteronuclear three-dimensional NMR experiments. Based on nearly complete sequence-specific resonance assignments, both the protein and the DNA were found to have similar conformations in the free form and in the complex. A sufficient number of intermolecular 1H-1H Overhauser effects (NOEs) could be identified to enable a unique docking of the protein on the DNA, which was achieved with the use of an ellipsoid algorithm. In the complex there are intermolecular NOEs between the elongated second helix in the helix-turn-helix motif of the homeodomain and the major groove of the DNA. Additional NOE contacts with the DNA involve the polypeptide loop immediately preceding the helix-turn-helix segment, and Arg5. This latter contact is of special interest, both because Arg5 reaches into the minor groove and because in the free Antp(C39-S) homeodomain no defined spatial structure could be found for the apparently flexible N-terminal segment comprising residues 0-6.

Key words: Antennapedia homeodomain/DNA binding/gene expression/homeodomain-DNA complex/three-dimensional structure/two-dimensional NMR

© Oxford University Press

Introduction

The homeobox is a 180 bp DNA sequence that encodes a 60 amino acid protein domain, the homeodomain. The homeobox was first found in the Antennapedia (Antp) and the fushi tarazu (ftz) genes, which are involved in the determination of segmental identity and segment number, respectively, in Drosophila (McGinnis et al., 1984a,b; Scott and Weiner, 1984). Subsequently, the homeobox was found in many other Drosophila developmental regulatory genes (Gehring, 1987; Scott et al., 1989). The homeobox has also been isolated from vertebrate genomes, which indicates extensive evolutionary conservation, and possible functional conservation (McGinnis et al., 1984b). Evidence is now accumulating that homeodomain proteins act as transcription factors in which the homeodomain is involved in sequence-specific recognition of DNA (for reviews, see Levine and Hoey, 1988; Scott et al., 1989; Affolter et al., 1990b). Amino acid sequence comparison of the homeodomain with prokaryotic DNA binding proteins indicated that the homeodomain might contain a helix-turn-helix motif (Laughon and Scott, 1984; Shepherd et al., 1984). The determination of the three-dimensional structure of the Antp homeodomain by nuclear magnetic resonance (NMR) spectroscopy indeed demonstrated the existence of the postulated helix-turn-helix motif (Otting et al., 1988; Qian et al., 1989; Billeter et al., 1990). However, recent results obtained by site-directed mutagenesis imply that the amino acid residues of the helix-turn-helix motif directly involved in DNA binding differ from those that one might have anticipated by analogy to prokaryotic repressor proteins, for example the 434 repressor. In particular, a single amino acid substitution of the ninth residue of helix III seems to be sufficient to switch the DNA binding specificities of different homeodomains (Hanes and Brendt, 1989; Treisman et al., 1989).

In a project aimed at the elucidation of the molecular basis of DNA recognition by homeodomains we previously determined the structure of the Antp homeodomain in aqueous solution using NMR (Qian et al., 1989; Billeter et al., 1990). The protein fragment used contained 68 amino acid residues, which correspond to residues 297-363 of the Antp protein and an additional N-terminal methionine introduced by the overexpression system used (Müller et al., 1988). This study has now been extended to the complex of the Antp(C39-S) homeodomain with a 14 bp DNA fragment. The DNA fragment used corresponds to the BS2 site, which had been identified as a specific binding site by DNA footprinting and gel retardation assays (Müller et al., 1988). Residue 39 of the Antp homeodomain was changed from cysteine to serine to prevent oxidative dimerization of the protein (Müller et al., 1988). DNA binding studies showed that the mutant protein has the same DNA binding affinity as the wild type polypeptide. It binds to the BS2 site as a monomer with a binding constant of 10^9/M, and the resulting complex has a half-life of ~90 min (Affolter et al., 1990a). Furthermore, a full structure determination of the Antp(C39-S) homeodomain by NMR confirmed that the mutation does not significantly affect the protein conformation (Güntert et al., 1990).
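The binding constant (10^9/M) and the half-life (~90 min) quoted for the complex with the BS2 site can be turned into rate constants with a few lines of arithmetic. The sketch below is illustrative only: it assumes simple two-state binding (K_a = k_on/k_off) and first-order decay of the complex, and it assumes that both quoted values refer to comparable conditions, which the text does not state. The function names, the 50 Hz shift difference and the safety margin are hypothetical choices made for this example.

```python
import math

# Values quoted in the text for the Antp(C39-S)-BS2 complex.
ASSOCIATION_CONSTANT = 1e9   # binding constant, 1/M
HALF_LIFE_S = 90 * 60        # half-life of the complex, in seconds


def dissociation_constant(k_a):
    """K_d = 1 / K_a, in M."""
    return 1.0 / k_a


def off_rate(half_life_s):
    """k_off = ln 2 / t_half for first-order decay of the complex, in 1/s."""
    return math.log(2.0) / half_life_s


def on_rate(k_a, k_off):
    """k_on = K_a * k_off, in 1/(M s); assumes simple two-state binding."""
    return k_a * k_off


def is_slow_exchange(k_off, shift_difference_hz, margin=10.0):
    """True if k_off is at least `margin` times smaller than 2*pi*|delta nu|.

    The margin is an arbitrary illustrative threshold, not a value from the paper.
    """
    return k_off * margin < 2.0 * math.pi * abs(shift_difference_hz)


if __name__ == "__main__":
    k_off = off_rate(HALF_LIFE_S)
    print(f"K_d   = {dissociation_constant(ASSOCIATION_CONSTANT):.1e} M")
    print(f"k_off = {k_off:.2e} 1/s")
    print(f"k_on  = {on_rate(ASSOCIATION_CONSTANT, k_off):.2e} 1/(M s)")
    # 50 Hz is a hypothetical chemical shift difference between free and bound states.
    print("slow exchange:", is_slow_exchange(k_off, 50.0))
```

Under these assumptions the dissociation rate is of the order of 10^-4 per second, many orders of magnitude below any resolvable chemical shift difference, which is consistent with the statement that the lifetime of the complex is long on the chemical shift time scale.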

Results and discussion

Selection of the DNA fragment used to prepare the complex

A systematic search was carried out to determine the shortest possible DNA fragment which shows a binding constant comparable to the 26-mer used in earlier binding studies, i.e. d(AGCTGAGAAAAAGCCATTAGAGAAGC) (Müller et al., 1988). G·C base pairs were placed at both ends of the DNA fragment for improved stability of the two ends of the duplex. The sequences d(GAAAAAGCCATTAGAG) (16-mer), d(GAGAAAGCCATTAGAG) (16-mer2), d(GGAAAGCCATTAGAG) (15-mer), d(GAAAAAGCCATTAGG) (15-mer2), d(GAAAGCCATTAGAG) (14-mer) and d(GAAGCCATTAGAG) (13-mer) were synthesized, and after combination with their complementary strands the binding constants with the Antp(C39-S) homeodomain were determined. The substitution of A3 by G3 between 16-mer and 16-mer2, and the subsequent deletions of A2 between 16-mer2 and 15-mer and of G1 between 15-mer and 14-mer did not significantly alter the binding constant with respect to that of the 26-mer. In contrast, any further deletion at the 5' end, for example between 14-mer and 13-mer, or at the 3' end, for example between 16-mer and 15-mer2, led to reduced binding affinities. Therefore we decided to use the 14-mer for the present study.

Preparation of a 1:1 complex between Antp(C39-S) and the DNA 14-mer

A 0.5 mM aqueous solution (90% H2O/10% D2O, buffer: 25 mM sodium phosphate, 100 mM KCl, 2 mM NaN3, pH 6.0) of the DNA 14-mer was titrated at 20°C with uniformly 15N-labelled Antp(C39-S) homeodomain. Figure 1 shows the imino proton resonances of the DNA at different ratios of protein:DNA. For titration ratios between 0.0 and 1.0 the imino proton spectrum is a superposition of those of the free DNA (titration ratio 0.0) and the 1:1 complex (titration ratio 1.0), indicating that the exchange between free and bound DNA is slow on the chemical shift time scale. This is in accordance with the half-life of the complex as measured by gel mobility shift (Affolter et al., 1990a). At titration ratios > 1.0 a precipitate formed, which could be redissolved by further addition of 14-mer. The imino proton spectrum observed for the 1:1 complex did not change upon the addition of an excess of the protein, except that the amplitude decreased with increasing amounts of protein added. A plausible explanation for these observations would be that the precipitate formed was an insoluble complex with two molecules of Antp(C39-S) bound to one 14-mer.

Fig. 1. Imino proton region from 12.0-14.3 p.p.m. of the one-dimensional 1H NMR spectra obtained upon stepwise addition of Antp(C39-S) homeodomain to a solution of the DNA 14-mer in 90% H2O/10% D2O, pH 6.0, 20°C. The molar ratios of protein:DNA are indicated on the right. The assignments of the chemical shifts for the free and complexed DNA are given below and above the corresponding imino proton spectra, respectively. Bars indicate groups of resonances which have not been assigned individually at 20°C.

A solution of the 1:1 complex in the aforementioned solvent medium showed broad line-widths at concentrations > 1 mM. In contrast, only little line-broadening relative to the 0.5 mM solution used to record the spectra of Figure 1 was observed for a 3.5 mM solution of the 1:1 complex prepared without the addition of KCl and phosphate buffer. Comparison of [15N,1H]-COSY spectra recorded, respectively, with the 3.5 mM salt-free protein solution and with the 0.5 mM buffered protein solution under otherwise identical conditions did not reveal any significant changes in chemical shifts, indicating that the conformation of the complex is insensitive to the presence of the salt and the buffer used in the more dilute protein solution. Circular dichroism (CD) measurements with a 0.1 mM solution of the Antp(C39-S)-14-mer complex at pH 6.0 showed no evidence for denaturation of the complex in the temperature range 18-50°C.

1H NMR assignments for Antp(C39-S) and the DNA 14-mer in the 1:1 complex

Sequence-specific 1H NMR assignments were previously obtained for both the free Antp(C39-S) homeodomain (Güntert et al., 1990) and the free DNA 14-mer. However, the spectral changes observed upon complex formation were so extensive that new assignments had to be worked out for both the protein and the DNA in the complex.
This was achieved using [1H,1H]-NOESY with 15N(ω2)-half-filter (Otting and Wüthrich, 1990) and three-dimensional 15N-correlated [1H,1H]-NOESY (Messerle et al., 1989). To avoid transfer of saturation from the water to the solute molecules, the experimental schemes for both experiments were designed for measurements in H2O solution without water saturation by preirradiation (Otting and Wüthrich, 1989; Messerle et al., 1989).

Figure 2A and B shows the sum spectrum and the difference spectrum, respectively, of a [1H,1H]-NOESY experiment with 15N(ω2)-half-filter (see Materials and methods for experimental details). The 15N(ω2)-half-filter discriminates between protons bound directly to 15N and all other protons, where this discrimination is applied only along the ω2 frequency axis. Thus, the sum spectrum (Figure 2A) contains only peaks with ω2 frequencies of protons not bound to 15N, while the difference spectrum (Figure 2B) contains only peaks which correlate with the resonances of 15N-bound protons along the ω2 frequency axis. Both spectra contain the peaks correlating with all proton resonances along ω1. Since the protein is uniformly enriched with 15N to the extent of >95%, the difference spectrum (Figure 2B) contains the diagonal peaks and cross peaks with the amide protons of the protein, while the sum spectrum contains the diagonal peaks and cross peaks with all DNA resonances and with those protons of the protein which are not bound to 15N. In the spectral region shown in Figure 2 these are the protons of the aromatic side chains and some hydroxyl protons of amino acid side chains. Thus the 15N(ω2)-half-filter technique effectively separates the bulk of the protein spectrum from the DNA spectrum. A conventional [1H,1H]-NOESY spectrum would contain the peaks from both subspectra shown in Figure 2A and B.

The improvement in resolution gained with the use of the 15N(ω2)-half-filter was sufficient to obtain sequence-specific assignments for the DNA 14-mer from the sum spectrum (Figure 2A), and to identify a large number of sequential connectivities in the protein (Wüthrich, 1986) in the difference spectrum (Figure 2B). However, most of the protein assignments could only be ascertained with the use of a three-dimensional (3D) 15N-correlated [1H,1H]-NOESY experiment. The frequency axes of this 3D NMR spectrum are 1H in ω1, 15N in ω2 and 1H in ω3. The spectrum was analysed in plots of the ω1-ω3 cross sections. Figure 2C shows the 22nd out of a total of 64 planes obtained, which are separated by the 15N chemical shifts corresponding to the 15N-bound protons observed in the ω3 dimension. It illustrates the dramatic improvement in resolution gained from the development of the spectrum in an additional dimension. (The difference spectrum of Figure 2B corresponds to the projection of the peaks in all 64 planes of the 3D NMR spectrum onto a single ω1(1H)-ω3(1H) frequency plane (Otting and Wüthrich, 1990).)

For the presently studied protein-DNA complex the usual strategy of sequential resonance assignments of proteins, which involves spin-system identifications by scalar [1H,1H] couplings prior to the use of NOESY for delineation of sequential connectivities (Wüthrich, 1986), could not be applied because the COSY and TOCSY spectra were of poor quality. Therefore a strategy relying exclusively on NOE data was adopted, which started from the assumption that the conformations of both the protein and the DNA are similar in the 1:1 complex and in the free compounds. The patterns of intramolecular NOE connectivities would then be expected to be conserved, even though the chemical shifts may be sizeably different. The assignments achieved on this basis turned out to be self-consistent, and eventually confirmed the starting assumptions made. A few otherwise unassigned intramolecular NOEs could finally be identified as signals from the slowly exchanging hydroxyl protons of Tyr8, Thr9 and Thr41, which were not observed in the free protein due to rapid exchange with H2O.

In summary the sequence-specific resonance assignments obtained for the Antp(C39-S) homeodomain and the DNA 14-mer in the complex are almost complete. All proton resonances of the polypeptide backbone were assigned, with the exception of Met0 and Arg1, where the amide protons exchange too rapidly with the solvent to be observable. β-Proton and γ-proton resonances were assigned for 60 and 40 residues, respectively, and for about half of the 68 amino acid residues the resonance assignments include all non-exchangeable side chain protons. For the 14-mer all non-exchangeable base protons and all 1' sugar protons were assigned. With two exceptions all 2'H, 2''H and 3'H resonances were also assigned. Furthermore, intramolecular

Fig. 2. Spectral region (ω1 = -1.0-10.5 p.p.m., ω2 = 5.0-10.5 p.p.m.) of 15N-edited [1H,1H]-NOESY spectra of a 3.5 mM solution of the 1:1 complex formed by uniformly 15N-labelled Antp(C39-S) homeodomain and DNA 14-mer at 36°C. (A) Sum spectrum and (B) difference spectrum of [1H,1H]-NOESY with 15N(ω2)-half-filter, pH 6.8, mixing time 110 ms. (C) ω1-ω3 cross plane taken at ω2 = 123.8 p.p.m. through the three-dimensional 15N-correlated [1H,1H]-NOESY spectrum, pH 6.0, mixing time 40 ms. The ω1 dimension is shown only up to 0.0 p.p.m.
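The separation into a sum and a difference spectrum can be illustrated with a toy model. The sketch below is not the pulse sequence of Otting and Wüthrich (1990). It only assumes the basic half-filter idea that two data sets are recorded in which signals of 15N-bound protons appear with opposite sign while all other signals keep their sign, so that adding and subtracting the two data sets sorts the peaks. The `Peak` class, the function names and the example intensities are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    w1_ppm: float      # 1H chemical shift along omega1
    w2_ppm: float      # 1H chemical shift along omega2
    intensity: float
    n15_bound: bool    # is the proton detected along omega2 bound to 15N?


def record_scans(peaks, enrichment=1.0):
    """Simulate the two data sets of a half-filter experiment.

    In scan A all peaks are positive. In scan B the fraction of each
    15N-bound proton signal that actually carries 15N (the enrichment)
    is inverted; everything else is unchanged.
    """
    scan_a, scan_b = [], []
    for peak in peaks:
        labelled = enrichment if peak.n15_bound else 0.0
        scan_a.append(peak.intensity)
        scan_b.append(peak.intensity * (1.0 - 2.0 * labelled))
    return scan_a, scan_b


def half_filter(peaks, enrichment=1.0):
    """Return (sum_spectrum, difference_spectrum), one intensity per peak."""
    scan_a, scan_b = record_scans(peaks, enrichment)
    sum_spectrum = [(a + b) / 2.0 for a, b in zip(scan_a, scan_b)]
    difference_spectrum = [(a - b) / 2.0 for a, b in zip(scan_a, scan_b)]
    return sum_spectrum, difference_spectrum


if __name__ == "__main__":
    # Hypothetical peaks: one amide proton (15N-bound) and one DNA base proton.
    peaks = [Peak(7.2, 8.3, 1.0, True), Peak(1.9, 7.1, 2.0, False)]
    print(half_filter(peaks))                   # ideal, 100% labelling
    print(half_filter(peaks, enrichment=0.95))  # residual 14N leaks into the sum
```

In this model an enrichment below 100% leaves a small fraction of each amide signal in the sum spectrum, which is why the degree of 15N enrichment matters for a clean separation of protein and DNA subspectra.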

Figure 3 (schematic not reproduced). The 14-mer duplex, 5' GAAAGCCATTAGAG 3' (strand α) paired with 3' CTTTCGGTAATCTC 5' (strand β), with base pairs numbered 1-14, is drawn in the centre. Residues 5-60 of Antp(C39-S) are arranged clockwise around it, with helices I-IV, the loop and the turn marked.

Fig. 3. Survey of the experimental data on the 1:1 complex between the Antp(C39-S) homeodomain and the DNA 14-mer. The sequence of the 14-mer is given in the centre with the numeration used for the base pairs. The two strands are arbitrarily denoted α and β. The Antp(C39-S) amino acid sequence is arranged clockwise around the 14-mer. The terminal residues 0-4 and 61-67 have been omitted, since no reliable experimental data were obtained for these polypeptide segments in the free protein, and thus no comparisons with the complex were possible. The secondary structure of the protein observed in the free form and in the complex is indicated alongside the amino acid sequence. Bold letters identify those amino acid residues or nucleotides for which changes in chemical shifts larger than or equal to |0.2| p.p.m. were observed upon complex formation. Squares alongside the polypeptide sequence identify the residues with slow amide proton exchange in the complex, where ? indicates that a spectral artifact prevented the measurement of the NH exchange for Ser39. Finally, the arrows identify intermolecular contacts evidenced by NOEs between the protein and the DNA (Table I).
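The duplex of Figure 3 and the fragment series from which the 14-mer was selected can be checked mechanically. The sketch below uses only the sequences given in the text. The helper names are hypothetical, and the code verifies nothing beyond string relationships: the β strand as the Watson-Crick complement of the α strand, and the substitution and deletions stated for the 16-mer, 16-mer2, 15-mer and 14-mer.

```python
# Oligonucleotide sequences as given in the text (written 5' to 3').
FRAGMENTS = {
    "26-mer": "AGCTGAGAAAAAGCCATTAGAGAAGC",
    "16-mer": "GAAAAAGCCATTAGAG",
    "16-mer2": "GAGAAAGCCATTAGAG",
    "15-mer": "GGAAAGCCATTAGAG",
    "15-mer2": "GAAAAAGCCATTAGG",
    "14-mer": "GAAAGCCATTAGAG",
    "13-mer": "GAAGCCATTAGAG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, read in the same direction as `seq`."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq):
    """Complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


def substitute_base(seq, position, base):
    """Replace the base at a 1-based position."""
    return seq[:position - 1] + base + seq[position:]


def delete_base(seq, position):
    """Remove the base at a 1-based position."""
    return seq[:position - 1] + seq[position:]


def base_pairs(alpha):
    """List of (number, alpha base, beta base) for the duplex, numbered from 1."""
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(alpha, complement(alpha)))]


if __name__ == "__main__":
    alpha = FRAGMENTS["14-mer"]
    print("alpha 5'", alpha, "3'")
    print("beta  3'", complement(alpha), "5'")
    # Relationships stated in the text for the fragment series.
    print(FRAGMENTS["16-mer"] in FRAGMENTS["26-mer"])
    print(substitute_base(FRAGMENTS["16-mer"], 3, "G") == FRAGMENTS["16-mer2"])
    print(delete_base(FRAGMENTS["16-mer2"], 2) == FRAGMENTS["15-mer"])
    print(delete_base(FRAGMENTS["15-mer"], 1) == FRAGMENTS["14-mer"])
```

Working with 1-based positions keeps the helpers aligned with the base numbering used in the text (A3, A2, G1) and in Figure 3.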

NOEs with the group of resonances involving the 4'H, 5'H and 5''H protons were assigned for most of the nucleotides, without attempting to assign these protons individually. Similar ambiguities also arose for some of the side chain resonances of the protein, where the lack of scalar coupling connectivities prevented the unambiguous distinction between resonances with similar chemical shifts, e.g. C"H2 and CδH2 of lysyl side chains.

Conformations of the Antp(C39-S) homeodomain and the DNA 14-mer in the 1:1 complex

on the essentially complete sequence-specific resonance assignments for the Antp(C39-S) homeodomain

polypeptide segments with non-regular secondary structure, respectively. For the latter this information came from measurements of the vicinal coupling constants 3JHNα obtained by recording a series of J-modulated [15N,1H]-COSY experiments (Neri et al., 1990). In the free protein these data were collected at pH 4.3 and 20°C, and in the complex at pH 6.0 and 36°C. In the complex the 3JHNα coupling constants could be measured with a precision of 1 Hz for the residues 3, 4, 6-9, 12, 23, 24, 26, 28, 39, 41, 42, 60-64 and 66. In spite of the somewhat different solvent conditions used for the two measurements the 3JHNα values did not indicate any significant conformational change. Even for the three residues where a difference between the free and complexed states of the protein could be detected within the accuracy of the measurements, i.e. Thr7, Tyr8 and Asn23, the variations were small, i.e. the corresponding coupling constants were ~1 Hz smaller in the complex. For most residues in the helices I, II and III the 3JHNα values were found to be