
THE JOURNAL OF BIOLOGICAL CHEMISTRY Vol. 256, No. 4, Issue of February 25, pp. 1754-1762, 1981. Printed in U.S.A.

Primary Structure of the Bacteriophage T4 DNA Helix-destabilizing Protein* (Received for publication, July 7, 1980)

Kenneth R. Williams‡, Mary B. LoPresti, and Masayuki Setoguchi§ From the Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06510

The amino acid sequence of the single-stranded DNA-binding protein encoded by gene 32 of bacteriophage T4 has been determined by manual and automated sequencing of peptides derived from cyanogen bromide cleavage and digestion with trypsin, chymotrypsin, and staphylococcal protease. Tryptic digestion of citraconylated or succinylated gene 32 protein yields five peptides containing 4, 27, 42, 66, and 163 residues, respectively, which can be separated by Sephadex chromatography. Each of these tryptic peptides was subjected to automated sequencing and, if necessary, more extensive cleavage. The gene 32 protein contains 301 amino acids and has a molecular weight of 33,487. Based on its primary structure, the gene 32 protein is predicted to contain 36% α helix, 18% β sheet, and 46% random coil. The native protein can be specifically cleaved at lysine 21 and 253 by limited trypsin digestion. Previous studies have shown that the “B” region (residues 1 to 21) is essential for cooperative binding to single-stranded DNA. The “A” region (residues 254 to 301) has been implicated in controlling the helix-destabilizing “activity” of gene 32 protein and in interacting with other T4 DNA replication proteins. The “A” region has a net charge of -10 and, in addition, contains two unusual stretches of 4 serine residues separated by glycine 284. The region between positions 72 and 116 contains 6 of the 8 tyrosine residues in the protein and may be important for DNA binding.

The protein encoded by gene 32 of bacteriophage T4 (32P)¹ has served as a prototype for a class of proteins which bind preferentially to ssDNA and, therefore, destabilize dsDNA (1). These helix-destabilizing proteins have been isolated from a wide variety of sources, including both prokaryotic and eukaryotic cells (2). In some instances, specific helix-destabilizing proteins are induced in cells by bacterial and animal viruses and by transformation (2). Because the gene 32 protein has been well characterized genetically (3-6) and has been shown to be involved in T4 DNA replication (7), repair (8), and recombination (9), this particular ssDNA-binding protein is an excellent choice for the study of the role of helix-destabilizing proteins in a wide variety of reactions involved in DNA metabolism in vivo. In addition, the availability of mutants in bacteriophage T4 that overproduce the gene 32 protein further encourages detailed structural and functional studies on this particular ssDNA-binding protein. As a first step in interpreting structure-function relationships, we have determined the amino acid sequence of the T4 DNA helix-destabilizing protein.² This information will allow more complete interpretation of the physicochemical and functional properties of the gene 32 protein which are currently being investigated.

With the exception of the gene 5 protein of bacteriophage fd, which is the most fully characterized of the ssDNA-binding proteins (see Ref. 2 for a review), there is almost no structural information available on any of the other helix-destabilizing proteins.

MATERIALS AND METHODS

Most of the materials and methods used in sequencing the T4 gene 32 protein have been described previously (10). Additional information may be found in the miniprint supplement to this paper.³

RESULTS

The complete amino acid sequence of the T4 gene 32 protein is shown in Fig. 1. As much of the primary structure as possible was determined by automated sequencing of 32P peptides resulting from partial proteolysis, cleavage at methionine, or tryptic digestion of succinylated 32P. The additional peptides needed to complete the sequence were obtained by enzymatic digestion of peptides derived from tryptic digestion of citraconylated 32P. Only those peptides that were actually sequenced are indicated in Fig. 1; numerous additional peptides were obtained which were not sequenced because they originated from regions of known primary structure. In every instance, the amino acid composition and NH2-terminal amino acid of these additional peptides (see miniprint supplement) were in agreement with the proposed sequence for 32P. The procedures involved in determining the primary structure and the sequence data obtained are outlined below and are described in greater detail in the supplement.

* This work was supported in part by United States Public Health Service National Institutes of Health Grant GM12607 and by a Swebilius Cancer Research award. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed at Yale University, Department of Molecular Biophysics and Biochemistry, P. O. Box 3333, New Haven, Connecticut 06510.
§ Current address, 43 Shimonishi Yama-CHO, Nagasaki City, Japan, 850.
¹ The abbreviations used are: 32P, the DNA helix-destabilizing protein encoded by gene 32 in bacteriophage T4; 32P*-A, tryptic cleavage product of 32P which lacks the COOH-terminal “A” region (residues 254 to 301); 32P*-(A + B), tryptic cleavage product of 32P which lacks the NH2-terminal “B” region (residues 1 to 21) and the COOH-terminal “A” region; ssDNA, single-stranded DNA; dsDNA, double-stranded DNA.

² A preliminary account of this work may be found in Ref. 10.
³ Portions of this paper (including “Materials and Methods,” “Results,” Tables III to XIX, and Figs. 4 to 27) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are available from the Journal of Biological Chemistry, 9650 Rockville Pike, Bethesda, Md. 20014. Request Document No. 80M-1385, cite author(s), and include a check or money order for $16.00 per set of photocopies. Full size photocopies are also included in the microfiche edition of the Journal that is available from Waverly Press.
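The five tryptic peptides named in the abstract can be related to the residue ranges reported under Results (1 to 4, 5 to 46, 47 to 111, 112 to 138, and 139 to 301). The following is a minimal sketch, assuming that each fragment ends at an arginine, so the cleavage positions 4, 46, 111, and 138 are inferred from the reported ranges (only arginine 46 is named explicitly in the text). The function name `tryptic_fragments` is hypothetical and introduced only for illustration.

```python
# Cleavage positions inferred from the fragment ranges reported in Results.
# Assumption: every fragment boundary is a cleavage after an arginine residue.
PROTEIN_LENGTH = 301
ARG_POSITIONS = [4, 46, 111, 138]


def tryptic_fragments(length, cleavage_sites):
    """Return (start, end) residue ranges produced by cleaving after each site."""
    fragments = []
    start = 1
    for site in sorted(cleavage_sites):
        fragments.append((start, site))
        start = site + 1
    fragments.append((start, length))
    return fragments


names = ["A-1", "A-2", "A-3", "A-4", "A-5"]
fragments = tryptic_fragments(PROTEIN_LENGTH, ARG_POSITIONS)
fragment_lengths = {
    name: end - start + 1 for name, (start, end) in zip(names, fragments)
}

for name, (start, end) in zip(names, fragments):
    print(f"{name}: residues {start}-{end} ({fragment_lengths[name]} residues)")
```

With these ranges the fragment lengths add up to the 301 residues of the protein. Note that the range 47 to 111 corresponds to 65 residues, one fewer than the 66 quoted in the abstract.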

FIG. 1. The complete amino acid sequence of the bacteriophage T4 gene 32 protein. The single headed arrow indicates the last residue sequenced in each peptide which, because some of the peptides were only partially sequenced, is not necessarily the COOH terminus of the peptide. Cleavage methods used are: CnBr, cyanogen bromide; 32P*-(A + B), partial proteolysis; A, tryptic digestion of succinylated or citraconylated 32P; T, trypsin; C, chymotrypsin; and SP, staphylococcal protease. The “B” region (residues 1 to 21) and the “A” region (residues 254 to 301) are italicized.

Automated Sequence Analysis of 32P and 32P*-(A + B). Automated sequencing of 75 nmol of carboxamidomethylated 32P established the sequence of the first 20 amino acids. These results confirmed our earlier findings (11) with the exception that residue 6 was determined to be serine rather than alanine. Since our earlier results indicated that 32P*-(A + B) begins at residue 22, automated sequencing was also carried out on 180 nmol of carboxamidomethylated 32P*-(A + B). As indicated in Fig. 1, useful data were obtained from the first 40 cycles, thus extending the known sequence to residue 61.

Tryptic Cleavage of Succinylated 32P. Amino acid analysis of 32P showed the presence of only 4 arginine residues (Table I). Since there are 2 arginine residues in the first 60 amino acids, this meant that the approximately 240 residues remaining contained only 2 additional arginines. In order to obtain additional fragments for automated sequencing, 1.4 μmol of carboxamidomethylated 32P was succinylated and then cleaved at arginine by trypsin (10). After separation of the resulting peptides on Sephadex G-100, four fractions were obtained, A-1, A-(2, 3), A-4, and A-5 (10), which contained residues 1 to 4, 5 to 46 and 47 to 111, 112 to 138, and 139 to 301, respectively. Although cleavage occurred at arginine 46, the two peptides, A-2 and A-3, failed to separate upon further Sephadex or DEAE-Sephadex chromatography. Therefore, the approximately equimolar mixture of the two peptides was subjected to automated sequencing without further purification. In addition to confirming the sequence from residues 5 to 22 and 47 to 61, the run on A-(2, 3) extended the known sequence to residue 76 (Fig. 1). Automatic sequencing of A-4 and A-5 established the primary structure from residue 112 to 174. Since A-5 does not contain arginine (10), A-5 must be the COOH-terminal peptide.

Cyanogen Bromide Cleavage. Amino acid analysis of A-5 (10) indicated that this fragment contained most of the methionine (7 out of a total of 9) residues in 32P. Therefore, the intact 32P was subjected to cyanogen bromide cleavage to generate additional peptides from the COOH-terminal half of 32P. After cyanogen bromide digestion of 1.8 μmol of carboxamidomethylated 32P (10) and Sephadex G-100 chromatography, five fractions were obtained as shown in Fig. 2. The first two pools contained CnBr-3 and CnBr-5, respectively, while CnBr-2, CnBr-6,7, and CnBr-9 were purified by rechromatography of the respective Sephadex G-100 pools on Bio-Gel P-6 (see Table II and miniprint supplement). Automated sequencing of CnBr-5 confirmed much of the proposed sequence for the NH2 terminus of A-5 as well as extended the known sequence to residue 188. As detailed in the supplement, the sequence of the first 18 to 20 amino acids in both CnBr-6,7 and CnBr-9 was determined by automated sequencing.

Proteolytic Cleavage of A-2, 3. To permit tryptic digestion of A-2, 3, these peptides were prepared using citraconylated rather than succinylated 32P. After tryptic cleavage at arginine and Sephadex G-100 chromatography, the pool corresponding to A-2, 3 was decitraconylated and then digested with trypsin as described in the miniprint supplement. The resulting peptides were fractionated on Bio-Gel P-4 and then purified further by Aminex 50W-X4 chromatography. Of the tryptic peptides that were isolated, four (T-7, 8, 9, and 10) had amino acid compositions that did not match that predicted for any tryptic peptides from the first 69 amino acids in 32P. T-8 was isolated on the basis of its insolubility in 0.05 M

TABLE I
Amino acid composition of 32P and its partial tryptic cleavage products
Data are from Ref. 11 and are expressed in terms of number of residues per mol based on the calculated molecular weights of 33,487 for 32P and 26,024 for 32P*-(A + B). Numbers in parentheses are from the sequence in Fig. 1.

Amino acid                    32P          32P*-(A + B)   “A”          “B”
Cysteine(a)                   4.3 (4)      4.3 (4)
Aspartic acid/asparagine      49.6 (51)    33.6 (32)      13.7 (17)    (2)
Threonine                     13.0 (14)    9.4 (10)       2.6 (3)      (1)
Serine                        22.5 (25)    15.4 (16)      5.5 (8)      (1)
Glutamic acid/glutamine       29.8 (28)    26.9 (25)      0.7 (1)      (2)
Proline                       9.7 (8)      8.4 (8)
Glycine                       19.5 (18)    17.1 (16)      0.5 (1)      (1)
Alanine                       25.0 (26)    17.3 (18)      3.3 (4)      (4)
Valine                        19.3 (19)    16.8 (17)      2.4 (2)
Methionine                    9.1 (9)      5.8 (6)        2.3 (1)      (2)
Isoleucine                    10.0 (10)    9.9 (10)       0.5
Leucine                       19.9 (19)    12.8 (12)      5.5 (5)      (2)
Tyrosine                      7.9 (8)      7.8 (8)
Phenylalanine                 17.2 (18)    14.1 (14)      2.3 (3)      (1)
Histidine                     2.3 (2)      2.5 (2)
Lysine                        32.5 (33)    27.5 (26)      1.1 (3)      (4)
Arginine                      4.2 (4)      3.0 (3)                     (1)
Tryptophan                    5.1 (5)      5.2 (5)
Total                         301          232            48           21

(a) The number of sulfhydryl groups in 32P has previously been estimated as 3.7/mol of protein (10), thus 32P contains 4 cysteine residues and no disulfide bonds.
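The sequence-derived counts in Table I (the values in parentheses) can be cross-checked, since intact 32P should equal 32P*-(A + B) plus the “A” and “B” regions for every residue type. The sketch below is illustrative only and rests on two assumptions: blank cells are taken as zero, and the methionine entries for the “A” and “B” regions are read as 1 and 2, the assignment that satisfies the printed column totals. The three-letter keys and the name `TABLE_I` are introduced here for convenience.

```python
# Sequence-derived residue counts transcribed from Table I (values in parentheses).
# Tuple order: (32P, 32P*-(A + B), "A" region, "B" region).
# Assumption: blank table cells are read as zero.
TABLE_I = {
    "Cys": (4, 4, 0, 0),
    "Asx": (51, 32, 17, 2),
    "Thr": (14, 10, 3, 1),
    "Ser": (25, 16, 8, 1),
    "Glx": (28, 25, 1, 2),
    "Pro": (8, 8, 0, 0),
    "Gly": (18, 16, 1, 1),
    "Ala": (26, 18, 4, 4),
    "Val": (19, 17, 2, 0),
    "Met": (9, 6, 1, 2),  # A/B split assumed so that column totals agree
    "Ile": (10, 10, 0, 0),
    "Leu": (19, 12, 5, 2),
    "Tyr": (8, 8, 0, 0),
    "Phe": (18, 14, 3, 1),
    "His": (2, 2, 0, 0),
    "Lys": (33, 26, 3, 4),
    "Arg": (4, 3, 0, 1),
    "Trp": (5, 5, 0, 0),
}

# Residue types for which the intact protein does not equal the sum of its parts.
row_mismatches = [
    residue
    for residue, (whole, core, a_region, b_region) in TABLE_I.items()
    if whole != core + a_region + b_region
]

# Column totals, in the same order as the tuples above.
column_totals = tuple(sum(column) for column in zip(*TABLE_I.values()))

print("rows that fail the check:", row_mismatches)
print("column totals (32P, 32P*-(A + B), A, B):", column_totals)
```

Under these assumptions every row balances and the column totals reproduce the 301, 232, 48, and 21 residues given at the foot of the table.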


pyridine acetate, pH 3.2 (see supplement). Automated sequencing of 275 nmol of T-8 established the complete sequence for this peptide, while T-7, 9, and 10 were sequenced manually. T-9 and T-10 were placed in the primary structure by isolating and sequencing the overlapping chymotryptic and staphylococcal protease peptides indicated in Fig. 1.

Proteolytic Cleavage of A-5. To complete the sequence of A-5 it was necessary to isolate at least some of the tryptic, chymotryptic, and staphylococcal protease peptides from A-5. The details of the isolation of the peptides from these digests together with the analytical and sequence data are described in the supplement and the peptides required to establish the sequence are shown in Fig. 1. Of the peptides that were isolated from A-5, SP-3, T-20, 21, and SP-12 were sequenced by automated sequencing in the presence of polybrene, while the remaining peptides were sequenced manually. That SP-12 actually contains the COOH terminus of 32P was confirmed by the finding that only leucine is released by extensive carboxypeptidase A digestion of intact A-5 as well as T-22 and SP-12. In addition, SP-12 was shown to overlap T-22 (Fig. 1) which was the only tryptic peptide isolated from A-5 that did not contain lysine.

Chymotryptic Digestion of 32P*-(A + B). To eliminate the possibility that there were any additional peptides in between A-3 and A-4 or A-4 and A-5 and to determine the COOH terminus of 32P*-(A + B), this proteolytic fragment of 32P was digested with chymotrypsin. Since 32P*-(A + B) was prepared by partial tryptic cleavage of 32P (12), the COOH-terminal chymotryptic peptide from 32P*-(A + B) should be unique in that it will end in lysine. Of the 15 chymotryptic peptides isolated from 32P*-(A + B) (see the supplement for details), only C-21, residues 251 to 253 (Fig. 1), ended in lysine. Our previous carboxypeptidase data (11) are also consistent with the assignment of lysine residue 253 as the COOH terminus of 32P*-(A + B). The isolation of C-7 and C-8 (Fig. 1) confirmed the fact that there are no additional peptides in between A-3 and A-4 or between A-4 and A-5.

DISCUSSION

The molecular weight of 32P based on the sequence in Fig. 1 is 33,487, which is close to the value of 35,000 as determined by gel filtration or by sodium dodecyl sulfate gel electrophoresis (13). The amino acid composition predicted from the sequence in Fig. 1 is in agreement with published data on the amino acid composition of 32P (11, 14-16) and 32P*-(A + B) (11, Table I). The good agreement between the predicted and actual amino acid compositions of the cyanogen bromide peptides (Table II) and the tryptic peptides from succinylated 32P (10) further substantiates the sequence obtained.

The secondary structure of 32P, as predicted from the amino acid sequence by the method of Chou and Fasman (17), is shown in Fig. 3. Based on this analysis, 32P contains 36% α helix, 18% β sheet, and 46% random coil. Since circular dichroism spectra have previously indicated that 32P contains 22% α helix, 26% β sheet, and 52% random coil (2), the Chou and Fasman analysis may be overestimating the amount of α

FIG. 2. Sephadex G-100 chromatography of cyanogen bromide peptides from 32P. Cyanogen bromide (0.15 g) was added to 1.8 μmol of carboxamidomethylated 32P dissolved in 10 ml of 70% formic acid. After 24 h at room temperature, the reaction was stopped by the addition of 90 ml of water and lyophilization. The resulting peptides were dissolved in 5 ml of 20 mM Tris/HCl, pH 8.4, and 6 M guanidine hydrochloride and applied to a column (2.0 x 145 cm) of Sephadex G-100 equilibrated at a flow rate of 4 ml/h with the same buffer. Fractions (2 ml) were collected and pooled as shown above on the basis of their absorbance at 230 nm.

TABLE II
Amino acid composition of cyanogen bromide peptides
Amino acid

CnBr-2

CnBr-3

1.8 (3) Cysteine 0.5 20.8 (22) Aspartic acid/aspmagine 1.0 (1) 3.7 (4) Threonine 1.0 (1) 8.3 (8) Serine 2.4 (2) 11.0 (IO) Glutamic acid/gluta-' mine 5.5 (5) Proline 10.0 (IO) 0.3 Glycine 2.5 (3) 9.0 (10) Alanine 7.1 (7) Valine 0.7 (1) 0.7 (1) Methionine" 6.8 (7) Isoleucine 8.5 (8) 1.0 (1) Leucine 7.8 (7) Tyrosine 7.5 (6) 1.1 (1) Phenylalanine 2.7 (2) Histidine 2.1 (2) 18.0 (19) Lysine 3.1 (3) 0.8 (1) Arginine N.D.b (4) Tryptophan 2- 14 15-150 Residue numbers 29Yield 48 48 Determined as homoserine plus homoserine lactone. Not determined.

CnBr-5

CnBr-6,7

CnBr-9

0.5 (1) 9.3 (8)

3.3 (3)

8.9 (IO)

2.4 (2) 3.9 (5) 9.3 (10)

2.0 (2) 2.3 (3) 4.0 (4)

3.5 (3) 0.7 2.0 (1)

1.3 (1) 0.8 2.0 (2) 1.0 (2)

2.0 (2) 6.8 (8) 2.2 (2) 1.0 (1)

3.3 (3) 0.7 (1)

1.7 (2)

0.9 (1)

4.0 (5)

2.8 (3)

3.1 (3)

4.1 (4) 0.3 N.D.* (I) 158-213

3.6 (4)

3.8 (4)

3.1 (3) 3.2 (3) 4.2 (2) 4.7 (5)

214-239

245-279 17


FIG. 3. Schematic diagram of the predicted secondary structure in the gene 32 protein. The secondary structure predictions are from a computer program (18) based on the method of Chou and Fasman (17). Residues are represented in helical (d), β sheet (A), and coil (-) conformations. β turns are denoted by chain reversals. Position of charged residues is indicated, and conformational boundary residues are numbered. A and B indicate the trypsin-sensitive bonds in the native protein which when cleaved give rise to 32P*-A and 32P*-(A + B) (11).

helix and underestimating the amount of β sheet. As depicted in Fig. 3, 32P appears to contain three domains with respect to secondary structure. The NH2-terminal region (residues 1 to 35) and the COOH-terminal region (residues 187 to 301) are mostly α helical, while the central region of 32P (residues 36 to 186) contains most of the β sheet and 11 of the 15 predicted β turns. The susceptibility of the “A” (residues 254 to 301) and “B” (residues 1 to 21) regions of 32P to proteolysis by trypsin suggests that both the lysine-glycine (residues 21 to 22) and lysine-lysine (residues 253 to 254) sequences must be on the surface of the native 32P. In line with this, the first predicted β bend in 32P occurs between residues 18 and 21. In contrast, the point of cleavage of the “A” region (residues 253 to 254) is in the middle of the longest predicted stretch of α helix.

In vivo studies on temperature-sensitive and amber mutants in gene 32 suggest that this protein is multifunctional; that is, 32P has different domains for interacting with DNA, DNA ligase, DNA polymerase, and recombination nucleases (3-5, 19, 20). Since amber peptides of 32P containing less than ~150 amino acids retain most or all of the interaction sites that are essential for initiation of DNA replication (3), it appears that the NH2-terminal domain is most important for DNA binding. This conclusion is supported by in vitro studies of structure/function relationships in 32P and by some unusual features of the 32P primary structure. Differential scanning microcalorimetry studies of 32P and its proteolytic fragments in the presence of poly(dT) have shown that the NH2-terminal “B” region (residues 1 to 21) is essential for cooperative binding of 32P to ssDNA (12). Direct measurements of the affinity of 32P for polynucleotides versus oligonucleotides indicate that these cooperative 32P:32P interactions contribute ~10³ to an overall binding constant of ~7 × 10⁸ for ssDNA (21, 22). Within the NH2-terminal half of 32P, the region between residues 72 and 116 is particularly unusual in that it contains 6 of the 8 tyrosine residues in 32P. Five of the six tyrosine residues in this region are separated by 6 to 8 amino acids and are, therefore, almost equally spaced in the primary structure.
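The binding figures quoted above can be put on a free-energy scale. The sketch below is a rough illustration, not a calculation from the paper. It assumes that the overall constant is the product of an intrinsic (non-cooperative) constant and the cooperativity factor, and that the temperature is 25 °C, which the text does not state. The function name `free_energy_kcal` is hypothetical.

```python
import math

K_OVERALL = 7e8      # overall binding constant for ssDNA, as quoted in the text
COOPERATIVITY = 1e3  # contribution of 32P:32P interactions, as quoted in the text
R_KCAL = 1.987e-3    # gas constant in kcal/(mol*K)
TEMP_K = 298.15      # assumed temperature (25 degrees C); not given in the text


def free_energy_kcal(binding_constant, temp_k=TEMP_K):
    """Standard free energy change (kcal/mol) for a given binding constant."""
    return -R_KCAL * temp_k * math.log(binding_constant)


# Assumption: overall constant = intrinsic constant x cooperativity factor.
k_intrinsic = K_OVERALL / COOPERATIVITY

dg_overall = free_energy_kcal(K_OVERALL)
dg_cooperative = free_energy_kcal(COOPERATIVITY)
dg_intrinsic = free_energy_kcal(k_intrinsic)

print(f"intrinsic binding constant: {k_intrinsic:.1e}")
print(f"overall free energy:     {dg_overall:.1f} kcal/mol")
print(f"cooperative share:       {dg_cooperative:.1f} kcal/mol")
print(f"intrinsic share:         {dg_intrinsic:.1f} kcal/mol")
```

On these assumptions the cooperative interactions account for roughly a third of the total binding free energy, which illustrates why loss of the “B” region has such a marked effect on binding to ssDNA.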

Since Anderson and Coleman (16) have previously suggested that 5 of the tyrosine residues in 32P participate in DNA binding by intercalating between the bases of ssDNA, it is tempting to speculate that this region of 32P (between residues 72 and 116) is important for DNA binding. Because tyrosine intercalation is known to be involved in the binding of the fd ssDNA-binding protein (5P) to DNA (23), it is of interest to compare the structure between residues 72 and 116 of 32P with those regions of 5P involved in ssDNA binding. Although there is no extensive similarity between the amino acid sequences of the ssDNA-binding proteins from fd and T4, we have noticed the following limited homology:

    32P  (104) Lys-Glu-Tyr-Ser-   -Leu-Val-Lys (110)
    5P    (24) Lys-Pro-Tyr-Ser-   -Leu          (28)
    5P    (40)     Glu-Tyr-Pro-Val-Leu-Val-Lys  (46)
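The extent of this limited homology can be counted position by position. The sketch below is illustrative only: the column placement of each segment, including the single gap after serine in 32P, is an assumed reading of the printed alignment, and the helper `count_identities` is hypothetical.

```python
GAP = "---"  # gap introduced to align the segments (assumed placement)

# Each list is one row of the alignment; None means the segment does not
# extend to that column.
SEG_32P = ["Lys", "Glu", "Tyr", "Ser", GAP, "Leu", "Val", "Lys"]   # 32P 104-110
SEG_5P_24 = ["Lys", "Pro", "Tyr", "Ser", GAP, "Leu", None, None]   # 5P 24-28
SEG_5P_40 = [None, "Glu", "Tyr", "Pro", "Val", "Leu", "Val", "Lys"]  # 5P 40-46


def count_identities(reference, other):
    """Count columns where both rows carry the same residue (gaps excluded)."""
    return sum(
        1
        for ref_res, other_res in zip(reference, other)
        if ref_res not in (None, GAP) and ref_res == other_res
    )


def residue_count(segment):
    """Number of actual residues in a row, ignoring gaps and empty columns."""
    return sum(1 for res in segment if res not in (None, GAP))


for label, segment in (("5P 24-28", SEG_5P_24), ("5P 40-46", SEG_5P_40)):
    matches = count_identities(SEG_32P, segment)
    print(f"32P 104-110 vs {label}: {matches} of {residue_count(segment)} residues identical")
```

Under this reading, 4 of the 5 residues in 5P 24 to 28 and 5 of the 7 residues in 5P 40 to 46 are identical to the aligned residues of 32P, and tyrosine 106 of 32P lines up with tyrosines 26 and 41 of 5P.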

In the second instance cited above, the homology involves tyrosine 41 of 5P, which has been shown by a number of criteria to be involved in 5P:ssDNA interactions (2). There is also some similarity between the predicted secondary structure of the 32P region containing residues 72 to 116 and the 5P region involved in DNA binding. Most of the amino acids involved in DNA:protein interactions in 5P are in a three-stranded antiparallel β sheet arising from residues 12 to 49 (24). As shown in Fig. 3, the region of 32P containing residues 72 to 116 is also predicted to contain three short regions of β sheet as well as several β turns. The unequal charge distribution in 32P (the NH2-terminal half having a charge of +8, while the COOH-terminal half has a charge of -17) also supports the notion that the NH2-terminal domain of 32P is most important for DNA binding.

Cleavage at lysine 253 with trypsin results in the removal of the “A” region (Fig. 1) which has been implicated in controlling the helix-destabilizing “activity” of 32P (25). 32P lacking the A region (32P*-A) is retained by dsDNA cellulose (25) and can denature T4 dsDNA (26), properties not exhibited by the intact protein. Since the affinity of 32P for ssDNA is not significantly enhanced by removal of the “A” region (21), there is no apparent thermodynamic explanation for the increased helix-destabilizing activity of 32P*-A as compared to 32P. Moise and Hosoda (25) suggested that interaction of the “A” region of 32P with other T4 DNA replication proteins might have the same effect as the actual removal of the “A” region and, therefore, limit the helix-destabilizing activity of 32P to only the small section of dsDNA just in front of the replication fork. Partial proteolysis (11) and preliminary NMR studies (2) are both consistent with the “A” region being exposed and, hence, available for interactions with other proteins involved in DNA replication. In particular, Hosoda et al. (27) have observed that loss of the “A” region is accompanied by a loss of affinity of 32P for T4 DNA polymerase (43P) and for the T4 RNA priming protein (61P). It is not yet apparent how the negative charge (-10) or the unusual cluster of serine residues in the “A” region (280 to 283 and 285 to 288) is related to the presumed functions of this domain of 32P in vivo.

The gene 32 protein is essential to many aspects of T4 DNA metabolism, from replication to recombination and repair. Knowledge of the primary structure of this DNA-binding protein will complement information gained from studies of its functional domains and ultimately help to define the role of this helix-destabilizing protein in these diverse reactions. In addition, the amino acid sequence for 32P shown in Fig. 1 should allow more precise interpretation of studies to determine the molecular basis for the tight binding of 32P to ssDNA. These protein:nucleic acid interactions may prove to be ones commonly used by other proteins binding to DNA (2).

Acknowledgments. We are especially grateful to Dr. William Konigsberg for his advice and encouragement throughout this study. We also wish to thank Gary Davis for running the sequenator and the gas chromatograph and Dr. Louis Henderson for identifying some of the phenylthiohydantoin derivatives by high pressure liquid chromatography. We would like to acknowledge Dr. Yasutsugu Nakashima for assisting with the initial phase of this work and Kevin Behar for helping with the secondary structure predictions.

REFERENCES

1. Alberts, B., and Sternglanz, R. (1977) Nature 269, 655-661
2. Coleman, J., and Oakley, J. (1979) Crit. Rev. Biochem. 7 (3), 247-289
3. Breschkin, A. M., and Mosig, G. (1977) J. Mol. Biol. 112, 279-294
4. Breschkin, A. M., and Mosig, G. (1977) J. Mol. Biol. 112, 295-308
5. Mosig, G., Berquist, W., and Bock, S. (1977) Genetics 86, 5-23
6. Mosig, G., Luder, A., Garcia, G., Dannenberg, R., and Bock, S. (1979) Cold Spring Harbor Symp. Quant. Biol. 43, 501-515
7. Epstein, R. H., Bolle, A., Steinberg, C. M., Kellenberger, E., Boy de la Tour, E., Chevalley, R., Edgar, R. S., Susman, M., Denhardt, G. B., and Lielausis, A. (1963) Cold Spring Harbor Symp. Quant. Biol. 28, 375-392
8. Wu, J-R., and Yeh, Y-C. (1973) J. Virol. 12, 758-765
9. Tomizawa, J., Anraku, N., and Iwama, Y. (1966) J. Mol. Biol. 21, 247-253
10. Williams, K. R., LoPresti, M., Setoguchi, M., and Konigsberg, W. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 4614-4617
11. Williams, K. R., and Konigsberg, W. (1978) J. Biol. Chem. 253, 2463-2470
12. Williams, K. R., Sillerud, L. O., Schafer, D. E., and Konigsberg, W. H. (1979) J. Biol. Chem. 254, 6426-6432
13. Alberts, B. M., and Frey, L. (1970) Nature 227, 1313-1318
14. Carroll, R., Neet, K., and Goldthwait, D. A. (1975) J. Mol. Biol. 91, 275-291
15. Hosoda, J., and Moise, H. (1978) J. Biol. Chem. 253, 7547-7555
16. Anderson, R. A., and Coleman, J. E. (1975) Biochemistry 14, 5485-5491
17. Chou, P. Y., and Fasman, G. D. (1978) Adv. Enzymol. 47, 45-148
18. Cohen, F. (1979) Doctoral thesis, Yale University
19. Mosig, G., and Breschkin, A. M. (1975) Proc. Natl. Acad. Sci. U. S. A. 72, 1226-1230
20. Mosig, G., and Bock, S. (1976) J. Virol. 17, 756-761
21. Spicer, E. K., Williams, K. R., and Konigsberg, W. H. (1979) J. Biol. Chem. 254, 6433-6436
22. Jensen, D. E., Kelly, R. C., and von Hippel, P. H. (1976) J. Biol. Chem. 251, 7215-7228
23. Anderson, R. A., Nakashima, Y., and Coleman, J. E. (1975) Biochemistry 14, 907-917
24. McPherson, A., Jurnak, F. A., Wang, A. H. J., Molineux, I., and Rich, A. (1979) J. Mol. Biol. 134, 379-400
25. Moise, H., and Hosoda, J. (1976) Nature 259, 455-458
26. Greve, J., Maestre, M. F., Moise, H., and Hosoda, J. (1978) Biochemistry 17, 893-898
27. Hosoda, J., Burke, R. L., Moise, H., Tsugita, A., and Alberts, B. (1980) J. Supramol. Struct. 4, (suppl.) 340
28. Tomita, M., Furthmayr, H., and Marchesi, V. T. (1978) Biochemistry 17, 4756-4770
29. Smithies, O., Gibson, D., Fanning, E. M., Goodfliesh, R. M., Gilman, J. G., and Ballantyne, D. L. (1971) Biochemistry 10, 4912-4921
30. Tsugita, A., and Hosoda, J. (1978) J. Mol. Biol. 122, 255-258
31. Hartley, B. S. (1970) Biochem. J. 119, 805-822
32. Schroeder, W. A. (1972) Methods Enzymol. 25, 214-221
