THE JOURNAL OF BIOLOGICAL CHEMISTRY

Vol. 266, No. 8, Issue of March 15, pp. 4869-4877, 1991 Printed in U.S.A.

© 1991 by The American Society for Biochemistry and Molecular Biology, Inc.

The Structural Homology of Amicyanin from Thiobacillus versutus to Plant Plastocyanins* (Received for publication, September 17, 1990)

Jozef Van Beeumen‡§, Stefaan Van Bun‡, Gerard W. Canters¶, Arjen Lommen¶, and Cyrus Chothia‖ From the ‡Laboratory of Microbiology and Microbial Genetics, State University of Ghent, Ghent 9000, Belgium, the ¶Chemistry Department, Gorlaeus Laboratories, Leiden University, Leiden 2300-RA, The Netherlands, and the ‖Medical Research Council, Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, United Kingdom

The complete amino acid sequence of the blue copper protein amicyanin of Thiobacillus versutus, induced when the bacterium is grown on methylamine, has been determined as follows: QDKITVTSEKPVAAADVPADAVVVGIEKMKYLTPEVTIKAGETVYWVNGEVMPHNVAFKKGIVGEDAFRGEMMTKDQAYAITFNEAGSYDYFCTPHPFMRGKVIVE. The four copper ligand residues in this 106-residue-containing polypeptide chain are His54, Cys93, His96, and Met99. The Thiobacillus amicyanin is 52% similar to the amicyanin of Pseudomonas AM1, the only other copper protein known with the same spacing between the second histidine ligand and the methionine ligand. T. versutus amicyanin contains no cysteine bridge and is more closely related to the plant copper protein plastocyanin than to the bacterial copper protein azurin. Alignment of the two known amicyanin sequences with the consensus sequence of the plastocyanins and comparison with the known three-dimensional structure of poplar leaves plastocyanin reveals that the bacterial proteins have the same overall structure with two β-sheets packed face to face. The major structural differences between the amicyanins and the plastocyanins appear to be located in two of the five loops that connect the six identified β-strands of the amicyanins. The first of these two loops, connecting strands F and G, contains a ligand histidine and must have a different conformation from the same loop in the plastocyanins because it is shorter by two amino acids. Further differences occur in the loop connecting the strands D and E. This loop contains only 17 residues in amicyanin whereas the corresponding loop of plastocyanin contains 25 residues. Despite these differences the amicyanins appear much more closely related to the plastocyanins than to the azurins. The present findings demonstrate that the occurrence of blue copper proteins with clearly plastocyanin-like features is not restricted to photosynthetic redox chains.

Thiobacillus versutus is a chemolithotrophic bacterium capable of using sulfur or reduced sulfur compounds as a source of reducing equivalents for its energy metabolism (1). The organism can also use a wide variety of organic substances and has therefore been classified as a mixotrophic Thiobacillus species (2). An interesting aspect of its metabolic activities is that T. versutus can also grow on methylamine, thereby producing formaldehyde, with oxygen as a terminal electron acceptor (3). This property has until now only been found in the facultative methylotroph Pseudomonas AM1 (4) and in the obligate aerobic bacterium Paracoccus denitrificans (5, 6). In all these cases a copper-containing protein, named amicyanin, appears to be an electron acceptor of the methylamine dehydrogenase which catalyzes the formaldehyde production (3-6). The primary structure of the Pseudomonas amicyanin was determined a few years ago (7). On the basis of this structure the amicyanin was proposed to be the first example of a new class of small blue copper proteins distinct from the azurins found in bacteria and the plastocyanins found in nearly all chloroplast-containing eukaryotes, including some blue-green bacteria. In the present paper we report the complete amino acid sequence of the amicyanin from T. versutus, and we point to evidence that the three-dimensional structure of the amicyanins is more related to the plant plastocyanins than to the azurins. A crystallographic analysis of the three-dimensional structure of T. versutus amicyanin is under way (8) as well as that of the methylamine dehydrogenase (9).

EXPERIMENTAL PROCEDURES¹

RESULTS AND DISCUSSION

The complete amino acid sequence of T. versutus amicyanin is shown in Fig. 1. All the quantitative data including amino acid composition and yields of phenylthiohydantoin amino acids are given in the Miniprint Section. Sequence analysis of the native protein appeared difficult, suggesting that the N-terminal residue was not present with a free amino group. The first sequence data were obtained from peptides resulting from cleavage of the copper-free carboxymethylated protein with endoproteinase Glu-C from Staphylococcus aureus. The separation of these peptides is given in Fig. 2. A major peak, S9, supposedly contains the N-terminal blocked peptide as it did not react with phenylisothiocyanate. Extension of the sequence information from the S. aureus protease peptides (but only two overlaps between them) was obtained in the first instance from a digest of the carboxymethylated protein with endoprotease Arg-C from submaxillaris glands. The peptides resulting from this digest are shown in Fig. 3.

* This work was supported by the Belgian National Incentive Program on Fundamental Research in the Life Sciences, initiated by the Belgian State-Prime Minister's Office-Science Policy Programming Department (Contract BI022). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

§ To whom correspondence should be addressed: Lab. voor Microbiologie, Rijksuniversiteit Ghent, K.L. Ledeganckstraat 35, Ghent 9000, Belgium. Tel. 32-91-645109; Fax: 32-91-645346.

¹ Portions of this paper (including "Experimental Procedures," part of "Results," Tables S.I-S.VI, and Fig. S.1) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are included in the microfilm edition of the Journal that is available from Waverly Press.


Homology of Thiobacillus Amicyanin to Plastocyanin


  1-27   pGlu Asp Lys Ile Thr Val Thr Ser Glu Lys Pro Val Ala Ala Ala Asp Val Pro Ala Asp Ala Val Val Val Gly Ile Glu
 28-54   Lys Met Lys Tyr Leu Thr Pro Glu Val Thr Ile Lys Ala Gly Glu Thr Val Tyr Trp Val Asn Gly Glu Val Met Pro His
 55-81   Asn Val Ala Phe Lys Lys Gly Ile Val Gly Glu Asp Ala Phe Arg Gly Glu Met Met Thr Lys Asp Gln Ala Tyr Ala Ile
 82-106  Thr Phe Asn Glu Ala Gly Ser Tyr Asp Tyr Phe Cys Thr Pro His Pro Phe Met Arg Gly Lys Val Ile Val Glu

[Fig. 1 also marks, beneath the sequence, the extent of the sequenced peptides H2, R14, S9C1, S6, H11, R6, R7, S8, H7, R15A, and R2; the bars are not recoverable from the extracted text.]

FIG. 1. Amino acid sequence of amicyanin from T. versutus. The notations S, H, and R refer to peptides obtained after cleavage of the carboxymethylated protein with S. aureus V8 protease, dilute formic acid, and proteinase Arg-C, respectively. S9C1 is a chymotryptic subdigest peptide of S9. -, sequence evidence obtained from runs on a 470A gas-phase sequenator with off-line phenylthiohydantoin analysis; +, evidence obtained from automated sequence analysis on a 477A pulsed-liquid sequenator with on-line phenylthiohydantoin analysis. 1, indicates a deliberate abortion of the sequence runs.
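The sequence in Fig. 1 and the statements made about it can be cross-checked with a few lines of code. The following is only an illustrative sketch and not part of the original analysis: it writes the N-terminal pyroglutamate as Q, takes the ligand positions from the abstract and the expected residue counts from the sequence-derived column of Table I, and the names `AMICYANIN_TV`, `residue_at`, and `composition` are hypothetical helpers introduced here.

```python
from collections import Counter

# One-letter sequence of T. versutus amicyanin as printed in the abstract.
# Residue 1 is written Q; the paper identifies it as pyroglutamate (pGlu).
AMICYANIN_TV = (
    "QDKITVTSEKPVAAADVPADAVVVGIEKMKYLTPEVTIKAGETVYWVNGEVMPHNVAFKK"
    "GIVGEDAFRGEMMTKDQAYAITFNEAGSYDYFCTPHPFMRGKVIVE"
)

# Copper ligands named in the text (1-based position -> one-letter code).
LIGANDS = {54: "H", 93: "C", 96: "H", 99: "M"}

# Residue counts from the sequence-derived column of Table I.
TABLE_I_FROM_SEQUENCE = {
    "C": 1, "D": 6, "N": 3, "T": 8, "S": 2, "E": 9, "Q": 2, "P": 6,
    "G": 8, "A": 11, "V": 14, "M": 5, "I": 6, "L": 1, "Y": 5, "F": 5,
    "H": 2, "K": 9, "W": 1, "R": 2,
}


def residue_at(seq: str, position: int) -> str:
    """Return the residue at a 1-based sequence position."""
    return seq[position - 1]


def composition(seq: str) -> dict:
    """Count how often each one-letter code occurs in the sequence."""
    return dict(Counter(seq))


if __name__ == "__main__":
    print("length:", len(AMICYANIN_TV))
    for position, expected in sorted(LIGANDS.items()):
        found = residue_at(AMICYANIN_TV, position)
        print(f"position {position}: expected {expected}, found {found}")
    print("composition matches Table I:",
          composition(AMICYANIN_TV) == TABLE_I_FROM_SEQUENCE)
```

Counting Q for residue 1 is a simplification: the sequence-derived composition lists two Gln, which is consistent with the pyroglutamate being counted there as Gln.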

The overlapping sequences were from peptides R15A (between S1 and S5) and R4 (between S6 and S8). Further sequence evidence was obtained from peptides resulting from treatment of the carboxymethylated protein with 2.5% formic acid. The separation of the peptides is given in Fig. 4. The major peptide H11 extended the sequence information at the N-terminal side of peptide S6 by 15 residues. Two other peptides, H7 and H2, were important in that they provided sequence evidence for the C-terminal and the N-terminal region of the protein, respectively. Peptide H7 also contained the only cysteine residue of the protein and confirmed the position of carboxymethylcysteine found at cycle 7 of peptide S5. The N-terminal sequence of the amicyanin was obtained after treatment of the protein with pyroglutamate aminopeptidase from calf liver. The sequence run was stopped after a 5-residue overlap with peptide H11 was found. The specificity of the enzyme used suggests that the N-terminal residue of the amicyanin is pyroglutamic acid. S9C1 is a subdigest peptide of S9 (Fig. 5). Confirmation of the C-terminal sequence by treatment with carboxypeptidase Y and P was successful in the sense that the amino acids that can be expected from the proposed sequence, up to PheS9, were released by the peptidase. The amino acid composition calculated from the total sequence as given in Fig. 1 is in agreement with the one experimentally determined after total acid hydrolysis (Table I).

The primary sequences of the amicyanins from T. versutus and Pseudomonas AM1 are compared with each other in Fig. 6, which also contains the sequences of a number of plant plastocyanins and a plastocyanin consensus sequence, in which the secondary structure elements have been indicated. T. versutus amicyanin is 106 residues long whereas the protein from Pseudomonas AM1 is shorter by 7 residues. Comparison of the sequences leaves no doubt that the proteins are homologous and that the 7 extra residues of the Thiobacillus protein precede the N-terminal region of the Pseudomonas amicyanin. Fifty residues in the two sequences are identical out of 99 residues in the Pseudomonas AM1 and 106 in the T. versutus sequence. When also conservative substitutions of the type (Phe, Trp, Tyr), (Asp, Glu), (Ile, Leu, Met, Val), and (Ser,


TABLE I
Amino acid composition of the amicyanin
Values in parentheses indicate the hydrolysis time of each sample. The amounts given at the bottom refer to the number of nanomoles used for the hydrolyses. The analysis of the carboxymethylated protein (CM) was carried out by precolumn derivatization (see "Experimental Procedures").

[The column headings (hydrolysis times and CM) are not recoverable from the extracted text; the four hydrolysates are numbered here in the order in which their values were extracted.]

Amino acid      Sequence   Sample 1   Sample 2   Sample 3   Sample 4
Cys                 1                               0.6a
Asp                 6         8.4        9.2        8.9       10.2
Asn                 3
Thr                 8         7.3        6.9        8.4        6.7
Ser                 2         2.3        2.4        2.0        1.6
Glu                 9        11.3       11.4       10.4       11.7
Gln                 2
Pro                 6         5.8        5.7        5.7        5.6
Gly                 8         8.7        8.6        7.7        7.5
Ala                11        10.8       11.4        9.8       10.0
Val                14        12.9       13.8       10.4       12.7
Met                 5         3.1        2.9        3.1b       4.6
Ile                 6         5.7        6.2        5.2        5.9
Leu                 1         1.3        1.2        1.1        0.9
Tyr                 5         4.8        4.6        3.1        4.9
Phe                 5         5.0        4.3        5.0        5.0
His                 2         1.9        1.9        1.9        2.0
Lys                 9         8.6        9.0        9.0       11.0
Trp                 1
Arg                 2         2.3        2.3        2.2        2.0
Total             106
Amount (nmol)                 1.26       1.25       1.29       0.16

a As cysteic acid.
b As methionine sulfone.

[Figs. 2-5 are HPLC chromatograms; the axes are time (min), absorbance (AU), and % ACN (0-100).]

FIG. 2. HPLC separation pattern of the S. aureus protease digest of carboxymethylated amicyanin. The ordinate on the right indicates the gradient with acetonitrile. Data on the numbered peptides are to be found in the Miniprint Section. AU, absorbance unit.

FIG. 3. HPLC separation profile of the amicyanin peptides generated after cleavage with Arg-C endopeptidase. For detailed information see the Miniprint Section. AU, absorbance unit.

FIG. 4. HPLC separation profile of peptides generated by partial acid hydrolysis of the amicyanin. See the Miniprint Section for details. AU, absorbance unit.

FIG. 5. Gradient elution profile of the two peptides generated by cleavage of the N-terminally blocked peptide S9 with chymotrypsin. The peptides A and B are referred to as S9C1 and S9C2, respectively, in the text and in Table S.III. AU, absorbance unit.

Thr) are included the homology increases to even 57 similar residues. This high similarity is somewhat surprising given the rather diverse biochemical properties displayed by both strains, apart from the common ability to oxidize methylamine (2, 14).

The reason to compare the amicyanin sequences of T. versutus and Pseudomonas AM1 to those of the plastocyanins and not to the azurins, despite the fact that the latter are of prokaryotic and the former of eukaryotic origin, is 2-fold. First, there is no S-S bridge in the amicyanin and plastocyanin structures as opposed to the Cys3-Cys26 S-S bridge in the azurins (15). Second, plastocyanin (16) and T. versutus amicyanin (17, 18) contain a titratable histidine copper ligand. This histidine residue at position 96 in the T. versutus amicyanin sequence is homologous to His87 of the plastocyanins. The homologous histidine in the azurin sequences (15) is not titratable. In addition to these two observations, the similarity in chain lengths of the amicyanins and the plastocyanins (99-106 and 99-104 residues, respectively) constitutes a third argument to compare the amicyanin with the plastocyanin data. Azurins are at least some 20 residues longer than the plastocyanins. Yet, it may be worthwhile to point out that although the amicyanin structure will appear to map well onto the plastocyanin structure (vide infra), the part of the amicyanin structure consisting of the β-sheets also maps nicely onto the azurin structure. This need not surprise in view of the similar β-sheet structures of plastocyanin and azurin (19-21).

The primary sequences of 25 plastocyanins have been taken from the literature and have been aligned in Fig. 6. A consensus sequence was constructed by placing at each position the residue with the highest occurrence. Residues that are conserved in all sequences have been marked by an asterisk, and those that occur in 20-24 sequences have been marked by a "+."

[Fig. 6 alignment: the 25 plastocyanin rows (CULC in full, the others as differences from CULC), the strand bars A, B, B', C, D, E, F, and G, and the conservation marks are not recoverable from the extracted text. The three bottom rows follow; the Cons. and AM1 rows are given as extracted, with ~ marking lost characters, and the Tv row is restored from the sequence in Fig. 1.]

Cons.  (AEVLLGGDDGSL~FVPSNFSVAAGEKITFKNNAGFPHNWF~DEVPSGVDASKISMSEEDLLNAPJGETYSVTLTEKGTYSFYCSPHQGAGMVGKVTVN
AM1    AGALEAVQEAPAGSTEVKIAKMKFQTPEVRIKAGSAVTWTNTEALPHNVHFK-----SGPGVEKD---VEGPMLRSNQTYSVKFNAPGTYDYICTPHP--F~GK~E
Tv     QDKITVTSEKPVAAADVPADAVVVGIEKMKYLTPEVTIKAGETVYWVNGEVMPHNVAFK-----KGIVGEDA---FRGEMMTKDQAYAITFNEAGSYDYFCTPHP--FMRGKVIVE

FIG. 6. Alignment of plastocyanin sequences with those of T. versutus and Pseudomonas AM1 (7) amicyanin. Deletions have been indicated by minus signs in the sequences. The plastocyanin codes refer to: CULC, Lactuca sativa (garden lettuce); A24494, Silene pratensis, Lychnis alba (white campion, evening lychnis); CUVF, Vicia faba (broad bean); CUSP, Spinacia oleracea (spinach); CUED, Sambucus nigra (European elder); S00210, Populus nigra italica (plastocyanin a, Lombardy poplar); CUVM, Cucurbita pepo var. medullosa (vegetable marrow); CUKV, Cucumis sativus (cucumber); CUSU, Capsella bursa-pastoris (Shepherd's purse); CUFB, Phaseolus vulgaris (kidney bean); CUDM, Mercurialis perennis (dog's mercury); CUPO, Solanum tuberosum (potato); CUUA, Solanum crispum (Chilean potato tree); CURX, Rumex obtusifolius (bitter dock); JA0056, Arabidopsis thaliana (mouse ear cress); CUPX, Populus nigra italica (plastocyanin b, Lombardy poplar); S00206, Hordeum vulgare (barley); CUKL, Chlorella fusca; A25055, Enteromorpha prolifera; ULAR, Ulva arasakii; CUAI, Anabaena variabilis. The first sequence (CULC) has been presented in full; of the other sequences only those residues are presented that are different from the top sequence. Sequence information was derived from the EMBL Data Library "SwissProt" Protein Sequence Data Base, Heidelberg. Sequences of RICE, PARS (parsley), CARROT, and SCOB (S. obliquus) were taken from: Yano et al. (22) (RICE); R. P. Ambler and A. G. Sykes, unpublished results, cited by Sykes (23) (SCOB); McGinnis et al. (24) (PARS, SCOB); Shoji et al. (25) (CARROT). A plastocyanin consensus sequence (Cons.), constructed on the basis of the 25 plastocyanin sequences listed here (see text), is compared with the sequences of Pseudomonas AM1 (AM1) and T. versutus (Tv) amicyanin at the bottom of the figure. The 8 strands which make up the β-barrel structure of plastocyanin (A, B, C, and E constituting β-sheet I, and B', D, F, and G constituting β-sheet II) are shown as bars above the plastocyanin consensus sequence. The asterisks above the consensus sequence denote fully conserved residues, the plus signs denote residues that occur in 20-24 out of the 25 sequences listed, and the minus sign (position 57) denotes a residue that is conserved in 16 out of 17 sequences (the remaining 8 sequences carry a deletion at this position). Italicized characters between parentheses in the plastocyanin consensus sequence denote stretches where the alignment with the amicyanin sequences is arbitrary.

The three-dimensional structure of plastocyanin from poplar leaves is known (19, 20), and the fundamental features of this structure may safely be assumed to be conserved among the plastocyanins in Fig. 6 given the high similarity between the quoted sequences. The secondary structure of plastocyanin consists of 7 β-strands, labeled A-G, which are grouped together in two β-sheets, labeled I and II, which pack face to face. The strands have been indicated in Fig. 6 by bars above the consensus sequence. The folding topology is illustrated in Fig. 7 (modified from Refs. 19 and 20). The second strand actually is split in two parts (B and B') which belong to different sheets. One of the copper ligands is located in loop CD (His37); the three others are located on loop FG and the rim of sheet II at the ends of strands F and G (Cys84, His87, and Met92).

The alignment of the amicyanin sequences from Thiobacillus versutus and Pseudomonas AM1 with the plastocyanin consensus is given at the bottom of Fig. 6. There is no problem finding the correct alignment in the region comprising the strands E-G of the plastocyanin structure, since 17 out of the

FIG. 7. Folding topology of plastocyanin (after Refs. 19 and 20). The two β-sheets are denoted by I and II, the β-strands by the letters A-G (see also Fig. 6). The small circles symbolize the ligands of the copper, denoted by Cu. The cylinder symbolizes the quasihelical content of the loop connecting strands D and E.

31 residues that constitute this region, including the copper ligands at positions 84, 87, and 92 in the plastocyanin sequence (positions 93, 96, and 99 in the T. versutus amicyanin sequence), are identical to residues in the corresponding region of the amicyanin from, e.g., T. versutus. Likewise, when taking into account the pattern of hydrophobic, hydrophilic, and neutral residues in the sequences, aligning the plastocyanin region running from strand B up to and including strand D with the amicyanin sequence presents few problems. Little homology between plastocyanin and amicyanin, however, is detectable for the loop connecting strands D and E and for the region preceding strand B. Apparently these regions are not structurally conserved, as evidenced by differences in loop lengths, and they have been excluded from the alignment. The corresponding regions in the plastocyanin sequence at the bottom of Fig. 6 have been italicized and put between parentheses. The overall homology between plastocyanin and T. versutus and Pseudomonas AM1 amicyanin amounts to 25 and 31%, respectively, expressed as a percentage of the total number of plastocyanin residues, and counting the differences between the DE loops as one mutational count. When conservative substitutions are also taken into account the homologies increase to 34 and 36%, respectively.

The analysis of the amicyanin secondary structure is done by checking if residues that are determinants of the plastocyanin secondary structure can be recognized in the amicyanin sequence. The topology of the two β-sheets of plastocyanin with details on the conserved or semi-conserved nature of the residues is presented in Fig. 8 (top). Residues that are critical for maintaining the integrity of the two sheets are the ones that point to the interior of the β-barrel. A way to distinguish them is to look at the accessible surface area (ASA).² Residues with ASAs ≤25 Å² (21) have been indicated by heavy circles in Fig. 8 (top).
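The correspondence between plastocyanin and amicyanin residue numbers quoted for the E-G region (84, 87, 92 against 93, 96, 99) follows mechanically from a gapped alignment. The sketch below illustrates this under stated assumptions: the two 31-column segments are copied from the Cons. and Tv rows of Fig. 6, the start numbers 69 and 78 are assumptions inferred by counting along those rows rather than values given in the text, and `map_positions` is a hypothetical helper.

```python
def map_positions(aligned_a, aligned_b, start_a, start_b, gap="-"):
    """Pair residue numbers column by column in a gapped pairwise alignment.

    Returns {position in a: position in b} for every column in which both
    sequences carry a residue; columns with a gap in either row are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned rows must have the same number of columns")
    mapping = {}
    pos_a, pos_b = start_a, start_b
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a != gap and res_b != gap:
            mapping[pos_a] = pos_b
        if res_a != gap:
            pos_a += 1
        if res_b != gap:
            pos_b += 1
    return mapping


# Strands E-G as aligned in Fig. 6 (31 columns each).
PLASTOCYANIN_CONS_EG = "TYSVTLTEKGTYSFYCSPHQGAGMVGKVTVN"  # assumed to start at residue 69
AMICYANIN_TV_EG = "AYAITFNEAGSYDYFCTPHP--FMRGKVIVE"       # assumed to start at residue 78

if __name__ == "__main__":
    mapping = map_positions(PLASTOCYANIN_CONS_EG, AMICYANIN_TV_EG, 69, 78)
    for plastocyanin_pos in (84, 87, 92):
        print(plastocyanin_pos, "->", mapping[plastocyanin_pos])
```

With these assumed start numbers the three ligand positions map as stated in the text, and the two plastocyanin residues opposite the gap (89 and 90) have no amicyanin partner.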
Position 70 is occupied in 24 out of 25 sequences by a hydrophobic or aromatic residue (F, Y, or V), although the ASA of this residue is >25 Å². This position is therefore indicated by a broken circle in Fig. 8. Not surprisingly, as pointed out by Chothia and Lesk (21), the majority of the conserved residues in plastocyanin are found at positions where buried residues occur. It is known that sheet II is more conserved than sheet I in the small blue copper proteins (21), and this is reflected in the present case by the amount of conserved and semi-conserved residues in the two sheets in plastocyanin (65 versus 56%, respectively). The topology of the plastocyanin β-sheets has been superimposed on the primary T. versutus amicyanin structure in

² The abbreviations used are: ASA, accessible surface area; HPLC, high pressure liquid chromatography.

FIG. 8. Comparison of β-sheet structures of plastocyanin (top) and T. versutus amicyanin (bottom). Plastocyanin structure (19-21): asterisks and plus signs denote conserved and semi-conserved residues, respectively, as in Fig. 6. Heavy circles denote buried residues (ASA ≤ 25 Å²). The broken heavy circle (position 70) denotes a hydrophobic residue with an ASA >25 Å². Broken lines indicate hydrogen bridges. Amicyanin structure: asterisks denote residues identical to absolutely conserved plastocyanin residues, plus signs denote residues identical to semi-conserved plastocyanin residues or residues that are semi-conserved compared with conserved or semi-conserved positions in the plastocyanin consensus sequence. The plastocyanin pattern of heavy circles and one broken circle was superimposed on the amicyanin structure according to the alignment presented in Fig. 6, except for position 57 in the amicyanin sequence where the hydrophobic character of the residue is not conserved (see text). The arrows point to differences with the plastocyanin structure (see text).

Fig. 8 (bottom) by using the alignment of the primary structures in Fig. 6. Since, as mentioned above, strand A could not be identified in the amicyanin sequence, the positions belonging to this strand have been left blank in Fig. 8 (bottom). The buried character of the residues was assumed to be maintained when buried residues in the plastocyanin structure were conserved or conservatively replaced in the amicyanin sequence (heavy circles in Fig. 8). By analogy with position 70 (Tyr) in plastocyanin, position 79 (Tyr) in the amicyanin sequence has been indicated by a broken circle (sheet I). Moreover,


position 74 in plastocyanin, which is occupied by a buried leucine in 22 and by phenylalanine in 2 out of 25 sequences, is occupied by phenylalanine in the homologous position 83 of the amicyanin sequence. It is assumed to be buried in amicyanin as well. On similar grounds the buried character of the residue at position 36 in the amicyanin sequence has been assumed to be maintained.

As a general statement it can be said (a) that the alignment as proposed in Fig. 6 results in a strong homology of amicyanin residues with residues that are determinants of the β-sheet structure of plastocyanin, (b) that the positions of hydrophobic residues have been conserved in the amicyanin sequence, and (c) that strongly conserved residues in the plastocyanin sequence are found at similar positions in the amicyanin sequence. Of the four "*" and the seven "+" residues in sheet I of plastocyanin (not counting strand A), one * and six + residues are found in the amicyanin structure. For sheet II these numbers are seven * and eight + in plastocyanin, and six * and five + in amicyanin.

The positions where clear differences between plastocyanin and amicyanin occur have been marked by arrows in Fig. 8. In sheet I they are found at residues 47 (30), 49 (32), 76 (67), and 83 (74), plastocyanin sequence numbers being noted in parentheses. Position 47 (30) is semi-conserved in plastocyanin, but the residues found in the two amicyanin sequences at this position (Thr and Val) occasionally occur also in plastocyanin. Similarly the residue found at position 83 occasionally occurs at the (semi-conserved) homologous plastocyanin position (74). The real differences between plastocyanin and amicyanin seem to occur at positions 49 (32) and 76 (67). They are located at the rim of sheet I. For sheet II differences are restricted to positions 57 (40), 59 (42), 100 (93), and 104 (97). Again, only two of these are significant, namely those at positions 57 (40) and 59 (42). Also here they appear to be located on the rim of the sheet.

Although the β-sheets appear to be the fundamental secondary structure elements of the plastocyanins and the amicyanins, it is of interest to look at the loops connecting the β-strands. Because of the uncertain identity of the A-strand the AB loop could not be identified. The connection between strands B and B' is formed by a proline in plastocyanin (Pro16). In the alignment of Fig. 6 a proline occurs in the amicyanin sequences at the transition from strand B to B' but shifted by one position. If the plastocyanin/amicyanin alignment is correct, it is likely, therefore, that the BB' connection has a somewhat different conformation in amicyanin than in plastocyanin. The plastocyanin consensus for loop B'C is AAG. The glycine in this loop has an unusual configuration (φ = 78°, ψ = 5°) (21), which allows for a tight turn at this point and which is therefore conserved in all plastocyanins. It is found again in both amicyanins as the 3rd residue of the KAG sequence in the B'C loop in Fig. 6, indicating that the configuration of the B'C loop might be similar in plastocyanin and amicyanin. Loop CD is heavily conserved in plastocyanin and has the consensus sequence AGFPHNV. Its conserved character is explained by the crucial role this loop fulfills in maintaining the integrity of the copper-binding site. The histidine in this loop is a copper ligand, and the asparagine following it makes H bonds with the backbone around the copper site and with the sulfur of the ligand cysteine (19, 20). Moreover, the last 3 residues of the loop are buried, and packing considerations require their being conserved in the plastocyanins. The two amicyanin sequences in this region are EALPHNV and EVMPHNV, showing a complete identity at the last four positions with the plastocyanin sequence. In T. versutus amicyanin the histidine in this loop has indeed

been shown to be a ligand to the copper (17, 18). The configuration of this loop is therefore probably conserved also in the amicyanins. The long 25-residue-containing DE loop in plastocyanin includes the only helix-like structure of plastocyanin. The alignment in Fig. 6 shows eight deletions in this loop in the amicyanins. We have given above the arguments to support the idea that these deletions can be divided into two deletions of 5 and 3 residues, respectively. It may be expected that the overall conformation of this loop will be different from the plastocyanins. Loop EF in plastocyanin contains only solvent-accessible residues. Its characteristic feature seems to be its shortness (3 residues), which is maintained in the amicyanins. Loop FG, finally, is again strongly conserved in the plastocyanins. It contains the second His ligand of the copper. Moreover, the bordering residues on strands F and G, Cys84 and Met92, are also strongly conserved as copper ligands. The plastocyanin consensus sequence is SPHQGAG, the residues of which are all solvent-accessible. In both amicyanins the FG loop has the sequence TPHPF. The bordering residues are identical to those in plastocyanin (Cys and Met). They have been shown, as in plastocyanin, to be ligands to the copper, and likewise the His in the FG loop of amicyanin is a copper ligand. While the first 3 residues in the amicyanin FG loop show a strong similarity with the plastocyanin consensus, an important difference is the deletion of 2 residues at the end of the amicyanin loop. The FG loop must therefore have a different conformation in amicyanin from the one in plastocyanin. It will be interesting to see how, with three of the four copper ligands squeezed in on such a short polypeptide, the copper can still be adequately bound by four ligands in amicyanin.
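The loop-length differences described above can also be read directly from the residue-number pairs quoted in the text, because every residue missing from amicyanin lowers the offset between amicyanin and plastocyanin numbering. The following sketch is a worked check under stated assumptions rather than part of the original analysis: the pairs are the amicyanin (plastocyanin) positions quoted for the sheet differences and the copper ligands, and `offset_steps` is a hypothetical helper.

```python
# (amicyanin position, plastocyanin position) pairs quoted in the text.
PAIRS = [
    (47, 30), (49, 32), (57, 40), (59, 42),  # before loop DE
    (76, 67), (83, 74), (93, 84), (96, 87),  # between loops DE and FG
    (99, 92), (104, 97),                     # after loop FG
]


def offset_steps(pairs):
    """Return [(amicyanin position, drop)] wherever the numbering offset falls.

    The offset is amicyanin number minus plastocyanin number; a drop of n
    between two consecutive pairs means n fewer residues in amicyanin there.
    """
    ordered = sorted(pairs)
    steps = []
    previous = ordered[0][0] - ordered[0][1]
    for amicyanin_pos, plastocyanin_pos in ordered[1:]:
        current = amicyanin_pos - plastocyanin_pos
        if current != previous:
            steps.append((amicyanin_pos, previous - current))
            previous = current
    return steps


# Spacing between the second histidine ligand and the methionine ligand.
HIS_MET_SPACING = {"amicyanin": 99 - 96, "plastocyanin": 92 - 87}

if __name__ == "__main__":
    print(offset_steps(PAIRS))
    print(HIS_MET_SPACING)
```

The offset falls from 17 to 9 across loop DE and from 9 to 7 across loop FG, which corresponds to the eight deletions and the two deletions described in the text; the His-Met spacing is likewise two residues shorter in amicyanin.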
In this connection, it is interesting to note that the differences in the lengths of loops DE and FG, and the sequence nonidentities in strand D (amicyanin residues 57 and 59) and at the ends of E and C (amicyanin residues 49 and 76) may be interrelated (see Fig. 8). Together, these differences indicate that the structure of the end of the molecule containing the metal-binding site may be rather different from either plastocyanin or azurin. The functional similarity to plastocyanin with regard to histidine titration and pH effect on redox potential is therefore all the more interesting. Further details concerning the active site structure must await the outcome of structural studies.

CONCLUSION

In conclusion, alignment of the amicyanin sequence with the plastocyanin consensus sequence in the manner of Fig. 6 allows the recognition of a number of secondary structural features of plastocyanin. Six out of seven β-strands could be identified. The structure of the β-sheets I and II appears to be conserved. Differences from plastocyanin seem restricted to the rims of the sheets. Sheet II shows a better homology with plastocyanin than sheet I, in line with the general pattern observed for blue copper proteins that sheet II always seems to be better conserved than sheet I (21). The variations in side chain volumes due to changes in composition of sheet I are usually accommodated by slight repacking and repositioning of the sheets with respect to each other. Amicyanin probably fits this general trend. Of the five loops that connect the six β-strands identified in amicyanin, the loops B'C, CD, and EF may have conformations similar to plastocyanin. Loop FG, which contains a ligand histidine in the middle and the two S ligands of the copper (Cys and Met) at the end where it joins strands F and G, must have a different conformation from plastocyanin. The same applies to loop DE, which is much shorter in the amicyanins than in plastocyanin. The 29

Homology of Thiobacillus Amicyaninto Plastocyanin

men, J., and Canters, G. W. (1988) J . Mol. Biol. 199, 545-546 9. Vellieux, F. M. D., Huitema, F., Groendijk, H., Kalk,K. H., J. A,, Duine, J. A,, Petratos,K., Frank,Jzn.,J.,Jongejan, Drenth, J., and Hol., W. G. J . (1989) E M B O J. 8, 2171-2178 10. Crestfield, A.M., Stein, W. H., and Moore, S. (1963) J. Biol. Chem. 238, 2413-2420 11. Jnglis, A. S. (1983) Methods Enzymol. 91, 324-332 12. Hunkapiller, M. W., and Hood, L. E. (1983) Methods Enzymol. 91,486-493 13. VanBeeumen, J. (1986) in AdvancedMethodsinProteinSequence Analysis(Wittmann-Liehold, B., Salnikow, J., and Erdmann, Y. A., eds) pp. 256-264, Springer-Verlag, Berlin 14. Jenkins, O., Byron, D., and Jones, D. (1984) in Microbial Growth on C1 Compounds (Crawford, R. L., and Harrison, R. S., eds) pp. 255-261, American Society for Microbiology, Washington 15. Ryden, L. (1984) in Copper Proteins and Evolution of the Small Blue Proteins (Lontie, R., ed) pp. 157-182, CRC Press Inc., Boca Raton, FL 16. Katoh, S., Shirotori, I., andTakamija, S. (1962) J . Biochem. (Tokyo) 51, 32-40 17. Lommen, A., Canters, G. W., and van Beeumen, J. (1988) Eur. J. Biochem. 176,213-223 18. Lommen, A., andCanters, G. W. (1990) J. Biol. Chem. 265, 2768-2774 19. Colman, P. M., Freeman, H. C., Guss, J. M., Murata, M., Norris, V. A., Ramshaw, J. A. M., and Venkatappa, M.P. (1978) Nature 272,319-324 20. Guss, J. M., and Freeman, H. C. (1983) J . Mol. Biol. 169, 521563 21. Chothia, C., and Lesk, A. M. (1982) J . Mol. Biol. 160, 309-323 22. Yano, H., Kamo, M., Tsugita, A,, Aco, K., and Nozu, Y. (1989) Protein Sequence 8 Data Anal. 2, 385-389 23. Sykes, A. G. (1985) Chem. Soc. Rev. 14, 283-315 24. McGinnis, J., Sinclair-Day, J. D., Sykes, A. G., Powls, R., Moore, J., and Wright, P. E. (1988) Znorg. Chem. 27, 2306-2312 25. Shoji, A,, Yoshizaki, F., Karahashi, A,, Sugimura, Y., and Shimokoviyama, M. (1985) Seikagaku 57, 1036

residues that precede strand B in T. versutus amicyanin ( 2 2 residues in Pseudomonas AM1) could not be aligned properly with the plastocyanin consensus. Very likely they contain a p-strand comparable to strand A of plastocyanin since this strand is central to sheet I in the plant protein. The correct assignment of strand A, loop AB, and the N-terminal tailof amicyanin is difficult to make, but the differences in length and in sequenceimply furtherstructural differences with respect to plastocyanin. Despite the differencesbetween the amicyanins and the plastocyanins, noted here, it is clear that there is a much closer resemblance of the amicyanins with the plastocyanins than with the azurins. A similar conclusion has been reached before (17, 18) on the basis of a study of the effect of pH on the structure and redox activity of amicyanin from T. uersutus. The present results confirm that the occurrence of blue copperproteins with plastocyanin-likefeaturesisnotrestricted toredox chains of photosynthetic organisms. REFERENCES 1. Taylor, B. F., and Hoare, D. S. (1969) J. Bacteriol. 100, 487-497 2. Harrison, A. P. (1983) Znt. J. Syst. Bacteriol. 33, 211-217 3. van Houwelingen, T., Canters, G. W., Stobbelaar, G., Duine, J. A., Frank, Jzn., J., and Tsugita, A. (1985) Eur. J. Biochem. 153,75-80 4. Tohari, J., and Harada, Y. (1981) Biochem. Biophys. Res. Commun. 101,502-508 5. Husain, M., and Davidson, V. L.(1985) J . Biol. Chem. 260, 14626-14629 6. Husain, M., and Davidson, V. L. (1986) Biochemistry 25, 24312436 7. Ambler, R. P., and Tobari, J. (1985) Biochem. J . 232, 451-457 8. Petratos, K., Dauter, Z., Wilson, K. S., Lommen, A,, Van Beeu-

SUPPLEMENTAL MATERIAL TO

THE STRUCTURAL HOMOLOGY OF AMICYANIN FROM THIOBACILLUS VERSUTUS TO PLANT PLASTOCYANINS

by

JOZEF VAN BEEUMEN, STEFAAN VAN BUN, GERARD W. CANTERS, ARJEN LOMMEN AND CYRUS CHOTHIA

METHODS

The starting preparation of the amicyanin: The amicyanin was prepared according to the procedure described in [3]. The sequence studies were started from 750 nanomoles of protein. SDS-PAGE revealed a molecular weight of 13.8 kD.

Preliminary modification of the native protein: The amicyanin was freed from copper according to [3]. The resulting apoprotein was carboxymethylated with iodoacetic acid according to Crestfield [10] after reduction with a 10-fold excess of dithiothreitol. Salts were removed by gel filtration on a column of Sephadex G-25 fine (22 x 1.5 cm) and elution with 5% formic acid. The modified protein was lyophilized but could easily be redissolved in doubly distilled water prior to being digested.

Enzymatic and chemical cleavages: Digestion of 100 nanomoles of the carboxymethylated protein was carried out in 70 mM ammonium acetate, pH 4.0, during 4 hours at 37°C with an E(nzyme)/S(ubstrate) ratio of 1/40. The enzyme was from Miles (Slough, UK). A cleavage with the Arg-C protease from submaxillary glands (Boehringer, BRD) was likewise performed on 100 nanomoles of the derivatized protein at an E/S ratio of 1/40, at pH 8.4 in 60 mM ammonium bicarbonate, during 1 hour at 37°C. The chymotryptic digest on peptide S9 was carried out on 10 nanomoles of peptide in the same bicarbonate buffer using chymotrypsin supplemented with soybean trypsin inhibitor. The digestion time was 3 hours at 37°C. The partial acid hydrolysis of the amicyanin was carried out on 50 nanomoles of protein according to Inglis [11] by adding 200 µl of 2.5% formic acid to lyophilized protein. The cleavage was performed at 106°C during 4 hours in an Eppendorf tube. The digest with the enzyme pyroglutamate aminopeptidase from calf liver (Boehringer) was carried out on 5 nanomoles of the carboxymethylated amicyanin using an E/S ratio of 1/8 in the presence of 5% glycerol, 10 mM EDTA, 5 mM dithiothreitol and 100 mM phosphate buffer, pH 8 (all final concentrations). The incubation was performed during 20 hours at 30°C under argon.

Peptide purification: The peptides were separated by reverse-phase high performance liquid chromatography on a Vydac 214TP54 column of 4.6 x 250 mm (Vydac, Hesperia, USA) in a chromatographic set-up consisting of a DuPont three-headed piston pump, a 8800 DuPont Gradient Controller, a DuPont variable wavelength detector set at 220 nm, and a 7120 Rheodyne injection valve with a loop of 100 µl. Solvent A was 0.1% trifluoroacetic acid in Milli-Q water; solvent B contained 0.1% trifluoroacetic acid in 100% acetonitrile, HPLC grade (Fisher, USA). The course of the gradient is drawn on Figs. 2-4. Peptide fractions were collected manually in polypropylene tubes of 1x1; cm (Kaneil, BRD), dried in a Speed Vac Concentrator (Savant, USA), and stored at -18°C. They were redissolved in 0.1% acetic acid prior to sequence and/or amino acid analysis.


Amino acid analysis: The earliest samples of protein and peptides were analyzed by ion-exchange liquid chromatography and a post-column ninhydrin reaction using a Multichrom 4255 Analyser (Beckman, USA). The preceding hydrolysis of the samples was carried out with hydrochloric acid by using equal volumes of concentrated Aristar HCl (BDH, UK) and Milli-Q water supplemented with phenol and mercaptoethanol. The hydrolyses were performed in vacuo, and after 24 hours of hydrolysis at 106°C the acid was removed in an exsiccator over sodium hydroxide pellets. In a later stage of the work we also used the precolumn derivatization method with phenylisothiocyanate. The derivatization was carried out on a 420A Derivatizer (Applied Biosystems Inc., USA) with separation of the phenylthiocarbamyl derivatives on a PTC column of 2.1 x 220 mm (Applied Biosystems, USA) in a 130A Separation System (Applied Biosystems). The peptides analyzed in this way were hydrolyzed in the gas phase by using the same acidic solution as mentioned above. The precolumn method was also used to analyse the amino acids released by carboxypeptidase (see further).

Sequence analysis: Automated sequence analysis was carried out in the earlier stages of the work in a 470A gas-phase sequenator (Applied Biosystems) with off-line analysis of the PTH-amino acids on an IBM cyanopropyl column of 4.6 x 252 mm (Biotech, UK). The samples obtained from the sequencer were dried and methylated according to [12] in order to bring the PTH-derivatives of Asp and Glu out of the initial polar region of the chromatograms. The chromatographic installation consisted of two Waters 6000A pumps, a 720 Waters System Controller and a fixed wavelength Waters 440 detector set at both 254 and 313 nm. Injections were done automatically in a Waters 710B WISP sample injector.


C-terminal sequence: The evidence for the fact that the polypeptide chain of amicyanin is not longer than the 106 residues given in Fig. 1 is, firstly, that the Edman degradation of the peptides R2 and H7 did not allow detection of an amino acid beyond the glutamic acid residue, and secondly, that the amino acid composition of the calculated sequence fits the proposal very well (Table I). A third argument is the release of the amino acids up to Met99 when the amicyanin is incubated with a mixture of carboxypeptidases-Y and -P at an E/S ratio of 1/50 for each enzyme. The quantitative results are shown in Fig. S.I. An experiment using only carboxypeptidase-Y at pH 5.5 and an E/S ratio of 1/100 was not successful, as it released the amino acids only slowly after at least 1 hour of incubation.

RESULTS

Sequence analysis of native and carboxymethylated protein: The amicyanin could not be sequenced by Edman degradation, not even with a sample load of 5 nanomoles. The run was repeated on the carboxymethylated protein starting from 1 nanomole, but also here no sequence information could be obtained. Since we assumed that the failure of the Edman degradation was due to the presence of pyroglutamic acid at the first position, we treated the protein with pyroglutamate aminopeptidase. The experiments were carried out at a stage of the work when the sequence of most of the peptides obtained from the different digests was already determined (see hereafter). After 30 hours of incubation with the enzyme the amicyanin gave a very clear-cut sequence starting with Asp (Table S.I), so that we conclude that the N-terminal sequence was pyroGlu-Asp-. The sequence run was stopped deliberately at cycle 24, as at that moment we noted a 5-residue overlap with peptide H11.
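The 5-residue overlap with peptide H11 follows from simple position arithmetic. The sketch below is illustrative and not part of the original work; it assumes the sequence printed in the abstract, a run that starts at Asp2 once the pyroglutamate is removed, and the start of H11 at Ala21 as reported for the partial acid hydrolysis. The function name is hypothetical.

```python
# Sequence of T. versutus amicyanin as printed in the abstract
# (residue 1, written Q, is pyroglutamate in the mature protein).
AMICYANIN = (
    "QDKITVTSEKPVAAADVPA"
    "DAVVVGIEKMKYLTPEVTIKAGETVYWVNGEVM"
    "PHNVAFKKGIVGEDAFRGEMMTKDQAYAITFNE"
    "AGSYDYFCTPHPFMRGKVIVE"
)


def edman_read(seq, first_residue, cycles):
    """Residues read by `cycles` Edman steps starting at a 1-based position."""
    return seq[first_residue - 1:first_residue - 1 + cycles]


# After removal of pyroGlu1 the run starts at Asp2 and was stopped at cycle 24.
read = edman_read(AMICYANIN, 2, 24)
last_position = 2 + 24 - 1

# Peptide H11 starts at Ala21; the overlap is what both cover.
h11_start = 21
overlap = AMICYANIN[h11_start - 1:last_position]

print(read, last_position, overlap, len(overlap))
```

With these assumptions the run ends at residue 25 and shares the stretch AVVVG with H11.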

Table S.I.

Amino acid composition of the apo-amicyanin: Although the amino acid composition of the protein was published earlier [3], we repeated the analyses on the sample used for the present sequence determination. The data in Table I were recalculated after the sequence determination was completed. The value of cysteic acid from the oxidized sample was below standard, and the peak of the carboxymethylcysteine was even 5 times smaller than could be expected from the presence of 1 cysteine residue. The sequence determination revealed unambiguously, however, the presence of 1 cysteine residue at position 93 of the polypeptide chain. The other possible copper-binding residues, His and Met, were present for respectively 2 and 4 units. Furthermore, the composition revealed that cleavage of the protein should yield only a limited number of peptides when Staphylococcus aureus protease, Arg-C protease and partial acid hydrolysis would be used.

Peptides from the S. aureus digest: The major peak, S6 (Fig. 2), was the first one we sequenced. Although the amino acid composition of the peptide suggested a total of 28 or 29 residues (Table S.II), the sequence analysis (Table S.III) showed S6 to be 30 residues long with a tryptophan residue, the only one in the protein, at position 11. The off-line PTH-analysis on the cyanopropyl column allowed an unambiguous identification of the PTH-Trp, given its isolated elution time. The peptide apparently showed 2 Glu-X bonds which were not cleaved, one where X=Thr and the second where X=Val. Another major peptide in the HPLC separation, S9, could not be sequenced and was therefore considered to be the blocked N-terminal peptide. Peptide S9 was subdigested with chymotrypsin in a later stage of the work. The HPLC separation resulted in two fractions (Fig. 5), of which the first one (S9C1) contained a hexapeptide with the sequence MetTyrLeuThrProGlu. The second one could not be sequenced. The hexapeptide must have arisen from the cleavage of the Met29-Lys30 bond. It should be noted that peptide S9 also contained two uncleaved Glu-X bonds, X being Lys in both cases.

The first important peak in the separation chromatogram, S1, appeared to contain a peptide of 14 residues, which could be sequenced up to the last (glutamic acid) residue. At cycle 15 there was still a further increase in the amount of PTH-Glu, but we have experienced many times a carryover of this residue when it is in C-terminal position. The peptide in the unresolved doublet S3/S4 (Fig. 2) was found to have the same N-terminal sequence as peptide S6; the sequence was not followed further than residue 10, as at that point it became clear that, as in S6, the Glu42-Thr43 bond had not been cleaved by the protease. The amino acid composition revealed that S3 had very likely the same sequence as S6, although some values were slightly different from the values following from this proposal. Peptide S4, apart from containing peptide S3 as expected, also contained the fragment Thr43-Glu65, which must have originated from a minor cleavage of the Glu42-Thr43 bond. For obvious reasons no amino acid analysis was carried out on S4.

Peptide S5 was the only one which showed a definite carboxymethylcysteine peak in the amino acid analysis. For lack of a reference product we could not precisely quantify the amount of this residue, but the peak area was nearly 3/4 of that of aspartic acid, which eluted just after the CM-Cys. We therefore assumed the presence of one such residue in peptide S5. This assumption was confirmed upon sequence analysis, where the PTH-derivative of methylated CM-Cys was found at position 8. We confirmed the identity by repeating the PTH-analysis under slightly alkaline conditions as described in [13]. The Edman degradation of S5 was fairly poor, however, and did not proceed well beyond step 10, probably due to the carryover starting at Ser3 of the peptide.

The only other S. aureus peptide needed to fit the proposed sequence, covering the region Asp66-Glu71, was found as the second peptide in the mixture S2. Also here it was found that the Glu-X bond had not been cleaved, with X=Met this time. The major component of S2 had the same initial sequence as peptide S6 and probably covered the region Val36-Glu50, as no PTH-Val or -Met could be detected anymore at the Edman cycles 16 and 17. A much purer peptide, starting off with residue Asp66, appeared to be S8. The sequence run was not continued after cycle 3 because at that moment peptides from another digest (see hereafter) had already established this region of the sequence. Peptide S7 appeared to be pure upon Edman degradation, with an initial sequence identical to that of S5. Also, the amino acid composition revealed this identity, although some values, such as those for Pro, Ala, His and Arg, were below standard. We noticed hardly any CM-Cys, which suggests that the different elution time of S7 and S5 during HPLC separation was mainly due to a different degree of modification of Cys-8 of the peptides. We also found the same initial sequence as for S5 upon sequencing the first of the doublet peaks eluted between S2 and S3 (results not shown). Curiously, the second peptide contained in this peak had the sequence ValAlaPheLys (results not shown), which must have arisen from an AsnVal cleavage at position 55. The yield of this peptide could not have been higher than 200 picomoles, however, assuming an initial sequence yield of 30%.
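Several of the statements about the S. aureus peptides can be cross-checked against the sequence given in the abstract. The following is a minimal sketch, not part of the original work: the helper names are hypothetical, and the peptide boundaries (S6 = Val36-Glu65, S5 = Ala86-Glu106, S9 = pGlu1-Glu35) are assumed from the positions quoted in the text and tables.

```python
# Sequence of T. versutus amicyanin as printed in the abstract
# (residue 1, written Q, is pyroglutamate in the mature protein).
AMICYANIN = (
    "QDKITVTSEKPVAAADVPA"
    "DAVVVGIEKMKYLTPEVTIKAGETVYWVNGEVM"
    "PHNVAFKKGIVGEDAFRGEMMTKDQAYAITFNE"
    "AGSYDYFCTPHPFMRGKVIVE"
)


def peptide(seq, start, end):
    """Return residues start..end (1-based, inclusive)."""
    return seq[start - 1:end]


def glu_sites(seq):
    """1-based positions of all Glu residues (potential cleavage sites)."""
    return [i + 1 for i, aa in enumerate(seq) if aa == "E"]


def internal_glu_bonds(seq, start, end):
    """(Glu position, following residue) for Glu-X bonds inside a peptide."""
    return [(p, seq[p]) for p in range(start, end) if seq[p - 1] == "E"]


S6 = peptide(AMICYANIN, 36, 65)
S5 = peptide(AMICYANIN, 86, 106)

print(glu_sites(AMICYANIN))
print(len(S6), S6[10])                        # length of S6, residue at position 11
print(S5[7])                                  # residue at position 8 of S5
print(internal_glu_bonds(AMICYANIN, 36, 65))  # uncleaved Glu-X bonds in S6
print(internal_glu_bonds(AMICYANIN, 1, 35))   # uncleaved Glu-X bonds in S9
```

Under these assumptions S6 is 30 residues long with Trp at position 11 and internal Glu-Thr and Glu-Val bonds, S5 carries Cys at position 8, and S9 contains two Glu-Lys bonds, in agreement with the text.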

Peptides from the Arg-C protease digest: Although there are only 2 arginine residues in the protein, we obtained more than 10 major peptides upon cleavage of the carboxymethylated protein with the Arg-C protease (Fig. 3). Most of these peptides originated from cleavage of Lys-X bonds, with X=Lys (for R4), Tyr (for R6) or Ala (for R7, R8A, R9, R10 and R12). The latter five peptides have the same N-terminal sequence (Table S.IV), and amino acid analysis (Table S.V) revealed that certainly R7, R8 and R12 should be equal in length, with Lys-59 as C-terminus. The two 'regular' peptides originating from Arg-X bond cleavage were R15A and R2, each with X=Gly. R15A yielded an overlap between peptides S1 and S5, and R2 appeared later to be the C-terminal R-peptide of the protein. We also noticed the remarkable feature of cleavage at the N-terminal side of basic residues: e.g., peptide R2A originated from a Met-Arg, peptide R5 from a Met-Lys and peptide R1 from a Gly-Lys bond cleavage (quantitative results not shown). Even more surprising was that peptide R15B started at Ser8 due to a cleavage from the preceding threonine residue. Anyway, the lack of specificity of the Arg-C protease was in a certain way beneficial for the sequence determination, as it provided, amongst others, an overlap between the peptides S6 and S8. Peptide R14, because of lack of Edman degradation, is very likely the N-terminal R-peptide, but the amino acid analysis was not good enough to say whether the peptide ends at Lys59 or at Arg69.

Partial acid hydrolysis: The peptide contained in the major peak H11 of the HPLC separation chromatogram (Fig. 4) was apparently also the longest H-peptide of the protein and started at Ala21 (Table S.VI). In spite of its purity, the amino acid composition (Table S.II) does not allow one to conclude clearly whether the peptide contained 46 residues, as would be expected from the proposal in Fig. 1, or only 35, in which case the C-terminal residue would have been a deamidated form of Asn-55. The peak second in height, H7, appeared to be the C-terminal H-peptide, with unambiguous identification of the carboxymethylcysteine at cycle 3. Also here, in spite of the unequivocal sequence, the amino acid composition was substandard for Phe, Glu, Ala and Asx, and especially for Gly. Other H-peptides were not analyzed for composition. Sequence analysis revealed that peptides H6 and H8 had the same N-terminal sequence as H11, and that peptide H9 had the same sequence as the C-terminal peptide H7. The difference in elution time of the latter peptides during the HPLC purification must have been due to a different degree of modification of the cysteine residue. We apparently missed the peptide covering the sequence Gln77-Asp90. It is possible that this peptide was contained in H4, as no sequence information was obtained upon analysis of even all of the material, a result to be expected if the N-terminal glutamine had been cyclized to pyroglutamic acid.
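The bonds named for the Arg-C digest, and the two candidate lengths of H11, can be located in the sequence given in the abstract. This is a hedged sketch rather than the authors' procedure; `bond_positions` is a hypothetical helper, and the pairs searched are the bond types mentioned in the text (Arg-Gly, Lys-Lys, Lys-Tyr, Lys-Ala, Met-Arg, Met-Lys, Gly-Lys, Thr-Ser).

```python
# Sequence of T. versutus amicyanin as printed in the abstract.
AMICYANIN = (
    "QDKITVTSEKPVAAADVPA"
    "DAVVVGIEKMKYLTPEVTIKAGETVYWVNGEVM"
    "PHNVAFKKGIVGEDAFRGEMMTKDQAYAITFNE"
    "AGSYDYFCTPHPFMRGKVIVE"
)


def bond_positions(seq, pair):
    """1-based positions of the first residue of each occurrence of a dipeptide."""
    return [i + 1 for i in range(len(seq) - 1) if seq[i:i + 2] == pair]


# Bond types reported as cleaved in the Arg-C digest.
for pair in ("RG", "KK", "KY", "KA", "MR", "MK", "GK", "TS"):
    print(pair, bond_positions(AMICYANIN, pair))

# Peptide H11 starts at Ala21 and ends either at Asp66 or at Asn55.
h11_length_to_asp66 = 66 - 21 + 1
h11_length_to_asn55 = 55 - 21 + 1
print(h11_length_to_asp66, h11_length_to_asn55)
```

Each of the bond types occurs only once or twice in the sequence, which is why the unexpected cleavages could be placed unambiguously; the two H11 lengths come out as 46 and 35 residues, as stated in the text.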

Table S.I. Sequence run on amicyanin treated with pyroglutamate aminopeptidase. Increases of amounts of PTH-amino acids are in picomoles. Columns: cycle, PTH-amino acid, amount. [Cycle-by-cycle values not recoverable.]

Table S.II. Amino acid composition of peptides S1, S3, S5, S6, S7, S9, H7 and H11. Values for CM-Cys in S5 and H7 were not quantified (see text). Rows: CMC, ASP, THR, SER, GLU, PRO, GLY, ALA, VAL, MET, ILE, LEU, TYR, PHE, HIS, LYS, TRP, ARG, amount (nmoles), total yield (nmoles), sequence position. Sequence positions legible in the table: M72-E85, V36-E65, A86-E106, pE1-E35, Y91-E106, A21-D66. [Numerical values not recoverable.]

Table S.III. Sequence analysis of the peptides from the S. aureus protease digest. Columns: Edman cycle, followed by PTH-amino acid and amount for each peptide. [Table body not recoverable.]

Table S.IV. Sequence analysis of the peptides obtained by Arg-C protease digestion. Legend as for Table S.III. The doubly underlined peptides are shown in Fig. 1. [Table body not recoverable.]

Table S.V. Amino acid composition of peptides from T. versutus amicyanin obtained by digestion with Arg-C protease (peptides R4 to R14). Values in parentheses are deduced from the sequence. A dash indicates values less than 0.2. The amounts mentioned underneath the table represent the number of nanomoles used for the amino acid analysis. [Numerical values not recoverable.]

Table S.VI. Sequence analysis of the peptides from the partial acid hydrolysis. [Table body not recoverable.]

Figure S.I. Release of amino acids upon incubation of 1.5 nanomoles of carboxymethylated amicyanin with a mixture of the carboxypeptidases-P and -Y. Samples of one tenth of the initial reaction mixture were subjected to amino acid analysis (see Methods). Abscissa: log time (min).