Eur. J. Biochem. 162, 485-491 (1987) © FEBS 1987
Soybean hydrophobic protein
Isolation, partial characterization and the complete primary structure
Shoji ODANI, Takehiko KOIDE, Teruo ONO, Yasuhiko SETO and Tatsuo TANAKA
Department of Biochemistry, Niigata University School of Medicine, Niigata The Peptide Institute (the Protein Research Foundation), Minoh, Osaka Department of Biochemistry, Yamagata University School of Medicine, Yamagata
(Received August 22/October 21, 1986) - EJB 86 0951
A 9000-Mr protein isolated from a 60% ethanolic extract of soybean (Glycine max) seeds has been characterized and fully sequenced. The protein consists of 80 amino acid residues with four disulfide bonds. It contains a large number of hydrophobic residues and lacks methionine, phenylalanine, tryptophan, lysine and histidine residues. The protein readily crystallizes from water but is quite soluble in aqueous organic solvents such as 95% 1-propanol. It aggregates to form large molecules (above 80 kDa) under ordinary denaturing conditions, such as 6 M guanidine·HCl and 8 M urea. Sequence analysis showed that the amino-terminal four-fifths is extremely hydrophobic, with most of the acidic residues present as their amide forms, and that only the short carboxyl-terminal segment is rather hydrophilic. A computer search for homology detected an unexpected similarity of this protein to rat prolactin; however, its significance could not be assessed and this protein appears to represent a hitherto unknown protein family. Although no biochemical activity could be detected, the existence in relatively high abundance (approx. 200 mg from 1 kg seeds) of this novel protein may suggest its physiological significance in the plant.

Plant seeds contain a large number of proteins of known and unknown physiological function. Some of them are very hydrophobic and can be extracted by aqueous organic solvents. An extreme example of such hydrophobic proteins is crambin, which is extracted from seeds of Crambe abyssinica (Abyssinian cabbage, a relative of mustard) by 80% aqueous acetone and crystallizes upon evaporation of the solvent [1]. Recently its complete primary structure [2] and three-dimensional structure at 0.15 nm resolution [3] have been reported. Another recent example is a group of cereal endosperm proteins extracted by chloroform/methanol [4, 5].
During the course of examination of proteinase inhibitors in soybean (Glycine max) seeds we found a similar hydrophobic, easily crystallizable protein in a 60% ethanolic extract of the ground beans. Although no biological activity of this protein could be identified, the protein has an unusual amino acid composition and interesting physicochemical properties and may serve well as a model of hydrophobic proteins. Therefore, we determined its amino acid sequence, which may be a prerequisite for understanding its properties and also, we hope, its biological functions. This report describes the isolation, partial characterization and complete primary structure of the soybean hydrophobic protein.

MATERIALS AND METHODS

Materials
Soybean seeds were finely ground and stored at -20°C. DEAE-cellulose and carboxymethyl (CM)-cellulose (DE-32 and CM-32) were products of Whatman Chemical Separation Ltd (Kent, UK). Reagents for Edman degradation were sequence-analysis-grade preparations of Wako Pure Chemicals (Osaka, Japan). Solvents for high-performance liquid chromatography (HPLC) were products of Kanto Chemical Corp. (Tokyo, Japan). Bovine trypsin, elastase and porcine pancreatic α-amylase were purchased from Cooper Medical (Freehold, USA). α-Amylase from Aspergillus oryzae was a kind gift of Dr Sumihiro Hase (Osaka University).

Correspondence to S. Odani, Department of Biochemistry, Niigata University School of Medicine, Asahimachi 1, Niigata, Japan 951
Enzymes. α-Chymotrypsin (EC 3.4.21.1); trypsin (EC 3.4.21.4); elastase (EC 3.4.21.36); papain (EC 3.4.22.2); triacylglycerol lipase (EC 3.1.1.3); α-amylase (EC 3.2.1.1).

Extraction
Finely ground soybeans (1 kg) were extracted with 3 l 60% ethanol at room temperature for 15 h, then the soybean flour was removed by filtration. The filtrate was brought to pH 4 with acetic acid and 6 l cold acetone (-20°C) was added to it. This mixture was kept at 4°C overnight. The supernatant was removed by decantation and the sticky precipitate formed was collected, dissolved in 200 ml water and dialyzed against 10 mM sodium acetate, pH 4.0, at 4°C.
CM-cellulose chromatography
The dialyzed solution described above was applied to a column (2.5 x 40 cm) equilibrated with 10 mM sodium acetate buffer, pH 4.0, and eluted with a linear NaCl gradient to 0.5 M (total volume, 1600 ml). Fractions containing proteins were pooled and dialysed against 0.05 M ammonium acetate buffer, pH 6.5.

DEAE-cellulose chromatography
The dialysed CM-cellulose fractions were applied to a column of DEAE-cellulose (1 x 25 cm) equilibrated with 0.05 M ammonium acetate, pH 6.5, and eluted with the same buffer. Fractions that passed through the column were pooled.
Crystallization of the hydrophobic protein
The pass-through fraction from DEAE-cellulose was kept at room temperature for 2 days. Crystals were collected, washed three times with water and lyophilized.
Reverse-phase chromatography
Crystalline protein was further separated into two components by reverse-phase high-performance liquid chromatography on a Merck RP-8 (octylsilica) column (0.4 x 25 cm) in 10 mM NH4HCO3, pH 7.0, using a linear gradient of acetonitrile concentration from 1% to 70%. Peptides were separated by a Toyo Soda ODS 120T (octadecylsilica) column (0.45 x 25 cm) using a similar acetonitrile gradient in 0.1% trifluoroacetic acid. A Hitachi 638-30 liquid chromatograph was used throughout.
Amino acid sequence analysis
The amino acid sequence of protein and peptide samples (20-300 nmol) was determined by a Jeol JAS 47K sequenator [11] using 0.1 M N,N,N',N'-tetrakis(2-hydroxypropyl)ethylenediamine buffer [12] and 1,5-dimethyl-1,5-diaza-undecamethylene polymethobromide (Polybrene) [13]. Phenylthiohydantoin derivatives were identified by a Hitachi 655 liquid chromatograph equipped with a Rainin octadecylsilica column (0.46 x 25 cm, Rainin Instrument, Woburn, USA). Isocratic elution with 42% acetonitrile/58% sodium acetate buffer (10 mM, pH 4.5) was used. Phenylthiohydantoin derivatives were quantified by a Hitachi 655-60 data processor calibrated with a Pierce phenylthiohydantoin amino acid kit (Pierce, Rockford, Illinois, USA).
Chemical modification of disulfide bonds
Disulfide bonds were cleaved by performic acid oxidation [8], or by reduction with dithiothreitol followed by carbamoylmethylation with iodoacetamide [14].
Analytical gel filtration
A TSK G 3000SW gel permeation chromatography column (Toyo Soda, Tokyo, Japan) was equilibrated with 6 M guanidine·HCl in 10 mM phosphate buffer, pH 6.5 (1 mM EDTA), connected to a Hitachi 638 liquid chromatograph and eluted with the same buffer at a flow rate of 0.5 ml/min according to the method of Ui [6]. Protein samples were incubated in 5% 2-mercaptoethanol in 6 M guanidine·HCl (pH 6.5) at 100°C for 5 min or more before application to the column. Elution was monitored by absorbance at 280 nm. In some instances elution was conducted in 0.05 M phosphate buffer, pH 7.0, containing 0.1 M sodium citrate or in 0.1% trifluoroacetic acid containing 35% acetonitrile [7]. The column was calibrated with reduced and S-carbamoylmethylated standard proteins. Blue dextran 2000 (Pharmacia, Uppsala, Sweden) and dinitrophenyl-alanine were used to determine $V_0$ (void volume) and $V_t$ (total available volume) of the column, respectively. $M_r^{0.555}$ was plotted against $K_d^{1/3}$, where $K_d$ (distribution coefficient) $= (V - V_0)/(V_t - V_0)$ and $V$ is the elution volume of a protein [6].
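The calibration described in this section can be sketched in a few lines of Python. This is only a minimal illustration, assuming an ordinary least-squares line through the standards. The function names (`distribution_coefficient`, `fit_line`, `calibrate`, `estimate_mr`) and every elution volume in the usage example are hypothetical and are not taken from the paper.

```python
# Sketch of the gel-filtration calibration: Mr**0.555 is fitted as a linear
# function of Kd**(1/3). Function names and the example volumes are
# illustrative assumptions, not values from the paper.


def distribution_coefficient(v, v0, vt):
    """Kd = (V - V0) / (Vt - V0)."""
    return (v - v0) / (vt - v0)


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(standards, v0, vt):
    """standards: list of (Mr, elution volume) pairs for reference proteins."""
    xs = [distribution_coefficient(v, v0, vt) ** (1 / 3) for _, v in standards]
    ys = [mr ** 0.555 for mr, _ in standards]
    return fit_line(xs, ys)


def estimate_mr(v, v0, vt, slope, intercept):
    """Read the Mr of an unknown off the calibration line."""
    x = distribution_coefficient(v, v0, vt) ** (1 / 3)
    return (slope * x + intercept) ** (1 / 0.555)


if __name__ == "__main__":
    # Hypothetical elution volumes (ml) for three standards and one unknown.
    v0, vt = 10.0, 25.0
    standards = [(65000, 13.1), (20100, 16.4), (3420, 20.9)]
    slope, intercept = calibrate(standards, v0, vt)
    print(round(estimate_mr(18.0, v0, vt, slope, intercept)))
```

A sample that elutes near the void volume has a distribution coefficient close to zero, so the fitted line returns a very large apparent Mr for it; this is how aggregation would show up in such a plot.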
DEAE-Sephadex A-25 chromatography
Peptides generated by enzymatic hydrolysis of the soybean hydrophobic proteins were separated on a DEAE-Sephadex A-25 column (0.25 x 50 cm) equilibrated with 20 mM NH4HCO3, pH 8.0. A linear gradient of NH4HCO3 concentration from 20 mM to 800 mM (total volume, 500 ml) was used for elution. The column was then washed with 1.0 M NH4HCO3. Peptides were detected by absorbance at 230 nm.
Amino acid analysis
Peptide and protein samples (10-50 nmol) were hydrolysed with 5.7 M HCl for 22 h at 110°C under vacuum, and amino acids were determined by a Jeol JLC 6AH amino acid analyzer. Methionine and cystine were determined after performic acid oxidation [8]. Free thiol groups of cysteine residues were determined by Ellman's reagent, 5,5'-dithiobis(2-nitrobenzoic acid), in 6 M guanidine·HCl [9]. Tryptophan was quantified by the amino acid analyzer after hydrolysis with 3 M mercaptoethanesulfonic acid [10].
Enzyme assays
Inhibitory activities against serine proteinases were assayed as described earlier [15]. Inhibitory activity against papain was determined by the method described by Arnon [16] using N-benzoyl-L-arginine-p-nitroanilide as the substrate. Inhibitory activity to α-amylase was determined using coloured, insolubilised starch as the substrate [17]. Triacylglycerol lipase activity was assayed titrimetrically using crude porcine pancreatic lipase (type II, Sigma Chem. Corp., St Louis, MO, USA) and tributyrin as the substrate [18]. Inhibition of L-[14C]leucine incorporation into proteins in a rabbit reticulocyte lysate system was examined by Dr Takahashi of our department by the procedure described before [19]. Haemagglutinating and haemolytic activities were tested for rabbit peripheral erythrocytes according to Osawa and Matsumoto [20].
RESULTS AND DISCUSSION
Purification and crystallization of soybean hydrophobic protein
CM-cellulose chromatography of the acetone precipitate of the soybean extract gave a single major protein peak. This peak, which was mainly due to Bowman-Birk proteinase inhibitor, contained the hydrophobic protein. The latter was separated from the inhibitor by DEAE-cellulose chromatography at pH 6.5 (figure not shown). When the pass-through fraction from the DEAE-cellulose column was kept for 2 days at room temperature, large crystals appeared. These were longish rods or elongated plates about 1 mm in length (Fig. 1). The mother liquor still contained the protein, which could be further crystallized upon standing. These crystals were combined, washed with water and lyophilized. The overall yield of the protein was approximately 200 mg from 1 kg soybean flour. The protein gave a single band by disc electrophoresis at pH 9.5 and 4.3 [21, 22], but was further separated into two components by reverse-phase chromatography on alkylsilica gel as described later.
Fig. 1. Crystals of soybean hydrophobic protein. The large plates are approximately 1 mm long. Crystallized from water at pH around 7

Fig. 2. Gel filtration of soybean hydrophobic protein in the presence of 6 M guanidine·HCl. [Horizontal axis: elution volume (ml).] A TSK G3000 SW column (Toyo Soda) for high-performance liquid chromatography was equilibrated and developed with 10 mM phosphate buffer (pH 6.5) containing 6 M guanidine·HCl and 1 mM EDTA. (a) Reduced and S-carbamoylmethylated standard proteins. 1, blue dextran 2000; 2, rat serum albumin (Mr = 65000); 3, ovalbumin (43000); 4, bovine carbonic anhydrase (30000); 5, soybean trypsin inhibitor (20100); 6, lysozyme (14000); 7, soybean inhibitor D-II (8580); 8, insulin B chain (3420); 9, bovine insulin A chain (2380); 10, dinitrophenyl-alanine. (b) Soybean hydrophobic protein treated with 5% 2-mercaptoethanol in 6 M guanidine·HCl (pH 6.5) at 100°C for 20 min. (c) The soybean protein treated under the same conditions as (b) for 5 min
Solubility of the hydrophobic protein
The crystalline protein was sparingly soluble in aqueous buffers of pH above 4, but soluble in dilute acids such as 0.1 M formic acid. The acid-solubilized protein could be kept in solution for a day after neutralization, but then crystals appeared. It was quite soluble in 95% 1-propanol or ethanol.

Estimation of the relative molecular mass

The Mr of the protein was estimated by sodium dodecyl sulfate/polyacrylamide gel electrophoresis to be 5600, but this value was incompatible with the result of amino acid analysis, giving non-integer residue numbers/mol protein. Therefore, gel filtration in 6 M guanidine·HCl, described by Ui [6], was performed (Fig. 2). As shown in Fig. 3, a good linear relationship between $M_r^{0.555}$ and $K_d^{1/3}$ for standard proteins was obtained. However, the hydrophobic protein incubated under the standard conditions (6 M guanidine·HCl, 5% mercaptoethanol, 5 min at 100°C) prior to application was eluted near the void volume of the column (Fig. 2). When the protein was heated for more than 20 min under the same conditions it was eluted at a reasonable position representing an Mr of 8800 (Figs 2 and 3). Similar results were obtained when 8 M urea was used as the denaturant (data not shown). The hydrophobic protein was eluted a little ahead of soybean Bowman-Birk proteinase inhibitor (Mr = 7850) under the non-denaturing conditions (50 mM phosphate buffer, pH 7.0, 0.1 M sodium citrate). The protein emerged at the void volume when 35% acetonitrile in 0.1% trifluoroacetic acid [7] was used as an eluant (data not shown). These results may indicate that this protein has an Mr of around 9000 but tends to aggregate into larger species under ordinary denaturing conditions; the driving force of this phenomenon is unclear. Unlike the usual behaviour of proteins, somewhat drastic conditions including complete reduction of disulfide bonds seem to be necessary to dissociate the aggregates.

Fig. 3. Estimation of Mr of soybean hydrophobic protein. Results of the gel filtration in Fig. 2 were analysed by plotting $M_r^{0.555}$ against the cube root of the distribution coefficient ($K_d$) of the proteins. Numbers in the figure correspond to those in Fig. 2. The plot of soybean hydrophobic protein is shown by an arrow. Mr of the protein calculated by this plot is 8800

Separation of two forms by reverse-phase chromatography

Amino-terminal sequence analysis of the native protein suggested the presence of two forms. These were separated from each other by reverse-phase high-performance liquid chromatography using a Merck RP-8 octylsilica gel column at neutral pH. The crystalline preparation was eluted as two peaks, form I and form II (Fig. 4), which were pooled separately and analyzed for their amino acid compositions and amino-terminal sequences. As shown in Table 1, the difference between their amino acid compositions is the lack of one residue each of alanine and leucine in form I. The amino-terminal sequences were determined as Ile-Thr-Arg-Pro-Cys(O3H)-Pro-Asp-Leu- for form I and as Ala-Leu-Ile-Thr-Arg-Pro- for form II, indicating that form I is a derivative of form II lacking the amino-terminal Ala-Leu dipeptide sequence. This may be due to some proteolytic activity in the soybean seeds, since we noted the presence of two forms of a soybean trypsin inhibitor whose difference is the absence of an amino-terminal nine-residue segment in one form. It is of interest to note that proteins differing only by two residues at their amino termini could be completely separated by reverse-phase chromatography. The overall amino acid composition of the purified hydrophobic protein is remarkable for its high content of hydrophobic residues and the absence of lysine, histidine, phenylalanine, tryptophan and methionine, a composition similar to that of crambin, though the molecular mass of crambin is about one-half that of the soybean protein. No free sulfhydryl group was detected by Ellman's reagent even after prolonged incubation under denaturing conditions. This indicates that the eight half-cystine residues form disulfide bonds in the molecule.

Table 1. The amino acid composition of the two forms of soybean hydrophobic protein separated by reverse-phase high-performance liquid chromatography
Numbers in parentheses indicate the nearest integers. The composition of crambin is also listed for comparison

Amino acid   Form I       Form II      Crambin
Asp          10.2 (10)    10.4 (10)    4
Thr           3.5 (4)      3.5 (4)     6
Ser           5.8 (6)      5.7 (6)     2-3 a
Glu           3.0 (3)      3.2 (3)     1
Pro           4.0 (4)      4.1 (4)     4-5 a
Gly           7.6 (8)      8.0 (8)     4
Ala           4.1 (4)      4.9 (5)     5
Cys           7.5 (8)      7.1 (8)     6
Val           1.9 (2)      2.0 (2)     2
Met           0.0          0.0         0
Ile           8.7 (9)      8.7 (9)     4-5 a
Leu          14.0 (14)    15.0 (15)    1-2 a
Tyr           0.7 (1)      0.9 (1)     2
Phe           0.0          0.0         1
Trp           0.0          0.0         0
Lys           0.0          0.0         0
His           0.0          0.0         0
Arg           5.0 (5)      5.0 (5)     2

a These values include sites of microheterogeneity.

Fig. 4. Separation of the two components of soybean hydrophobic protein by reverse-phase chromatography. [Horizontal axis: time (min).] A Merck RP-8 (octylsilica) column (0.4 x 25 cm) was equilibrated with 10 mM NH4HCO3 (pH 7.0, 1% CH3CN) and eluted with a 40-min linear CH3CN gradient from 1% to 70%. The flow rate was 1 ml/min. The two components were designated as form I and form II in the order of elution

Fig. 5. DEAE-Sephadex A-25 chromatography of the tryptic digest of reduced and S-carboxymethylated soybean hydrophobic protein. [Horizontal axis: tube number.] A 10-mg digest was applied to a column (0.9 x 50 cm) equilibrated with 20 mM NH4HCO3 (pH 8.0). The column was eluted with a linear gradient of NH4HCO3 from 20 mM to 800 mM (total volume, 500 ml), and then with 1.0 M NH4HCO3. The flow rate was 1 ml/min. Peptides indicated by bars were pooled separately and lyophilized

Production, separation and sequence of tryptic peptides
Reduced and S-carboxymethylated protein (form II, 20 mg) was suspended in 3 ml 10 mM NH4HCO3 and digested with trypsin (0.05 mg) for 15 h at 25°C, and the resulting peptides were separated by DEAE-Sephadex A-25 using a gradient of NH4HCO3 concentration (Fig. 5). Five major peptides, T1 to T5, were obtained which accounted for the total composition of the protein (Table 2). A minor peptide, designated as T3+4, was apparently derived by incomplete hydrolysis of the peptide bond between T3 and T4. The peak appeared to be too large for its peptide content as determined quantitatively by amino acid analysis. It appeared
Fig. 6. Summary of proofs of soybean hydrophobic protein sequence. Numbers under the amino acid residues are yields (nmol) of phenylthiohydantoin derivatives by the analysis of 120 nmol oxidized hydrophobic protein. T, tryptic peptides; E, elastase peptides; (-) unidentified residue; (c) determined by hydrazinolysis

Table 2. Amino acid composition of tryptic peptides of reduced and S-carboxymethylated soybean hydrophobic protein (form II)
Numbers in parentheses are those deduced from the amino acid sequence. Cys(Cm), S-carboxymethylcysteine
Amino acid
Table 3. The amino acid sequence of tryptic peptides of soybean hydrophobic protein Numbers under the peptide names indicate the amounts (nmol) used for sequence analysis, and those under the amino acid residues are recovery (nmol) of phenylthiohydantoin derivatives
Thr Ser Glu Pro Gly Ala Val Ile Leu Tyr Arg Total Yield (%)
5.8 (6) 5.2 ( 5 ) 1.9 (2) 2.0 (2)
2.8 (3) 1.9 (2) 2.0 (2) 5.9 (6)
1.0 (1) 1.0 (1)
1.0 (1) 1.7 (2) 0.2 1.0 (1) 0.9 (1)
1.0 (1) 1.1 (1) l.O(l) 1.5 (2) 0.2 2.3 (2)
1.8 (2) 2.8 (3) l.O(l)
2.6(3) 0.1 1.0 (1) 1.7 (2) 7.2 (7) 1.1 (1) 1.0 (1) 9.1 (9) 3.1 (3) 2.5 (3) 0.1 1.1 (1) 2.0 (2) 0.8 (1) 1.0 (1) 1.1 (1) 9
T1 (260 nmol)
Ala-Leu-Ile-Thr-Arg-Pro-Ser-Cys-Pro-Asp-Leu-Ser-Ile-
210 110  84  84  42  60  54  63  42  60  58  45  57
-Cys-Leu-Asn-Ile-Leu-Gly-Gly-Ser-Leu-Gly-Thr-Val-Asp-
  53  51  49  26  32  25  23  23  24  20  20  11  10
-Asp-(Cys,Cys,Ala,Leu,Ile,Gly,Gl~,Leu,Gly,Asp,Ile,Glu,
1.8 (2) 1.4 (1) 1.0 (1)
1.4(1) 2.1 (3) 0.9 (1)
2.8 (3) 1.0 (1) 2.0 (2) 1
that some ultraviolet-absorbing substance, probably derived from the reagents, coeluted at the same position. The irregular shape of the peak also suggested this. T5 yielded only free threonine before and after acid hydrolysis. The sequences of the tryptic peptides were determined by the sequenator (Table 3). The largest peptide, T1, could be sequenced up to the 27th residue, and the remaining part was deduced from the sequence of the whole protein (Fig. 6).

Alignment of tryptic peptides and the complete primary structure
In order to align the five tryptic peptides, form II was oxidized by performic acid and analyzed for the amino-terminal sequence. As shown in Fig. 6, 65 residues from the amino terminus could be identified with the exception of serine residues at positions 7, 12 and 22, which were already identified by analysis of a tryptic peptide, T1, of this region. The carboxyl-terminal residue of the protein was identified by hydrazinolysis [23], which yielded 22 nmol threonine (uncorrected) from 71 nmol protein after 6 h at 100°C.
Amino acid sequence

T2 (370 nmol)
Ala-Leu-Gly-Ile-Leu-Asn-Leu-Asn-Arg
260 200 160 230 200 170 180 140  80

T3 (310 nmol)
Asn-Leu-Gln-Leu-Ile-Leu-Asn-Ser-Cys-Gly-Arg
170 220 160 170 150 110  99  78  66  50  30

T4 (380 nmol)
Ser-Tyr-Pro-Ser-Asn-Ala-Thr-Cys-Pro-Arg
260 270 170 140  87 120  75  72  20  16

T5
At this stage the arrangement of the five tryptic peptides in the parent molecule could be unequivocally determined as follows. The result of the sequenator analysis on the whole protein (residues 1 through 65) covered T1, T2 and the amino-terminal six residues of T3 in this order. T5 (free threonine) was placed at the carboxyl terminus of the protein from the specificity of trypsin and the result of hydrazinolysis. There remained only one tryptic peptide, T4, and this was therefore positioned between T3 and T5. The presence of a minor tryptic peptide T3+T4 supported the alignment of T3-T4.

Fig. 7. Secondary structure prediction and hydropathy profile for soybean hydrophobic protein. Regions of predicted secondary structure are indicated by pairs of arrows. S, sheet; T, turn; H, helix. The other parts are predicted to be random structure. The hydropathy profile (shown at the upper side of the sequence) was calculated using the parameters of Kyte and Doolittle with span set at seven residues

Fig. 8. Comparison of primary structure of soybean hydrophobic protein and rat prolactin. Two gaps were allowed for each sequence for maximum similarity. Identical residues are boxed with solid lines

To further confirm the above alignment, carboxymethylated protein (8 mg) was digested with 0.1 mg porcine elastase in 1 ml 0.1 M NH4HCO3 for 5 h at 25°C. The elastase peptides were separated by a DEAE-Sephadex A-25 column in a manner similar to the tryptic peptides and further purified by reverse-phase high-performance liquid chromatography on an octadecylsilica column (ODS 120T, Toyo Soda, Tokyo). Of more than 15 peptides thus obtained (data not shown) two peptides, E1 and E2, were useful for the alignment of the tryptic peptides. The composition of E1 was Cys(Cm) 0.9, Asp 0.9, Ser 2.7, Pro 1.1, Gly 1.0, Leu 1.0, Tyr 0.8 and Arg 1.0. This corresponded to the region from the 6th residue of T3 to the 4th residue of T4. The second one, E2, whose composition was Cys(Cm) 0.8, Thr 1.8, Pro 1.2 and Arg 1.0, overlapped T4 and T5. The complete primary structure of soybean hydrophobic protein (form II), deduced from the above results, is shown in Fig. 6. The amino acid compositions of the other elastase peptides also confirmed this sequence (data not given). There is a potential N-glycosylation site near the carboxyl terminus, -Asn-Ala-Thr- (position 74-76), but we found no indication of sugar attachment to the asparagine residue.
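The alignment logic above can be checked with a short in-silico sketch. The Python below digests the carboxyl-terminal segment assembled from the tryptic peptides T2, T3, T4 and T5 (sequences as listed in Table 3) and scans it for the Asn-X-Ser/Thr motif. The cleavage rule used here (cut after Arg or Lys unless the next residue is Pro) is the common textbook simplification and is an assumption, as are the function names and the start position of 50, which is inferred from the stated length of 80 residues.

```python
# Sketch: in-silico trypsin digestion and N-glycosylation sequon search.
# The cleavage rule (after Arg/Lys, not before Pro) is a textbook
# simplification assumed for illustration; names are hypothetical.

# Carboxyl-terminal segment built from T2 + T3 + T4 + T5 (one-letter code).
C_TERMINAL_SEGMENT = "ALGILNLNR" "NLQLILNSCGR" "SYPSNATCPR" "T"
SEGMENT_START = 50  # assumed position of its first residue (80 - 31 + 1)


def tryptic_digest(seq):
    """Split seq after every K or R that is not followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(seq):
        following = seq[i + 1] if i + 1 < len(seq) else None
        if residue in "KR" and following is not None and following != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    peptides.append(seq[start:])
    return peptides


def find_sequons(seq, offset=1):
    """Positions of Asn in Asn-X-Ser/Thr motifs, with X not Pro."""
    hits = []
    for i in range(len(seq) - 2):
        if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST":
            hits.append(i + offset)
    return hits


if __name__ == "__main__":
    print(tryptic_digest(C_TERMINAL_SEGMENT))
    print(find_sequons(C_TERMINAL_SEGMENT, offset=SEGMENT_START))
    # The amino-terminal Arg-Pro bond of T1 is not cut under this rule.
    print(tryptic_digest("ALITRPSC"))
```

Under these assumptions the digest returns the four peptides in the order T2, T3, T4, T5, and the only sequon found has its asparagine at position 74, in agreement with the site noted in the text.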
Homology with other proteins
A computer search  for sequences similar to soybean hydrophobic protein was made against a data base of the Protein Research Foundation (Minoh, Osaka), where nearly all of the published sequences have been accumulated. However, no protein, including crambin, was found to exhibit convincing homology to this protein. The only notable similarity was quite an unexpected one: a similarity to rat prolactin. As shown in Fig. 8, residues 66 through 144 of rat prolactin  are considerably similar to the whole sequence of the soybean protein assuming only two gaps each for the proteins (23 identical residues out of 80 residues compared). The matching probability of this alignment calculated by the program ALIGN  was and we are inclined to put this soybean protein into an entirely new protein family until this similarity is established as a true homology by further sequence analysis of the related proteins from different species.
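The identity count quoted above (23 identical residues out of 80 compared) is simple to reproduce once an alignment is written out with gap characters. The sketch below is a hypothetical helper, not the ALIGN program, and the two short strings in the usage example are made-up toy sequences, not the alignment of Fig. 8.

```python
# Sketch: counting identical positions in a gapped pairwise alignment.
# The helper names and the toy alignment are illustrative assumptions.


def count_identities(aligned_a, aligned_b, gap="-"):
    """Return (identical, compared) for two equal-length aligned strings.

    Columns containing a gap in either sequence are not counted as compared.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    return identical, compared


def percent_identity(identical, compared):
    """Identity as a percentage of the compared positions."""
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy alignment with one gap in each sequence.
    print(count_identities("ALGIL-NLNR", "ALSILQN-NR"))
    # Figures quoted in the text: 23 identical residues out of 80 compared.
    print(percent_identity(23, 80))
```

With the figures from the text this gives 28.75% identity; such a raw percentage says nothing by itself about statistical significance, which is what the matching probability from ALIGN is meant to address.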
Secondary structure and hydropathy profile
The amino acid sequence of soybean hydrophobic protein was interpreted by computer methods for secondary-structure prediction [24] and for hydropathy profile [25]. The results are shown in Fig. 7. The predicted secondary structure consists mainly of β sheets and β turns and contains only a short α helix. This is very different from the structure of another plant hydrophobic protein, crambin, in which 46% is in helices and 17% is in a β sheet. The hydropathy profile (Fig. 7) indicates that the molecule is largely hydrophobic, as expected from the amino acid composition, and has a short hydrophilic segment near the carboxyl-terminal end. The GRAVY (grand average of hydropathy) score, which is an index of overall hydrophobicity [25], is 0.899, a value close to those calculated for many membrane proteins of similar size.
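For readers who want to see how these two quantities are obtained, the following Python sketch computes a GRAVY score and a sliding-window hydropathy profile with the Kyte and Doolittle scale and a span of seven residues. The function names are hypothetical. Because only the carboxyl-terminal 31 residues (peptides T2 to T5, Table 3) are used as input here, the result will not equal the 0.899 reported for the whole protein.

```python
# Sketch: GRAVY score and sliding-window hydropathy profile.
# Scale values are the Kyte-Doolittle hydropathy indices; function names
# and the choice of input segment are illustrative assumptions.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Carboxyl-terminal segment built from T2 + T3 + T4 + T5 (one-letter code).
C_TERMINAL_SEGMENT = "ALGILNLNR" "NLQLILNSCGR" "SYPSNATCPR" "T"


def gravy(seq):
    """Grand average of hydropathy: mean index over all residues."""
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


def hydropathy_profile(seq, span=7):
    """Mean hydropathy of every window of `span` consecutive residues."""
    return [gravy(seq[i:i + span]) for i in range(len(seq) - span + 1)]


if __name__ == "__main__":
    print(round(gravy(C_TERMINAL_SEGMENT), 3))
    for start, value in enumerate(hydropathy_profile(C_TERMINAL_SEGMENT), 1):
        print(start, round(value, 2))
```

Positive window averages mark hydrophobic stretches and negative ones hydrophilic stretches, which is how the short hydrophilic segment near the carboxyl terminus appears in such a profile.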
Soybean hydrophobic protein showed no significant inhibitory activity to bovine trypsin, bovine α-chymotrypsin, porcine elastase, papain, and α-amylases from porcine pancreas, barley (Hordeum vulgare) malt and Aspergillus oryzae. No haemagglutinating activity or haemolytic activity was observed for rabbit peripheral erythrocytes. A slight inhibition of porcine pancreatic triacylglycerol lipase by this protein (substrate, tributyrin) (data not given) appears to be similar to that reported for 'hydrophobic proteins' such as serum albumin, β-lactoglobulin, melittin, ovalbumin and myoglobin, which compete for the oil/water interface of the emulsified triglyceride substrate [29, 30]. Incorporation of L-[14C]leucine into the protein fraction in the rabbit reticulocyte cell-free system was inhibited by this protein as by many other plant seed extracts [31], but the results were not quantitatively reproducible. Therefore, the biochemical or biological activity of this soybean hydrophobic protein remains unclear. Recently a low-Mr seed protein was isolated from pea (Pisum sativum) and sequenced [32]. However, it bore no sequence resemblance to the soybean hydrophobic protein studied here, nor could its function be identified. We have as yet not found its counterpart in soybeans, but it appears that legume seeds contain a number of low-Mr proteins of unknown function. Further investigations are required to understand the biological function of this soybean hydrophobic protein, which must be closely related to its unusual structure and properties.

We thank Prof. Tsuneo Fujita (Niigata University) for his encouragement throughout this work. We also thank Dr Yoshiaki Takahashi (Niigata University) for the assay of protein synthesis in rabbit reticulocyte lysates. This work was supported in part by a grant from the Miura Memorial Foundation to S.O.
REFERENCES
1. Van Etten, C. H., Nelson, H. C. & Peters, J. E. (1965) Phytochemistry 4, 467-473.
2. Teeter, M. M., Mazer, J. A. & L'Italien, J. J. (1981) Biochemistry 20, 5437-5443.
3. Hendrickson, W. A. & Teeter, M. M. (1981) Nature (Lond.) 290, 107-113.
4. Garcia-Olmedo, F. & Garcia-Faure, R. (1969) Lebensm. Wiss. Technol. 2, 94-96.
5. Shewry, P. R., Lafiandra, D., Salcedo, G., Aragoncillo, C., Garcia-Olmedo, F., Lew, E. J.-L., Dietler, M. D. & Kasarda, D. D. (1984) FEBS Lett. 175, 359-363.
6. Ui, N. (1979) Anal. Biochem. 97, 65-71.
7. Swergold, G. D. & Rubin, C. S. (1983) Anal. Biochem. 131, 295-300.
8. Hirs, C. H. W. (1967) Methods Enzymol. 11, 59-62.
9. Ellman, G. L. (1959) Arch. Biochem. Biophys. 82, 70-77.
10. Moore, S. (1972) in Chemistry and biology of peptides (Meienhofer, J., ed.) pp. 629-653, Ann Arbor Publishers, Michigan.
11. Edman, P. & Begg, G. (1967) Eur. J. Biochem. 1, 80-91.
12. Brauer, A. W., Margolies, M. N. & Haber, E. (1975) Biochemistry 14, 3029-3035.
13. Tarr, G. E., Beecher, J. F., Bell, M. & McKean, D. J. (1978) Anal. Biochem. 84, 622-627.
14. Hirs, C. H. W. (1967) Methods Enzymol. 11, 199-203.
15. Odani, S. & Ikenaka, T. (1977) J. Biochem. (Tokyo) 82, 1513-1522.
16. Arnon, R. (1970) Methods Enzymol. 19, 226-252.
17. Wahlefeld, A. W. (1984) in Methods in enzymatic analysis, 3rd edn (Bergmeyer, H. U., Bergmeyer, J., Grassl, M. & Moss, D. W., eds) vol. 4, pp. 161-167, Verlag Chemie, Weinheim.
18. Junge, W. (1984) in Methods in enzymatic analysis, 3rd edn (Bergmeyer, H. U., Bergmeyer, J., Grassl, M. & Moss, D. W., eds) vol. 4, pp. 15-25, Verlag Chemie, Weinheim.
19. Aoyagi, Y., Takahashi, Y., Odani, S., Ogata, K., Ono, T. & Ichida, F. (1982) J. Biol. Chem. 257, 9566-9569.
20. Osawa, T. & Matsumoto, I. (1972) Methods Enzymol. 28, 323-328.
21. Davis, B. J. (1964) Ann. New York Acad. Sci. 121, 404-442.
22. Reisfeld, R. A., Lewis, U. J. & Williams, D. J. (1962) Nature (Lond.) 195, 281-283.
23. Narita, K., Matsuo, H. & Nakajima, T. (1975) in Protein sequence determination (Needleman, S. B., ed.) pp. 30-103, Springer-Verlag, Berlin.
24. Chou, P. Y. & Fasman, G. D. (1978) Adv. Enzymol. 47, 145-148.
25. Kyte, J. & Doolittle, R. F. (1982) J. Mol. Biol. 157, 105-132.
26. Waterman, M. S., Smith, T. F. & Beyer, W. A. (1976) Adv. Math. 20, 367-387.
27. Cooke, N. E., Coit, D., Weiner, R. I., Baxter, J. D. & Martial, J. A. (1980) J. Biol. Chem. 255, 6502-6510.
28. Dayhoff, M. O., Barker, W. C. & Hunt, T. L. (1983) Methods Enzymol. 91, 524-545.
29. Borgstroem, B. & Erlanson, C. (1978) Gastroenterology 75, 382-386.
30. Gargouri, Y., Julien, R., Sugihara, A., Verger, R. & Sarda, L. (1984) Biochim. Biophys. Acta 795, 326-331.
31. Gasperi-Campani, A., Barbieri, L., Lorenzoni, E. & Stirpe, F. (1977) FEBS Lett. 76, 173-176.
32. Gatehouse, J. A., Gilroy, J., Hoque, M. S. & Croy, R. R. D. (1985) Biochem. J. 225, 239-247.