THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1992 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 267, No. 15, Issue of May 25, pp. 10856-10865, 1992. Printed in U.S.A.

Sequencing, Cloning, and Expression of Human Red Cell-type Acid Phosphatase, a Cytoplasmic Phosphotyrosyl Protein Phosphatase*

(Received for publication, October 7, 1991)

Yu-Yuan P. Wo‡, Ashley L. McCormack§, Jeffrey Shabanowitz§, Donald F. Hunt§, June P. Davis‡, Greta L. Mitchell‡, and Robert L. Van Etten‡¶

From the ‡Department of Chemistry, Purdue University, West Lafayette, Indiana 47907 and the §Department of Chemistry, University of Virginia, Charlottesville, Virginia 22901

Low molecular weight phosphotyrosyl protein phosphatases of human placenta and human red cell were purified and sequenced by a combination of Edman degradation and tandem mass spectrometry. Screening of a human placental λgt11 cDNA library yielded overlapping cDNA clones coding for two distinct human cytoplasmic low molecular weight phosphotyrosyl protein phosphatases (HCPTPs). The two longest clones, designated HCPTP1-1 and HCPTP2-1, were found to have identical nucleotide sequences, with the exception of a 108-base pair segment in the middle of the open reading frame. Polymerase chain reaction studies with human genomic DNA suggest that the difference between HCPTP1-1 and HCPTP2-1 does not result from alternative RNA splicing. Studies with a human chromosome 2-specific library confirmed that these sequences are located on chromosome 2, which is known to be the location of red cell acid phosphatase locus ACP1. The coding sequences of HCPTP1-1 and HCPTP2-1 were placed downstream from a bacteriophage T7 promoter and the proteins were expressed in Escherichia coli. The resulting recombinant enzymes (designated HCPTP-A and HCPTP-B, respectively) showed molecular weights of 18,000 by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, and both of them exhibited immunoreactivity with antisera raised against authentic human placental and bovine heart enzymes. The expressed proteins were highly active towards the phosphatase substrates p-nitrophenyl phosphate, β-naphthyl phosphate, and O-phospho-L-tyrosine, but not α-naphthyl phosphate, threonine phosphate, or O-phospho-L-serine. HCPTP-A and -B possessed effectively identical amino acid compositions, immunoreactivities, inhibition by formaldehyde, and kinetic properties when compared with two human red cell acid phosphatase isoenzymes. It is concluded that HCPTP-A and -B are the fast and slow forms of red cell acid phosphatase, respectively, and that this enzyme is not unique to the red cell but is instead expressed in all human tissues.

* This research was supported by Department of Health and Human Services National Institute of General Medical Sciences Grants GM 27003 (to R. L. V. E.), GM 37537 (to D. F. H.), and core facilities from AIDS Center Grant AI 27713. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequences HCPTP1 and HCPTP2 reported in this paper have been submitted to the GenBank/EMBL Data Bank with accession numbers M83653 and M83654, respectively.

¶ To whom correspondence should be addressed.

¹ The abbreviations used are: PTPases, protein tyrosine phosphatases; HCPTPs, human cytoplasmic phosphotyrosyl protein phosphatases; PCR, polymerase chain reaction; BHPTP4, cDNA clone for bovine heart low molecular weight protein tyrosyl phosphatase; HCPTP-1 and HCPTP-2, cDNA clones of human cytoplasmic phosphotyrosyl protein phosphatase; HCPTP-A and -B, recombinant human cytoplasmic phosphotyrosyl protein phosphatases; SDS, sodium dodecyl sulfate; PAGE, polyacrylamide gel electrophoresis; HPLC, high performance liquid chromatography; bp, base pair.

The phosphorylation of proteins is an important and fundamental mechanism in regulating diverse cellular processes, including cell growth, proliferation, and transformation. The level of protein phosphorylation is usually modulated by two sets of enzymes: protein kinases and protein phosphatases. Many protein kinases, including numerous protein tyrosyl kinases, have been extensively studied and are well characterized (1). In contrast, the elucidation of the nature and regulation of protein tyrosine phosphatases (PTPases)¹ has begun only recently, most notably with the study of important integral membrane and nonreceptor PTPases, including PTPase 1B and T cell PTPase (2). In addition to those enzymes, it also appears likely that some intracellular phosphotyrosyl protein dephosphorylation reactions are catalyzed by another group of PTPases, namely the low molecular weight cytoplasmic phosphotyrosyl protein phosphatases. These enzymes have molecular weights of approximately 18,000, and were first isolated as acid phosphatases from a wide variety of mammalian tissues including bovine and human liver (3, 4) and human placenta (5). Several of them have already been characterized as phosphotyrosyl protein phosphatases. Thus, the low molecular weight acid phosphatase from bovine heart exhibits a strong activity towards phosphotyrosyl proteins, but little or no activity towards phosphoseryl or phosphothreonyl proteins (6, 7). The enzyme isolated from human placenta (5) was shown to be very active in the dephosphorylation of a number of phosphotyrosyl peptides and proteins, including angiotensin, tyrosine kinase P40, and erythrocyte band 3 protein. A similar specific activity toward phosphotyrosyl proteins has also been reported for a low molecular weight phosphotyrosyl protein phosphatase isolated from human erythrocyte (8).

Because this group of enzymes has such a remarkably broad distribution in nature, it seems likely that these enzymes are involved in common and crucial dephosphorylation pathways that serve to control the state of phosphorylation of proteins, and their consequent physiological functions. For example, the regulation of red cell metabolism appears to be controlled by the state of phosphorylation of tyrosine residues of the cytoplasmic domain of red cell band 3 (9). Significantly, low molecular weight cytoplasmic phosphatases are present in various types of cells besides the red blood cell. Early reports


suggested that human red cell acid phosphatase is expressed in many tissues and cell lines, including placenta (10), kidney, liver, and leukocyte (11). In humans, red cell acid phosphatase is an important genetic marker (12) and is encoded by the ACP1 locus on chromosome 2 (13). The enzyme is genetically polymorphic (14). The major ACP1 alleles are each expressed in the form of two electrophoretically distinct isoenzymes, termed "fast" and "slow" in order of their migration to the anode (15), but (until the present studies) the structural basis of this difference was unknown.

Although the kinetics and mechanism of the bovine heart enzyme have been extensively characterized (16) and the cDNA for that enzyme has been cloned,² very little is known about the human low molecular weight protein tyrosine phosphatases with respect to their intracellular substrates, physiological activity, three-dimensional structure, or genetic regulation. The amino acid sequences of these enzymes from bovine liver (18) and bovine heart² suggest that this class of enzyme shares no homology with the much larger PTPase 1B or T cell PTPase enzymes, although this conclusion was uncertain because the sequence of the small human enzyme was unknown. It is equally uncertain whether the enzymes of other tissues have similar or identical structures and functions as compared to human erythrocyte acid phosphatases. The lack of either protein primary structure or DNA sequence information presented major impediments to solving these problems. Here, we describe the primary structure determination, as well as the cDNA cloning, of two human low molecular weight cytoplasmic phosphotyrosyl protein phosphatases (HCPTPs). Using a T7 polymerase expression system, we show how both enzymes can be overexpressed in Escherichia coli. The resulting recombinant enzymes are characterized kinetically and immunologically. The genomic DNA structure is examined in part, and the chromosomal location of the gene coding for these HCPTPs is also examined. The results indicate the identity of HCPTPs and human red cell acid phosphatases, they provide a structural explanation for the difference between the major slow and fast isoenzymes of red cell acid phosphatase, and they show that the red cell-type enzyme is in fact expressed in all major human organs.

² Wo, Y.-Y. P., Zhou, M.-M., Stevis, P., Davis, J. P., Zhang, Z.-Y., and Van Etten, R. L. (1992) Biochemistry 31, 1712-1721.

EXPERIMENTAL PROCEDURES

Materials-A human placental cDNA library constructed in λgt11 and a human multiple tissue Northern blot were purchased from Clontech Laboratories, Inc. Thermus aquaticus (Taq) polymerase (19) for the polymerase chain reaction (PCR) (20), and human placental genomic DNA were obtained from Promega. A mixture of DNA polymerase I and DNase I for nick translation labeling of DNA was obtained from Bethesda Research Laboratories. Human chromosome 2-specific genomic DNA libraries in bacteriophage Charon 21A were obtained from the American Type Culture Collection, Rockville, MD. Oligolabeling kits were from Pharmacia LKB Biotechnology Inc., while [α-³⁵S]dATP and [α-³²P]dCTP were obtained from Amersham Corp. pET vectors were from Novagen. Other sources were as described.²

Sequencing of the Human Placental and Red Cell Enzymes by Mass Spectrometry-The low molecular weight phosphatase from human placenta was purified either by a combination of ammonium sulfate precipitation and Sephadex column chromatography, or as described previously (7). Further purification of the sample was accomplished by reverse phase HPLC on an Applied Biosystems Model 130A Separations System with a narrow bore RP-300 Aquapore column (2.1 × 30 mm). Samples were eluted with a 40-min linear gradient of 0-60% (v/v) acetonitrile (0.085% trifluoroacetic acid) in 0.1% trifluoroacetic acid. Column effluent was monitored at 214 nm and fractions were collected by hand. Fractions containing protein were treated with 10 µl of 0.5 M Tris-HCl buffer containing 6 M guanidine hydrochloride prior to lyophilization of the HPLC solvent.

The erythrocyte enzyme was isolated from whole blood following lysis with a solution of 5 mM NaH₂PO₄, 1 mM EDTA, 50 µg/ml PMSF, pH 8.0. Hemolysate was collected, adjusted to pH 7.5, and applied to a DEAE-Sepharose column equilibrated with buffer A (10 mM Na₂HPO₄, 0.2 mM dithiothreitol, 0.1 mM phenylmethanesulfonyl fluoride, 1 mM EDTA, pH 7.5, adjusted with phosphoric acid). After extensive washing, protein was eluted with a gradient of 0-0.75 M NaCl in buffer A. Fractions with phosphatase activity were collected, subjected to (NH₄)₂SO₄ precipitation at 60% saturation, and allowed to stand overnight at 4 °C. Precipitated protein was collected by centrifugation, resuspended in 2 M (NH₄)₂SO₄, and applied to a phenyl-Sepharose HIC column. Protein was eluted with a decreasing gradient down to pure water. After dialysis and concentration of phosphatase peaks, protein was applied to a G-50 size exclusion column in 10 mM acetate, 1 mM EDTA, 0.2 mM dithiothreitol, 30 mM NaH₂PO₄, pH 4.8, and eluted with the same buffer. SDS-PAGE analysis of the eluted protein showed a single band.

Proteolytic digestions on either native or reduced and S-carboxyamidated protein (0.5-2.0 nmol) were performed with trypsin, endo-Glu-C, endo-Asp-N, endo-Lys-C, α-chymotrypsin, and elastase (1:50, enzyme:protein, w/w) at 37 °C for 5-36 h in 80 mM Tris-HCl buffer containing 1 M guanidine hydrochloride. Peptides were fractionated by reverse phase HPLC as described above. Experiments on the enzyme isolated from red cells were carried out in a similar way, except that the pH of the protein solution was adjusted to 8.5 with Tris-HCl buffer, digestion was allowed to proceed for 24 h at room temperature, and the products of the digestion were analyzed directly by microcapillary HPLC/electrospray ionization/tandem mass spectrometry as described below. Chemical cleavage of the protein samples with cyanogen bromide was performed by adding an excess of CNBr to purified protein (450 pmol) in 70% trifluoroacetic acid (100 µl). The reaction was allowed to proceed for 18 h in the dark, at room temperature, under argon. Automated Edman degradation was performed directly on an aliquot of the lyophilized cyanogen bromide fragments.

Mass spectra were recorded on either a triple quadrupole (Finnigan-MAT TSQ-70) or a quadrupole Fourier transform instrument. Sample ionization/desorption from a matrix of thioglycerol on the latter instrument was accomplished with a cesium ion gun (Antek) as described (21). Both cesium ion bombardment (22) and electrospray ionization (23) were employed on the triple quadrupole instrument. Samples for analysis by the electrospray ionization technique were injected into the instrument directly from a microcapillary HPLC column or from a microliter syringe. For injection by syringe, samples were dissolved in 5% acetic acid and then treated with methanol to give a peptide concentration of 10-15 pmol/µl in 1:1 methanol, 0.5% acetic acid. Sample flow rate through the syringe was 0.6 µl/min. Microcapillary HPLC columns were constructed from fused silica (70 cm × 70 µm inner diameter) and contained 10-µm particles coated with C-18 absorbent packed into the last 10 cm of the column (23). Samples for sequence analysis were dissolved in 0.5% acetic acid (10-20 pmol/µl), injected on column, and eluted directly from the column into the electrospray ion source with a 10-min gradient of 0-80% acetonitrile in acetic acid (0.5%) at a flow rate of 1-2 µl/min. Molecular mass measurements were obtained by scanning the range from m/z 300-1400 repetitively every 1.5 s. To obtain amino acid sequence information, collision-activated dissociation spectra were acquired by scanning the mass range from m/z 50-1800 repetitively every 2 s. Argon (3 mtorr) was employed as the collision gas. Collision energies of 20-25 and 15-20 eV were used to dissociate (M + 2H)2+ and (M + 3H)3+ ions, respectively. Procedures for the acetylation and esterification of peptide samples were as described (23).

Sequence Analysis by Automated Edman Degradation-Automated Edman degradation was performed by standard methods on an Applied Biosystems, Model 473, protein sequenator.

Screening of the cDNA Library-The probe used for screening of the cDNA library was a 580-bp EcoRI/SalI DNA fragment containing the entire coding region of a cDNA clone for bovine heart low molecular weight protein tyrosyl phosphatase, i.e. BHPTP4.² The isolated fragment was labeled with [α-³²P]dCTP to a specific activity of about 10⁸ cpm/µg by the random-priming method (24). Approximately 300,000 plaques of a human placental cDNA library in λgt11 were screened using a standard protocol (25). Hybridizations were performed at 40 °C for 16 h, and stringent washes were carried out in 2 × SSC, 0.1% SDS at 60 °C for 10 min (1 × SSC is 0.15 M NaCl plus 0.015 M sodium citrate, pH 7).
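The wash conditions above are stated as multiples of the 1 × SSC composition given in parentheses. As a minimal sketch of that arithmetic (the helper name `ssc_molarity` is illustrative and not part of the published protocol), the component concentrations of any n × SSC solution follow by simple scaling:

```python
# 1 x SSC as defined in the text: 0.15 M NaCl plus 0.015 M sodium citrate.
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}  # mol/liter


def ssc_molarity(fold):
    """Return component molarities (mol/liter) of an n x SSC solution.

    `fold` is the strength multiplier, e.g. 2 for a 2 x SSC wash.
    Illustrative helper only; it is not taken from the original paper.
    """
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: fold * conc for name, conc in SSC_1X.items()}


if __name__ == "__main__":
    # Composition of the 2 x SSC wash buffer mentioned in the screening step.
    for name, conc in ssc_molarity(2).items():
        print(f"2 x SSC: {conc:.3f} M {name}")
```

The same scaling applies to any other SSC strength; SDS is added separately and is not part of the SSC stock.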


Sequencing of Isolated Clones-cDNA inserts from each clone were subcloned into pUC18. Following alkali denaturation, DNA sequences of both strands were determined by the dideoxy chain termination method (26) using Sequenase (U. S. Biochemical). In order to use sequencing primers more efficiently, partial deletion subclones were constructed by utilizing the digestion-religation strategy and the unique restriction sites (e.g. PstI and SacI) located both in the insert DNA and multiple cloning site of pUC18.

Expression and Purification of Recombinant HCPTPs-The T7 polymerase bacterial expression system (27, 28) with the expression vector pET-11d was adopted for overproducing the recombinant proteins. The 540-bp DNA fragment containing the entire coding region of HCPTP1-1 or HCPTP2-1 was amplified by a polymerase chain method using the corresponding recombinant plasmid as a template, together with a pair of oligonucleotide primers, i.e. primer 1: 5'-GGACCATGGCGGAACAGGCTACCAAGTCCG-3' and primer 2: 5'-GAAATGCAGGATCCTCAGG-3'. Primer 1 provided an NcoI restriction site and contained an initiation codon (ATG) as well as the first 7 amino acid codons, while primer 2 included a BamHI restriction site 50 bp downstream from the termination codon (TGA). PCR was performed on an Ericomp Single Block System instrument. Denaturation was carried out at 94 °C for 1 min, annealing at 45 °C for 2 min, and polymerization at 72 °C for 4 min. Altogether, 27 cycles were performed, with the last polymerization step lasting for 10 min. The amplified fragment was digested with NcoI and BamHI, cloned into NcoI/BamHI-digested pET-11d vector, and transformed into E. coli strain DH5α competent cells. The correctness of the expression vector constructs was confirmed by restriction mapping and DNA sequencing. For subsequent expression studies, the expression vectors were transformed into competent HMS174(DE3) cells.

The expression of HCPTPs was conducted in M9ZB medium (containing 1% casein hydrolysate, 0.5% NaCl, 0.1% NH₄Cl, 0.3% KH₂PO₄, 0.6% Na₂HPO₄, 0.4% glucose, and 0.001 M MgSO₄) in the presence of 50 µg/ml of ampicillin. For small scale preparations, a single colony of E. coli strain HMS174(DE3), carrying the plasmid with the HCPTP insert, was grown with good aeration in 1 ml of medium at 37 °C to an A₆₀₀ of 0.6-1. The culture was then induced for 3 h with 0.4 mM isopropyl-1-thio-β-D-galactopyranoside. Cells were subsequently harvested by centrifugation at 3000 × g for 10 min, and the pellet was frozen at -20 °C for 30 min. After resuspension in 0.2 ml of a solution containing 20 mM Tris-HCl, pH 8.0, 20 mM EDTA, and 10 µl of 10 mg/ml lysozyme, the cells were lysed by four cycles of freezing-thawing in liquid nitrogen and water bath, respectively. To these cell lysates were added 10 µl of 0.2 M MgCl₂ and 5 µl of DNase I (5 mg/ml), and the solutions were incubated at 25 °C for 20 min. After addition of 20 µl of 0.2 M EDTA and 26 µl of 10% Triton X-100, the suspensions were incubated for an additional 10 min. Crude extracts (20 µl) were then subjected to SDS-PAGE and immunoblotting analysis.

For further purification and characterization, 500-ml cultures were grown. Isopropyl-1-thio-β-D-galactopyranoside-induced cells were centrifuged at 3,000 × g for 10 min and resuspended in 10 ml of buffer containing 10 mM sodium acetate, pH 4.8, 30 mM phosphate, 1 mM EDTA, and 60 mM NaCl. The E. coli cells were lysed using a French press. After centrifugation at 10,000 × g for 5 min, the supernatant was directly loaded onto a Sephadex G-50 size exclusion column (105 × 2 cm), which was equilibrated with the same buffer. Fractions with substantial phosphatase activity were pooled, concentrated, and analyzed by reverse phase HPLC.

SDS-PAGE and Immunoblotting-Crude extracts were analyzed by SDS-PAGE (29) using a 15% gel with Coomassie Brilliant Blue R-250 staining. Immunoblot studies of recombinant proteins (30) were performed using antisera prepared against human placenta and bovine heart low molecular weight phosphatases (5, 7). Rabbit immunoglobulin was then detected with peroxidase-conjugated goat anti-rabbit IgG and 125 ml of a solution containing 100 µl of 30% hydrogen peroxide plus 60 mg of 4-chloro-1-naphthol.

Enzymatic Activity of the Recombinant HCPTPs-Phosphatase activity was assayed as described earlier (5). Using p-nitrophenyl phosphate as a substrate, the activity assay was also conducted at 30 °C in a buffer containing 0.2 M sodium citrate, pH 6.0, 1 mM dithiothreitol, and 1% bovine serum albumin. A unit is defined as the amount of enzyme needed to produce 1 µmol of p-nitrophenol/min. The assay temperature was either 30 or 37 °C, as specified. Protein concentration was determined by the Lowry assay (31). The catalytic activities of the expressed enzymes towards synthetic substrates were determined using an inorganic phosphate assay (32).

Inhibition of HCPTP-A and HCPTP-B by Formaldehyde-Enzymatic activity was measured in the presence and absence of 13 mM formaldehyde. The 3.0-ml reaction mixture contained 10 mM p-nitrophenyl phosphate, 67 mM sodium citrate buffer, pH 5.75. The reaction was stopped after 30 min by transferring a 1.0-ml aliquot to 5.0 ml of 0.2 M NaOH and the absorbance of p-nitrophenoxide was measured at 415 nm.

Genomic DNA Studies-In order to test for the possible presence of introns in the DNA coding for HCPTPs, PCR amplifications were performed with human genomic DNA from various sources. In addition to human placental genomic DNA and whole blood, HindIII and EcoRI chromosome 2-specific genomic DNA libraries were also utilized. Several pairs of primers were employed. One set, i.e. primer 3: 5'-GTCCTGCATTTCTCAGTCG-3', and primer 4: 5'-CTAGACTGTGAGCTCTACG-3', was selected to span a 440-bp segment at the 3' noncoding regions of HCPTP clones. PCR was performed for 27 cycles. Following an initial denaturation at 94 °C for 5 min, the amplification was carried out with denaturation at 94 °C for 1 min, annealing at 50 °C for 2 min, and polymerization at 72 °C for 3 min. The PCR products were analyzed by agarose gel electrophoresis, and directly sequenced in the presence of 0.4% Nonidet P-40 without subcloning (33). When genomic DNA in whole blood was used as a template, the blood samples were pretreated. The reaction mixture included 1 µl of thawed blood that was denatured at 94 °C for 3 min and cooled at 55 °C for 3 min (34). This procedure was repeated two more times before adding Taq polymerase (2.5 units) and carrying out the amplification.

In order to obtain genomic sequences extending into the coding region, two synthetic oligonucleotides, i.e. primer 5: 5'-CTTCCGGGTATGAGATAGG-3' and primer 6: 5'-CCAAGAGCTGTGAGCTGCC-3', were used in conjunction with primer 4 for amplification. Primers 5 and 6 are sequences present in HCPTP1-1 and HCPTP2-1, respectively. Again, the chromosome 2 genomic libraries and also human placental genomic DNA were used, but 5% formamide was added to these reactions in order to enhance the specificity (35). Slightly modified cycling parameters were utilized to achieve a higher yield, i.e. initial denaturation at 90 °C for 5 min, followed by 25 cycles of denaturation at 90 °C for 1 min, annealing at 45 °C for 2 min, and extension at 69 °C for 2 min. The resulting DNA fragments were digested with PstI, subcloned into pUC18, and used for double-stranded sequencing. Similar amplifications were also performed using various combinations of other primers whose sequences were derived from the cDNA sequences of HCPTP clones.

Northern (RNA) Blot Analysis-The BamHI/NcoI DNA fragment of the vectors corresponding to the coding regions of HCPTP1-1 and HCPTP2-1 were labeled by a nick translation method and used separately as probes in RNA blot hybridizations. The membrane was prehybridized for 5 h at 42 °C in a solution containing 5 × SSPE (1 × SSPE is 0.15 M sodium chloride, 10 mM sodium phosphate, and 1 mM EDTA, pH 7.4), 10 × Denhardt's solution (1 × Denhardt's solution is 0.02% polyvinylpyrrolidone, 0.02% Ficoll, and 0.02% bovine serum albumin), denatured salmon sperm DNA (100 µg/ml), 50% deionized formamide, and 2% SDS. The hybridization was carried out in the same solution containing 2 × 10⁶ cpm/ml of radiolabeled DNA probe for 18-24 h at 42 °C. After hybridization, the blot was rinsed first with a solution of 2 × SSC, 0.05% SDS several times at room temperature and then with 0.1 × SSC, 0.1% SDS for stringent washing at 50 °C for 1 h. The autoradiogram was obtained by exposing the blot to Kodak XAR-5 film at -70 °C using two intensifying screens. Prior to hybridization using another probe, the blot was boiled in sterile H₂O for 20 min to remove the DNA probe.

RESULTS

Amino Acid Sequence Analysis of the Human Placental Enzyme-Shown in Fig. 1 is the complete amino acid sequence of the human placental enzyme as deduced by a combination of automated Edman degradation and tandem mass spectrometry. Extensive sets of overlapping peptides were obtained throughout.

A collision-activated dissociation mass spectrum recorded on the (M + H)+ ion (m/z 689) from one of the tryptic peptides afforded the sequence -EQATK. The mass spectrum specified the mass of the unidentified residue as 114 Da, a number consistent with either Ile/Leu or N-acetyl-Ala. To differentiate these possibilities, the sample was acetylated and then

           10         20         30         40
Ac-AEQATKSVXF VCXGNXCRSP XAEAYFRKXV TDQNXSENWV
           50         60         70         80
   XDSGAVSDWN VCRSPDPRAV SCLRNHGXHT AHKARQXTKE
           90        100        110        120
   DFATFDYXXC MDESNLRDLN RKSNQVKTCK AKIELXGSYD
          130        140        150
   PQKQXXXEDP YYCNDSDFET VYQQCVRCCR AFLEKAH

[Peptide map: the bars beneath the sequence, which mark each peptide with its cleavage method and mass as explained in the legend, are not recoverable from the extracted text.]

FIG. 1. Amino acid sequence of authentic placental tissue enzyme as determined by a combination of tandem mass spectrometry and Edman degradation. Peptides are labeled according to the method used to generate them and the technique used to sequence them. Proteolytic and chemical cleavage methods are designated as follows: K, endoprotease Lys-C; T, trypsin; C, α-chymotrypsin; D, endoprotease Asp-N; E, endoprotease Glu-C; El, elastase; M, cyanogen bromide. Peptides sequenced by Edman degradation are labeled Edman. Unidentified residues are designated by dots. Peptides sequenced by the technique of collision-activated dissociation on the triple quadrupole mass spectrometer are labeled by either the nominal monoisotopic mass of the corresponding (M + H)+ ion, if particle bombardment with cesium ions was employed to ionize the sample, or by the rounded mass and charge of the ion subjected to collision-activated dissociation, if electrospray ionization was employed to ionize the sample. Unidentified residues are indicated by dots. Sequences confirmed by enzyme specificity and molecular mass measurements are indicated by dashed lines and the rounded average mass value of the corresponding peptide (M + H)+ ion. The FT label indicates that the mass measurement was recorded under particle bombardment conditions on a Fourier transform instrument. The letter X refers to either Leu or Ile, 2 amino acids of identical mass that cannot be differentiated on the triple quadrupole mass spectrometer. a, acetylated peptide. b, contains carboxyamidated cysteine.
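The mass labels described in the legend, and the N-terminal ambiguity discussed in the text (Leu/Ile versus N-acetyl-Ala), can be illustrated with a short calculation. The following is a minimal sketch, not the authors' procedure: the residue masses are standard monoisotopic values, the function names are hypothetical, and the acetylation shift is taken as one acetyl group (about 42 Da) per free amino group, as stated in the text.

```python
# Monoisotopic residue masses (Da) for the standard amino acids.
# "X" is used, as in Fig. 1, for Leu/Ile, which have identical mass.
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "X": 113.08406, "N": 114.04293, "D": 115.02694,
    "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333,
    "W": 186.07931,
}
WATER = 18.01056   # H2O for the free N and C termini
PROTON = 1.00728   # mass of one added proton
ACETYL = 42.01057  # C2H2O added per acetylated amino group


def neutral_mass(seq, n_acetyl=False):
    """Monoisotopic mass of the neutral peptide (optionally N-acetylated)."""
    mass = sum(RESIDUE[aa] for aa in seq) + WATER
    return mass + ACETYL if n_acetyl else mass


def mz(seq, charge=1, n_acetyl=False):
    """m/z of the (M + nH)n+ ion for the given charge state n."""
    return (neutral_mass(seq, n_acetyl) + charge * PROTON) / charge


def free_amino_groups(seq, n_acetyl=False):
    """Amino groups open to acetylation: N terminus (unless blocked) plus Lys."""
    return seq.count("K") + (0 if n_acetyl else 1)


if __name__ == "__main__":
    # The two candidate structures for the tryptic peptide "-EQATK".
    for label, seq, blocked in (("Ac-AEQATK", "AEQATK", True),
                                ("XEQATK", "XEQATK", False)):
        shift = ACETYL * free_amino_groups(seq, blocked)
        print(f"{label}: (M + H)+ = {mz(seq, 1, blocked):.2f}, "
              f"predicted shift on acetylation = {shift:.0f} Da")
    # A doubly charged electrospray ion, labeled by rounded m/z and charge.
    print(f"(M + 2H)2+ = {mz('VDSAATSGYEXGNPPDYR', 2):.2f}")
```

Both candidates give an (M + H)+ ion near m/z 689, which is why the intact mass alone cannot separate them on a unit-resolution instrument, whereas the predicted acetylation shifts differ (one amino group versus two). The last line uses a tryptic peptide that the text reports as T-956++.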

reanalyzed by tandem mass spectrometry. Acetylation of a peptide shifts the corresponding (M + H)+ ion to higher mass by 42 Da per amino group in the molecule. Potential amino groups include an unblocked N terminus and the ε-amino group in the side chain of Lys. Treatment of the peptide with acetic anhydride caused the (M + H)+ ion to shift to higher mass by 42 Da. We conclude that the peptide contains a single amino group (on the side chain of the C-terminal Lys residue) and that the N terminus is N-acetyl-Ala rather than Ile or Leu.

Residues 6-47 in the sequence were identified from collision-activated dissociation mass spectra recorded on (M + H)+ ions from several tryptic and chymotryptic peptides. Information to overlap these peptides was provided by a molecular weight determination on a large tryptic peptide (residues 29-53) and by sequence analysis of two peptides obtained in poor yield by subdigestion of a large insoluble Glu-C fragment with elastase. Edman degradation of a piece produced by simultaneous digestion of the protein with Asp-N and Lys-C generated most of the sequence for residues 48-73. Collision-activated dissociation spectra recorded on (M + H)+ ions from several elastase, trypsin, and chymotrypsin fragments identified missing residues from blank cycles in the Edman degradation and extended the sequence through residue 82.

The remainder of the placental enzyme sequence was generated largely by a combination of data generated from automated Edman degradation and collision-activated dissociation of multicharged ions from peptides ionized by electrospray ionization and introduced directly into the mass spectrometer from a microcapillary HPLC column (23). By using the latter methodology, many peptides containing more than 20 residues could be sequenced at the 5-10 pmol level. Data from collision-activated dissociation mass spectra of three additional triply charged ions plus automated Edman degradation of two large fragments facilitated assignment of all but the last 2 residues in the region from residue 99 to 157. Information to extend the sequence to the C terminus of the protein was obtained from a collision-activated dissociation mass spectrum recorded on the (M + H)+ ion (m/z 744) of a 6-residue peptide generated in an elastase subdigest of a large Glu-C fragment of the placental enzyme. The first 4 residues of this peptide proved to be identical to the established sequence for residues 152-155. Since the C-terminal residue of this peptide, His, is not usually a cleavage site for either Glu-C or elastase, residue 157 was considered to be a likely candidate for the C terminus of the whole protein. Molecular mass measurements on the protein confirmed this hypothesis. The mass of the intact molecule determined from an electrospray ionization mass spectrum on the placental enzyme was 17,888 ± 8. The predicted average molecular mass for residues 1-157 of the assigned primary structure given in Fig. 1 is 17,889.

The sequence given in Fig. 1 represents that of the major molecular form present in the placental enzyme preparation. However, a number of seemingly related but nonidentical peptides were found, apparently as contaminants at the level of approximately 10% of the amounts of the major peptides. Sequences of several of these peptides could be obtained.
They were: T-1475, XVTDQNXSENWR; T-956++, VDSAATSGYEXGNPPDYR; T-1104, HGXPMSHVAR; and C-1749, SHVARQXTKEDFATF. Subsequent investigations revealed that they corresponded perfectly to sequences predicted from a second type of isolated cDNA clone (see below).

Since initial mass and sequence information recorded for peptides of the red cell enzyme indicated that its structure was similar or identical to that of the placental enzyme, not every peptide was sequenced. Shown in Fig. 2 is the electrospray ionization collision-activated dissociation mass spectrum recorded on (M + 3H)3+ ions from the tryptic peptide containing residues 76-97 of the enzyme isolated from red cells. (An identical sequence was obtained for the corresponding peptide of the placental enzyme; see Fig. 1.) Predicted monoisotopic masses for fragment ions of type b and y are shown above and below the sequence in Fig. 2; those observed are underlined. Amino acids Ile and Leu are assigned in Fig. 2 as Lxx because they have the same mass and cannot be distinguished on the triple quadrupole instrument. Fragment ions of type y all contain the C terminus of the peptide plus 1, 2, 3, etc. additional residues (22). Subtraction of m/z values for any two fragments that differ by a single amino acid, NHCH(R)CO, and have the same number of charges generates a value that specifies the mass and thus the identity of the extra residue in the larger fragment. Ions of type y dominate the low mass end of the spectrum in Fig. 2 and allow the amino acid sequence to be read from residue 22 back to residue 9 in the sequence. Ions of type b provide additional sequence information. These all contain the N terminus of the peptide plus 1, 2, 3, etc. additional residues (23). Subtraction of m/z values for any two fragments that differ by a single amino acid, NHCH(R)CO, and have the same number

[Fig. 2 graphic: the peptide sequence annotated with predicted b- and y-ion masses, above the spectrum plotted as % relative abundance versus m/z.]

phosphatase (BHPTP) clone.* After 3 x 10' λgt11 recombinants were screened, 10 putative positive clones were obtained. An initial restriction enzyme analysis indicated that 7 of these clones contained partial or full-length coding sequence. The longest clone, HCPTP1-1, was found to have a 1.5-kilobase insert. Nucleotide sequence analysis showed that HCPTP1-1 contained an open reading frame of 474 nucleotides, coding for a protein of 158 amino acid residues. The open reading frame was terminated by a stop codon (TGA) at nucleotide 515. As shown in Fig. 3A, HCPTP1-1 also includes a 39-bp GC-rich 5'-noncoding sequence and a 996-bp 3'-untranslated region. A putative polyadenylation signal, AATAAA, is present 32 nucleotides upstream of the polyadenylation tract in clone HCPTP1-1. A shorter clone, HCPTP1-2, lacked the 5'-noncoding region and initiation codon (ATG) of HCPTP1-1, but did contain an otherwise complete and identical coding sequence as well as the 3' end region. Except for HCPTP2-1 (to be described below), the remainder of the shorter clones were found to be partial-length clones containing portions of the open reading frame, but with identical 3' regions, and having sequences that were identical to those of HCPTP1-1 and HCPTP1-2. However, clone HCPTP2-1 differed in a remarkable way. Clone HCPTP2-1 had a distinctive 108-bp sequence in the


FIG. 2. Collision-activated dissociation mass spectrum recorded on (M + 3H)3+ ions (m/z 903) produced from residues 76-97 of the red cell enzyme. An identical sequence was derived for the corresponding placental enzyme peptide; see Fig. 1. Sample (10 pmol) from a Glu-C digest of a large tryptic fragment was separated by microcapillary HPLC and introduced directly to the electrospray ionization source on the Finnigan TSQ-70 triple quadrupole mass spectrometer. Predicted monoisotopic masses for singly and doubly charged fragments of types b and y (22) derived from the deduced sequence are shown above and below the sequence at the top of the figure; those observed in the spectrum are underlined. The abbreviation Lxx refers to either Leu or Ile, 2 amino acids of identical mass that cannot be differentiated on the triple quadrupole instrument.
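The fragment-ion arithmetic behind a spectrum of this kind can be illustrated with a minimal Python sketch. It is not the authors' procedure or software; it only computes singly charged b and y ions from standard monoisotopic residue masses. The example peptide is T-1104 from the text, with X read as Leu (an assumption, since Leu and Ile have the same mass), and all function names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values); only the residues
# needed for the example peptide are listed.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "M": 131.04049, "H": 137.05891,
    "R": 156.10111,
}
WATER = 18.01056   # H2O carried by the intact peptide and by y ions
PROTON = 1.00728   # mass of one ionizing proton


def precursor_mh(peptide):
    """Singly protonated precursor, (M + H)+."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER + PROTON


def b_ions(peptide):
    """Singly charged b1..b(n-1): fragments containing the N terminus."""
    masses, total = [], PROTON
    for residue in peptide[:-1]:
        total += RESIDUE_MASS[residue]
        masses.append(total)
    return masses


def y_ions(peptide):
    """Singly charged y1..y(n-1): fragments containing the C terminus."""
    masses, total = [], WATER + PROTON
    for residue in reversed(peptide[1:]):
        total += RESIDUE_MASS[residue]
        masses.append(total)
    return masses


def residue_from_delta(delta, tolerance=0.01):
    """Name the residue whose mass matches the gap between adjacent ions."""
    for residue, mass in RESIDUE_MASS.items():
        if abs(mass - delta) <= tolerance:
            return residue
    return None


# Hypothetical choice: T-1104 (HGXPMSHVAR) with X taken as Leu.
peptide = "HGLPMSHVAR"
n = len(peptide)
b, y = b_ions(peptide), y_ions(peptide)

# Differences between successive y ions spell the sequence from the
# C-terminal side inward (the C-terminal residue itself is y1).
read_back = [residue_from_delta(y[i] - y[i - 1]) for i in range(1, len(y))]
print("".join(read_back))

# Complementary ions: b3 + y7 equals (M + H)+ plus one proton (about 1 Da).
print(round(b[2] + y[n - 4], 3), round(precursor_mh(peptide) + PROTON, 3))
```

With these masses the computed (M + H)+ of the example peptide is close to 1104.6, which happens to match the number in its label, although the text does not state what the label denotes. Doubly charged fragments would need the charge included in the m/z calculation, which this sketch leaves out.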

of charges generates a value that specifies the mass and thus the identity of the extra residue in the larger fragment. Cleavage of a particular amide bond in the peptide can give rise to either or both b and y ions. If one is observed, the other can be calculated by subtracting the mass of the observed fragment from that of the (M + H)+ ion and then adding 1 Da. Abundant b ions having the predicted m/z values for N-terminal fragments terminating with residues 10-14 are observed in the spectrum. Additional members of this ion series allow the sequence to be read back to Lys at position 4. The absence of additional ions of type b and y at the low and high mass ends of the spectrum, respectively, prevents further residue assignments at the N terminus of the peptide, a region found earlier in a chymotryptic peptide. Note that the spectrum in Fig. 2 contains doubly charged fragments of type b that allow the sequence to be read from residue 12 to 22 and thus provide confirmation of the sequence deduced from ions of type y.

Cloning of HCPTP. Mass spectrometric sequence analysis quickly revealed that the human placental enzyme sequence was similar to that of the bovine heart enzyme. Based on this similarity, it proved to be convenient to screen a human placental cDNA library using a DNA fragment encompassing the entire coding region of a bovine heart phosphotyrosyl

[Fig. 3 graphic: A, nucleotide sequence of HCPTP1-1 with the deduced amino acid sequence of HCPTP-A; B, aligned nucleotide and deduced amino acid sequences of the nonidentical region (nucleotides 154-273) of HCPTP1-1 and HCPTP2-1, with the HCPTP-A and HCPTP-B protein sequences.]

FIG. 3. A, complete nucleotide sequence of HCPTP1-1 and the inferred amino acid sequence corresponding to the open reading frame of the cDNA encoding HCPTP-A. Nucleotides and amino acid residues are numbered on the right and left, respectively. Amino acids are numbered beginning with methionine rather than the alanine that is observed in the mature protein. The asterisk represents the termination site of the translated sequence. B, nucleotide and deduced amino acid sequence for the nonidentical region of clones HCPTP1-1 and HCPTP2-1; the respective protein sequences of HCPTP-A and HCPTP-B are also indicated. A line indicates identical amino acids, while two dots indicate a conservative substitution.
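The comparison convention used in panel B (a line for identical residues, two dots for a conservative substitution) and a percent-identity figure can be sketched in Python as follows. This is only an illustration under stated assumptions: the conservative-substitution groups are hypothetical, since the caption does not define them, and the two sequences are toy strings rather than the HCPTP-A and HCPTP-B sequences.

```python
# Hypothetical conservative-substitution groups; the caption does not say
# which substitutions the authors counted as conservative.
CONSERVATIVE_GROUPS = ("ILVM", "DE", "KR", "ST", "FYW", "NQ")


def is_conservative(x, y):
    """True if two different residues fall in the same assumed group."""
    return any(x in group and y in group for group in CONSERVATIVE_GROUPS)


def percent_identity(a, b):
    """Percentage of aligned positions that are identical (no gaps handled)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def marker_line(a, b):
    """'|' for identity, ':' for a conservative substitution, ' ' otherwise."""
    marks = []
    for x, y in zip(a, b):
        if x == y:
            marks.append("|")
        elif is_conservative(x, y):
            marks.append(":")
        else:
            marks.append(" ")
    return "".join(marks)


# Toy sequences for illustration only; not taken from the figure.
seq_a = "MKTDLE"
seq_b = "MRTELQ"

print(seq_a)
print(marker_line(seq_a, seq_b))
print(seq_b)
print(percent_identity(seq_a, seq_b))
```

The same identity function applies to nucleotide strings. By this measure, 52% identity over a 108-bp segment corresponds to roughly 56 matching positions.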


middle of the open reading frame that was only 52% identical to that of the corresponding segment of HCPTP1-1 (Fig. 3B). In addition, the 5'-noncoding sequence and the codons for the first 2 amino acids were absent in HCPTP2-1. However, it did possess a completely identical 3' end, and the remainder of the translated region was identical, with the exception of the segment in the middle. Schematic diagrams for the structures of these cDNA clones, along with their restriction maps, are shown in Fig. 4. In an attempt to isolate a full-length HCPTP2 clone which contained a 5'-noncoding sequence, labeled primer 6 was used to rescreen the library. Numerous additional HCPTP1 and HCPTP2 clones were isolated, but unfortunately, they all possessed a truncated 5' end region. Although the complete N-terminal amino acid sequence could not be determined from the sequence of HCPTP2-1, that information was obtained by mass spectrometric analysis of authentic enzyme isolated from placenta and erythrocytes. Except for the ambiguity between Ile and Leu, the complete

nitrophenyl phosphate as a substrate. The expression of the native enzyme used the pET-11d expression vector, as described under "Experimental Procedures." The expression of HCPTPs was under the control of the T7 promoter, while the expression of T7 polymerase was controlled by the lac promoter. In constructing the expression vector, primer 1 provided not only an NcoI restriction site and initiation codon (ATG) but also codons for the first 2 amino acids of HCPTP2-1. SDS-PAGE analysis of total proteins from cultures of transformed E. coli HMS174(DE3) showed that only plasmids containing the HCPTP inserts caused the production of HCPTP, producing the expected prominent 18-kDa protein band (Fig. 5A). The recombinant HCPTP-A and HCPTP-B proteins constituted approximately 20% of the soluble cell proteins, as assessed for crude extracts by using SDS-PAGE. The gel also showed that only relatively small amounts of low molecular weight bacterial proteins (M,