
Proc. Natl. Acad. Sci. USA Vol. 87, pp. 2628-2632, April 1990 Biochemistry

Amino acid and cDNA sequences of a vascular endothelial cell mitogen that is homologous to platelet-derived growth factor (protein growth factor/evolution/neurobiology/cancer)

GREG CONN, MARVIN L. BAYNE, DENIS D. SODERMAN, PERRY W. KWOK, KATHLEEN A. SULLIVAN, THOMAS M. PALISI, DEBRA A. HOPE, AND KENNETH A. THOMAS* Department of Biochemistry, Merck Sharp & Dohme Research Laboratories, Rahway, NJ 07065

Communicated by P. Roy Vagelos, January 2, 1990

ABSTRACT Glioma-derived vascular endothelial cell growth factor (GD-VEGF) is a 46-kDa dimeric glycoprotein mitogen with apparently greater specificity for vascular endothelial cells than the well-characterized fibroblast growth factors. The GD-VEGF cDNA sequence encodes a 190-amino acid residue subunit that is converted, by removal of an amino-terminal hydrophobic secretory leader sequence, to the mature 164-residue subunit characterized by direct amino acid sequencing. The GD-VEGF homodimeric subunit is homologous to the platelet-derived growth factor A and B chains and its oncogene homologue v-sis.

GD-VEGF as described (5). Purity was confirmed on all samples by polyacrylamide gel electrophoresis in sodium dodecyl sulfate. Aliquots (1-2 μg) of the purified protein, quantitated by using an extinction coefficient based on amino acid analysis (5), were reduced and carboxymethylated with iodo[2-14C]acetic acid as described (8). The 14C-carboxymethylated GD-VEGF product was repurified on a 4.6 mm x 5 cm Vydac C4 reversed-phase HPLC column by elution at 20°C with a linear gradient of 0-67% acetonitrile in 0.1% trifluoroacetic acid over 30 min at a flow rate of 0.75 ml/min.

Enzymatic Digestion and Polypeptide Purification. Reduced and carboxymethylated GD-VEGF (725 ng) was digested on the carboxyl-terminal side of most lysine and arginine residues with 30 ng of L-1-tosylamido-2-phenylethyl chloromethyl ketone-treated bovine pancreatic trypsin (Worthington) in 200 μl of 0.1 M ammonium bicarbonate (pH 8.3) for 6 hr at 37°C. The polypeptide digestion mixture was loaded directly on a 4.6 mm x 25 cm Vydac C18 reversed-phase column and fractionated by elution at 20°C with a 0-67% (vol/vol) linear gradient of acetonitrile in 0.1% trifluoroacetic acid over 2 hr at a flow rate of 0.75 ml/min. Polypeptide peaks were identified by monitoring A210 and individually collected. A similar digest was performed on 925 ng of carboxymethylated GD-VEGF by using 50 ng of the lysine-specific Lys-C endoproteinase (Boehringer Mannheim) in 50 μl of 0.1 M Tris, pH 8.5/1 mM EDTA at 37°C for 8 hr. Polypeptide products, the result of cleavage on the carboxyl-terminal side of lysine residues, were purified as described for the tryptic digest. A final enzymatic digestion was done on 1.1 μg of carboxymethylated GD-VEGF by using 65 ng of Staphylococcus aureus V8 protease (Miles). The substrate was dissolved in 5 μl of 6 M guanidinium chloride/0.1% EDTA buffered with 0.7 M Tris to pH 7.8. The protease was added in 65 μl of 0.1% EDTA buffered to pH 8.0 with 0.1 M ammonium bicarbonate. The digest was incubated at 37°C for 48 hr, and the polypeptides, generated primarily by cleavage on the carboxyl-terminal side of glutamic acid residues, were purified on a C18 reversed-phase HPLC column as described above.

Cyanogen Bromide Chemical Cleavage. Prior to cleavage by cyanogen bromide, 1.3 μg of reduced and carboxymethylated protein was treated with 2 M dithiothreitol at a final pH of 6.8 for 29 hr at 39°C to reduce any methionine sulfoxide to methionine (9). The product was repurified on a C4 reversed-phase HPLC column as previously described, evaporated to dryness, redissolved in 200 μl of 40 mM cyanogen bromide

Vascular endothelial cell mitogenesis is required for sustained blood vessel growth, or angiogenesis. The physiological role of the most potent well-characterized endothelial cell growth factors, members of the heparin-binding fibroblast growth factor (FGF) family, is not entirely clear, since they are mitogenic for a broad spectrum of mesodermal and ectodermal cells. Furthermore, neither of the two prototypic FGFs, acidic and basic FGF, has an identifiable leader sequence, so they might only be passively released from damaged or dead cells (1-3). Likewise, a 45-kDa platelet-derived endothelial cell growth factor has been identified and characterized that also lacks a recognizable secretory leader sequence (4). Therefore, although these leaderless growth factors could support vascular endothelial cell mitogenesis resulting from programmed cell death during embryonic development and cellular lysis as a consequence of tissue injury, they might not participate either in angiogenesis associated with normal growth or in maintenance of a viable quiescent endothelium.

Recently, we and others have identified a class of mitogens with apparently restricted specificity for vascular endothelial cells. These ≈46-kDa dimeric vascular endothelial cell growth factors (VEGFs) have been purified from conditioned medium of a rat glioma (GD-VEGF; ref. 5) and bovine pituitary folliculo stellate cells (VEGF/folliculo stellate-derived growth factor; refs. 6 and 7). Based on mass, subunit structure, and vascular endothelial cell specificity, these mitogens appear to be members of a distinct family of protein growth factors. We report here the complete cDNA† and amino acid sequence of rat GD-VEGF. The sequence confirms its identification as a secretory homodimeric glycoprotein and reveals an unexpected homology to platelet-derived growth factor (PDGF), a mitogen for connective tissue cells but not vascular endothelial cells from large vessels.

Abbreviations: VEGF, vascular endothelial cell growth factor; GD-VEGF, glioma-derived VEGF; FGF, fibroblast growth factor; PDGF, platelet-derived growth factor; PCR, polymerase chain reaction; RACE, rapid amplification of cDNA ends.
*To whom reprint requests should be addressed at: Merck Sharp & Dohme Research Laboratories, Room 80W-243, P.O. Box 2000, Rahway, NJ 07065.
†The sequence reported in this paper has been deposited in the GenBank data base (accession no. M32167).

MATERIALS AND METHODS Reduction and Carboxymethylation of Purified Rat GD-VEGF. Rat GS-9L-conditioned media were used to purify

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.


(Sigma) in 70% formic acid (Mallinckrodt), and incubated in the dark under argon for 24 hr at 20°C. The cleavage products were fractionated on a C18 reversed-phase HPLC column as described for the enzymatic cleavage products.

Amino Acid Microsequence Determination. Amino acid sequencing was performed on an Applied Biosystems 470A microsequencer equipped with an Applied Biosystems 120A phenylthiohydantoin analyzer. Native GD-VEGF (200 ng, neither reduced nor carboxymethylated) was adsorbed onto an Immobilon-P polyvinylidene difluoride membrane for amino-terminal sequence determination. The purified polypeptide cleavage products were adsorbed onto a glass fiber filter coated with Polybrene (Pierce). Approximately 40% of the phenylthiohydantoin-amino acid product from each cycle was injected into the 120A phenylthiohydantoin detector for identification. Data storage and quantitative integration were performed with a Nelson analytical system. The remaining 60% was used to confirm identification of phenylthiohydantoin-[14C]carboxymethylcysteine by direct liquid scintillation counting. Most amino acid microsequence determinations yielded ≈10 pmol of free amino terminus.

Polymerase Chain Reaction (PCR) Amplification of GD-VEGF cDNA. Oligonucleotide primers for PCR amplification (10) were synthesized on an Applied Biosystems 381A synthesizer and purified on OPC columns (Applied Biosystems). Poly(A)+ RNA was isolated from GS-9L cells by using a Fast Track RNA isolation kit (Invitrogen, San Diego). First-strand cDNA template was prepared from 1 μg of the GS-9L poly(A)+ RNA primed with the oligonucleotide 5'-GACTCGAGTCGACATCGATTTTTTTTTTTTTTTTT-3' by using the Riboclone cDNA synthesis system (Promega).
A degenerate PCR sense primer, 5'-TTTGTCGACTTYATGGAYGTNTAYCA-3', in which Y = T or C and N = any nucleotide, based on amino acid sequence from peptide L42 (residues 42-47 of the complete sequence), and a degenerate PCR antisense primer, 5'-CAGAGAATTCGTCGACARTCNGTRTTYTTRCA-3', in which R = G or A, based on peptide T38 (residues 164-168), were used to amplify the central region of the GD-VEGF cDNA in a 100-μl reaction mixture containing 2 ng of first-strand cDNA, 50 pmol of each primer, and 2.5 units of Amplitaq polymerase (Perkin-Elmer/Cetus). The samples were placed in an automated heating/cooling block (DNA Thermal Cycler, Perkin-Elmer/Cetus) programmed for 40 cycles (1.0 min at 94°C, 2.5 min at 50°C, and 2.0 min at 72°C), followed by a single 10-min extension at 72°C. The 3' end of the GD-VEGF cDNA was amplified by the 3' RACE (rapid amplification of cDNA ends) technique as described by Frohman et al. (11). Two sense-strand oligonucleotides, 5'-TTTGTCGACGAAAATCACTGTGAGC-3' and 5'-TTTGTCGACTCAGAGCGGAGAAAGC-3', were synthesized based on the exact coding sequence of amino acid residues 139-144 and 146-151, respectively, as determined from the first PCR product and used as nested 3'-end PCR primers. These were used in combination with the antisense adapter primer 5'-GACTCGAGTCGACATCG-3' with the oligo(dT)-primed GS-9L first-strand cDNA as template. Reaction conditions were as described above with the addition of an initial cycle of 1.0 min at 94°C, 2.5 min at 50°C, and 40 min at 72°C. The 5' end of the GD-VEGF cDNA was amplified by the 5' RACE protocol as described by Frohman et al. (11). The cDNA template was prepared from 1 μg of GS-9L poly(A)+ RNA specifically primed with the antisense oligonucleotide 5'-CTTCATCATTGCAGCAGC-3' based on the exact coding sequence of amino acid residues 82-89 as determined from the first PCR product. The 5' end of the first-strand cDNA was tailed with dATP and terminal transferase.
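The relationship between the two degenerate primers and the peptide sequences they were designed from can be checked with a short sketch. It assumes that the leading TTTGTCGAC and CAGAGAATTCGTCGAC stretches are non-coding restriction-site tails (GTCGAC is the Sal I site used later for subcloning) and that only the remaining bases are coding; that split, and all function names, are illustrative assumptions rather than anything stated in the text.

```python
from itertools import product

# IUPAC codes that appear in the primers: Y = T or C, R = G or A, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "R": "AG", "N": "ACGT"}
# Complement of each code (R and Y complement each other, N stays N).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "Y": "R", "R": "Y", "N": "N"}

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def degeneracy(primer):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


def expand(primer):
    """List every concrete sequence in the degenerate pool."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer))]


def reverse_complement(primer):
    """Reverse complement that keeps degenerate codes degenerate."""
    return "".join(COMPLEMENT[b] for b in reversed(primer))


def encoded_peptides(coding):
    """Peptides (one-letter code) encoded by the complete codons of all variants."""
    peptides = set()
    for seq in expand(coding):
        usable = len(seq) - len(seq) % 3
        peptides.add("".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3)))
    return peptides


# Assumed coding portions of the two primers (tails removed).
SENSE_CODING = "TTYATGGAYGTNTAYCA"
ANTISENSE_CODING = "ARTCNGTRTTYTTRCA"

print(degeneracy(SENSE_CODING), encoded_peptides(SENSE_CODING))
print(degeneracy(ANTISENSE_CODING),
      encoded_peptides(reverse_complement(ANTISENSE_CODING)))
```

Under these assumptions every member of the sense pool encodes Phe-Met-Asp-Val-Tyr followed by the first two bases of a Gln codon, and the reverse complement of every member of the antisense pool encodes Cys-Lys-Asn-Thr-Asp.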
Two antisense-strand oligonucleotides, 5'-TTTGTCGACAACACAGGACGGCTTGAAG-3' and 5'-TTTGTCGACATACTCCTGGAAGATGTCC-3', based on the exact coding sequences for amino acids 71-76 and 59-64, respectively, were designed as nested PCR primers and used in conjunction with the oligo(dT) sense-adapter primer 5'-GACTCGAGTCGACATCGATTTTTTTTTTTTTTTTT-3' and sense adapter 5'-GACTCGAGTCGACATCG-3'. Reaction conditions were as described above for the 3' RACE.

Subcloning and Sequencing. PCR products from 1-ml reaction mixtures were concentrated by Centricon 100 spin columns (Amicon) and purified by agarose gel electrophoresis. Following digestion with Sal I, the fragments were cloned into the Sal I site of pGEM3Zf(+) (Promega). Insert-positive plasmids were sequenced as double-stranded templates by the dideoxynucleotide chain-termination method (12) by using Sequenase (United States Biochemical).

Amino Acid Sequence Homology Evaluation. The homology between GD-VEGF and the PDGF A and B chains was recognized by direct visual comparison, and its significance was evaluated by the Needleman and Wunsch algorithm (13, 14). A family of very similar alignments was generated with a PAM (accepted point mutation) matrix of 192 and gap penalty and bias values ranging from 10 to 30.
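For orientation, the global alignment procedure of Needleman and Wunsch can be sketched in a few lines. This is a minimal illustration only: it scores with a flat match/mismatch/gap scheme, which is a simplifying assumption, whereas the evaluation described above used a PAM matrix together with gap penalty and bias values. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of sequences a and b.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Dynamic-programming matrix; row 0 and column 0 hold leading-gap scores.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))
```

Replacing the flat scores with a lookup into a substitution matrix changes only the `sub` helper; the recurrence and traceback stay the same.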

RESULTS Amino Acid Sequencing. The partial primary structure of GD-VEGF, determined from sequences of the amino terminus and polypeptides recovered after chemical cleavage with cyanogen bromide (CB on Fig. 1) and individual enzymatic digestions with trypsin (T), Lys-C endoproteinase (L), and Staphylococcus aureus V8 protease (V), is shown in Fig. 1. The single amino terminus observed in the dimeric protein was confirmed by identical amino termini from tryptic (T27), V8 protease (V11A), and cyanogen bromide (CB26) peptides. Direct amino acid sequencing, based on a total of 5 μg of protein, was used to identify 143 residues and to confirm an additional 16 residues subsequently identified by cDNA sequencing.

Cloning and Sequencing of PCR-Amplified cDNA. Degenerate PCR primers based on the amino acid sequence Phe-Met-Asp-Val-Tyr-Gln (residues 42-47) from polypeptide L42 and Cys-Lys-Asn-Thr-Asp (residues 164-168) from polypeptide T38 were used to amplify the central coding region of GD-VEGF cDNA. A single band migrating at ≈420 bp was gel-purified, digested with Sal I, ligated into Sal I-digested pGEM3Zf(+), and sequenced. Four independent clones gave the same DNA sequence excluding degeneracies in the primer sequences. The sequence designated p4238 in Fig. 1 codes for 116 amino acids excluding those encoded by the PCR primers. The sequence of p4238 was used to design unique sense-strand PCR primers to amplify the 3' end of the cDNA according to the protocol described by Frohman et al. (11). The sequence generated by three independent clones is designated pW-3. This sequence overlaps the sequence of p4238 and codes for an additional 27 amino acids followed by a TGA termination codon. The sequence of p4238 was also used to design unique antisense-strand PCR primers to amplify the 5' end of the cDNA. The sequence generated by three independent clones is designated p5-15. This sequence overlaps p4238 and codes for an additional 47 amino acids including those identified by direct amino-terminal protein microsequencing of the mature subunit. The 5' ends of the longest PCR products terminate 4 bp upstream from an ATG initiation codon. The combined sequences of p4238, pW-3, and p5-15 code for 190 amino acids (Fig. 1). A typical secretory leader sequence is present within the amino-terminal 26 amino acid residues. The amino terminus of the mature 164-residue secreted subunit begins at Ala-27. The amino acid composition and polypeptide mass (19.2 kDa) calculated for this


[Fig. 1 sequence listing: nucleotide sequence and deduced amino acid sequence of rat GD-VEGF (residues 1-190), with the sequenced peptides (T, L, V, and CB) and the cDNA clones p5-15, p4238, and pW-3 marked.]

FIG. 1. The amino acid and cDNA sequences of rat GD-VEGF. The full-length 190-amino acid residue translation product and its cDNA coding sequence are listed. The mature amino terminus begins at residue 27, immediately following a typical hydrophobic secretory leader sequence. A single potential N-glycosylation site exists at Asn-100. A total of 143 of the 164 amino acid residues of the reduced and carboxymethylated mature subunit, including the amino terminus and reversed-phase HPLC-purified products of tryptic (T), Lys-C endoproteinase (L), Staphylococcus aureus V8 protease (V8), and cyanogen bromide (CB) cleavages, were determined by direct microsequencing. All residues identified by amino acid sequencing are denoted below the sequence listing by single-headed arrows pointing to the right. Arrows immediately beneath the amino acid listing and to the right of the bracket before residue 27 mark the residues identified by sequencing the amino terminus of the whole subunit. Residues identified from the polypeptide cleavage products are indicated above the double-headed arrows spanning the lengths of the individual polypeptides. One listed pair of polypeptides, V18A and V18B, were sequenced as a mixture and, therefore, are only confirmatory of the cDNA-deduced amino acid sequence. The full-length coding region was determined from three sets of overlapping cDNA clones as described in the text. Each of the three DNA coding sequences (designated p5-15, p4238, and pW-3), excluding the primer regions at each end, is indicated by double-headed arrows above the corresponding nucleotide sequence.
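The peptide sets named in the legend follow from simple cleavage rules, which can be sketched as below. The rules are idealized: the text notes that trypsin cleaves after "most" lysine and arginine residues and that V8 protease cleaves "primarily" after glutamic acid, and cleavage by cyanogen bromide after methionine is the standard chemistry rather than something stated explicitly above. The example sequence is hypothetical and is not taken from GD-VEGF.

```python
# Residues after which each reagent is assumed to cleave (carboxyl-terminal side).
CLEAVAGE_RULES = {
    "trypsin": "KR",  # lysine and arginine
    "lys_c": "K",     # lysine only
    "v8": "E",        # glutamic acid
    "cnbr": "M",      # methionine (cyanogen bromide)
}


def digest(sequence, reagent):
    """Split a one-letter protein sequence after every residue the reagent recognizes."""
    sites = CLEAVAGE_RULES[reagent]
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in sites:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Carboxyl-terminal peptide that does not end at a cleavage site.
        peptides.append(sequence[start:])
    return peptides


toy = "AKGEMRLKDE"  # hypothetical sequence for illustration
for reagent in CLEAVAGE_RULES:
    print(reagent, digest(toy, reagent))
```

Because the four reagents cut at different residues, their fragments overlap one another, which is what allows the individual peptide sequences to be ordered along the subunit.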

cysteine-rich subunit agree well with our previously reported values (5). A single N-glycosylation site occurs at Asn-100.

The mass, dimeric glycoprotein structure, and cysteine abundance of GD-VEGF are reminiscent of PDGF, a mitogen for a variety of connective tissue cells (15-17). Comparison of the amino acid sequences of the mature forms of rat GD-VEGF and both the related A and B chains of human PDGF reveals a significant homology (6 SD from random similarity) as assessed with the Needleman and Wunsch algorithm (13, 14). The alignment (Fig. 2) was modified slightly to utilize information about the exon boundaries identified in the human PDGF A and B chain genes (18, 19). In the common regions of the mature subunits, ≈20% of the GD-VEGF residues are identical to the PDGF chains compared to nearly 60% identity between the PDGF A and B chains. The area of greatest homology between GD-VEGF and the PDGF polypeptide chains corresponds to the minimum transforming region of v-sis, the oncogene homologue of the PDGF B chain (20). This highly homologous region is encoded in the PDGFs by exons 4 and 5 (18, 19). Within this area, the amino-terminal eight cysteine residues of GD-VEGF align with the eight conserved cysteine residues that are common to both A and B chains. The remaining eight cysteine residues of GD-VEGF are all carboxyl-terminal to the eight homologous cysteines and occur in regions corresponding to the 3' end of exon 5 and all of exon 6. Exon 1 encodes the PDGF secretory leader sequences that are homologous not only to the leader of GD-VEGF but also, based on functional constraints, among many otherwise dissimilar proteins. In contrast, the region encoded by exon 2 appears to be either substantially truncated or entirely missing in GD-VEGF. The portions of PDGF polypeptides encoded by exon 3 show little, if any, homology either between each other or with GD-VEGF. The entire amino-terminal regions of the PDGF A and B chain primary translation products up to the carboxyl-terminal pairs of amino acid residues encoded by exon 3 are removed during proteolytic generation of the mature subunits.
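The two figures of merit quoted above, percent identity over the commonly aligned region and significance expressed in standard deviations from random similarity, can be illustrated as follows. This is a sketch under stated assumptions: identity is counted only over columns where both sequences have a residue, and significance is estimated by rescoring against shuffled copies of one sequence, a common procedure that is assumed here rather than documented in the text. The `score_fn` parameter and the simple ungapped scorer are hypothetical stand-ins for a full alignment score.

```python
import random
from statistics import mean, stdev


def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    common = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not common:
        return 0.0
    identical = sum(1 for x, y in common if x == y)
    return 100.0 * identical / len(common)


def shuffle_z_score(a, b, score_fn, n_shuffles=200, seed=0):
    """Distance of the real score from the shuffled-sequence mean, in SD units."""
    rng = random.Random(seed)
    real = score_fn(a, b)
    letters = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # same composition as b, randomized order
        shuffled_scores.append(score_fn(a, "".join(letters)))
    sd = stdev(shuffled_scores)
    if sd == 0:
        raise ValueError("shuffled scores have zero variance")
    return (real - mean(shuffled_scores)) / sd


def ungapped_identities(a, b):
    """Toy scorer: identical residues at matching positions, no gaps allowed."""
    return sum(1 for x, y in zip(a, b) if x == y)


print(percent_identity("ACDE-F", "ACGEHF"))
seq = "ACDEFGHIKLMNPQRSTVWY"  # hypothetical sequence for illustration
print(shuffle_z_score(seq, seq, ungapped_identities))
```

Shuffling preserves amino acid composition, so a large positive value indicates similarity that depends on residue order rather than on composition alone.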

[Fig. 2 alignment: amino acid sequences of rat GD-VEGF and the human PDGF B and A chains, with GD-VEGF residue numbering above the listing and the PDGF exon numbers and boundaries (1-7) below it.]

FIG. 2. Amino acid sequence homologies among rat GD-VEGF and the human PDGF A and B chains. The amino acid sequence numbering of GD-VEGF is given above its listing. Identical residues either between any two or among all three of the polypeptides are enclosed in boxes. The locations of the mature amino termini are identified by downward pointing arrowheads. Below the PDGF-A sequence is listed the numbers and boundaries of the PDGF exons. Either exon 6 or 7 (amino acids in brackets) is translated from alternatively spliced PDGF A chain mRNAs. The carboxyl-terminal three amino acids encoded by exon 7 of the A chain are translated following removal of the exon 6 termination codon. Although the three residues at the carboxyl terminus of GD-VEGF are aligned with the exon 7-encoded region of the A chain based on identity of the carboxyl-terminal arginine residues, this tripeptide might not be in a separate exon even if the gene structure of GD-VEGF resembles those of the PDGFs. Moreover, little homology exists between GD-VEGF and the PDGFs after Cys-129, so the entire alignment in this carboxyl-terminal region is somewhat arbitrary.

The exon 6 region is dispensable for PDGF mitogenic activity, since it is not included in the minimal transforming region of the v-sis B-chain homologue and is eliminated in endothelial cell-derived A chain by mRNA alternative splicing (21, 22). Although the function is unknown, these sixth exons encode very basic nuclear targeting regions in the B chain and the A chain from a glioma cell line (23, 24). The equivalent region of GD-VEGF has little, if any, homology with the PDGF chains and is substantially less basic.

DISCUSSION The observations that (i) GD-VEGF is composed of two equivalently sized subunits, (ii) only a single amino terminus is identified by direct microsequencing, and (iii) all of the many proteolytic fragments derived from it are contained within a single unique 164-residue region deduced from the cDNA cloning indicate that the protein is a homodimer. The cDNA sequence predicts a longer 190-residue precursor that contains 26 additional residues preceding the mature amino terminus. This amino-terminal extension contains a putative initiator methionine residue followed by a typical secretory leader sequence. An ACCATG Kozak sequence (25) places this methionine residue in a favorable context for initiation by eukaryotic ribosomes. Although an upstream termination codon has not been identified, this methionine residue appears to be an excellent candidate to serve as an initiator of translation. Since the most probable cleavage site for the secretory leader, predicted by the algorithm of von Heijne (26), occurs between residues 26 and 27, only a single proteolytic cleavage is probably required to generate the observed amino terminus. The termination codon following Arg-190 clearly identifies the carboxyl-terminal end of the protein. Since this arginine was identified by peptide sequencing, the generation of active protein does not apparently require processing at the carboxyl terminus. Therefore, no proteolysis other than the removal of the leader appears to be required to generate mature GD-VEGF subunits. Besides this limited proteolytic processing, the only recognized posttranslational modification of GD-VEGF is glycosylation. The previously observed lectin binding, presence of N-acetylglucosamine, and glycosidase-mediated decrease in apparent mass (5) are consistent with the existence of the N-glycosylation site at Asn-100. Moreover, the presence of a secretory leader and the observed glycosylation clearly identify GD-VEGF as a secretory protein, a property that might be expected of a physiologically important angiogenic growth factor.
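The residue bookkeeping behind these conclusions is simple enough to check mechanically. The sketch below treats the residue counts reported for the three sets of clones (47, 116, and 27) as non-overlapping contributions, which is an interpretation of the Results rather than an explicit statement, and it locates N-glycosylation sites with the general Asn-X-Ser/Thr rule (X not proline), which is standard biochemistry and not taken from the paper. The example fragment and all names are hypothetical.

```python
import re

# Figures quoted in the text.
PRECURSOR_LENGTH = 190  # residues encoded by the combined cDNA sequence
LEADER_LENGTH = 26      # secretory leader removed from the precursor

# Residues attributed to each set of clones in Results (assumed non-overlapping).
CLONE_RESIDUES = {"p5-15": 47, "p4238": 116, "pW-3": 27}


def mature_length(precursor_length, leader_length):
    """Length of the subunit left after removal of the leader."""
    return precursor_length - leader_length


# Asn-X-Ser/Thr sequon, X being any residue except proline.
# The lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_glycosylation_sites(sequence, first_residue=1):
    """Positions of candidate N-glycosylated Asn residues.

    first_residue is the numbering of sequence[0], so a fragment can be
    reported in the numbering of the full-length protein.
    """
    return [m.start() + first_residue for m in SEQUON.finditer(sequence)]


print(sum(CLONE_RESIDUES.values()), mature_length(PRECURSOR_LENGTH, LEADER_LENGTH))
fragment = "GSNVTMNPSA"  # hypothetical fragment for illustration
print(n_glycosylation_sites(fragment, first_residue=98))
```

In the toy fragment the second asparagine is followed by proline and is therefore not reported, which mirrors how a single potential site can remain in a sequence that contains several asparagine residues.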
The potential secretory nature of GD-VEGF is in marked contrast to the well-characterized endothelial cell mitogens acidic and basic FGFs. Although these proteins are potent mitogens for vascular endothelial cells and are angiogenic in vivo, they neither are specific for these cells nor contain recognizable secretory leader sequences. Thus they appear to be broad-spectrum mitogens that might only be released by cellular leakage and lysis associated with tissue damage.

GD-VEGF has distinct structural similarities to PDGF, a well-recognized mitogen for a variety of connective tissue cells. The area of maximum homology is found in the minimum transforming region of the PDGF structure. This section of sequence corresponds to the amino-terminal portion of GD-VEGF. All eight conserved cysteine residues common to the PDGF A and B chains are found in this minimum transforming region and are present in GD-VEGF. The additional eight cysteine residues are found in the carboxyl-terminal portion of GD-VEGF in a region of substantially less, if any, convincing homology. Since the PDGF A and B chains are substantially more similar to one another than to GD-VEGF, either they have diverged more recently or are evolving more rapidly. PDGF is mitogenic for fibroblasts and some cultures of microvascular endothelial cells but not those of macrovascular origin (27), whereas GD-VEGF induces mitosis of macrovascular endothelial cells but not fibroblasts (5-7). Therefore, not only the structures but also the functions of the PDGFs and GD-VEGF have clearly diverged.

After submission of this paper, two reports appeared describing the sequence analyses both of bovine and human VEGF (28) and of human vascular permeability factor (VPF), a protein identical to VEGF that was originally identified based on its ability to induce vascular leakage rather than vascular endothelial cell mitosis (29). Rat GD-VEGF is 90% identical in amino acid sequence to bovine and human VEGF/VPF. In addition to the form of human mitogen with a length equivalent to that of the rat GD-VEGF reported here, two human VEGF/VPFs have been identified that contain either a 44-amino acid deletion or a 24-residue insertion containing a highly basic region similar to the nuclear targeting regions of the PDGFs. The identification of rat VEGF in conditioned medium of a glioma indicates that this vascular endothelial cell mitogen can be associated with neural tumors and, based on its vascular permeability activity, might be capable of modulating the integrity of the blood-brain barrier in pathological and normal conditions.

1. Thomas, K. A. (1987) FASEB J. 1, 434-440.
2. Folkman, J. & Klagsbrun, M. (1987) Science 235, 442-447.
3. Burgess, W. & Maciag, T. (1989) Annu. Rev. Biochem. 58, 575-606.
4. Ishikawa, F., Miyazono, K., Hellman, U., Drexler, H., Wernstedt, C., Hagiwara, K., Usuki, K., Takaku, F., Risau, W. & Heldin, C.-H. (1989) Nature (London) 338, 557-562.
5. Conn, G., Soderman, D. D., Schaeffer, M.-T., Wile, M., Hatcher, V. B. & Thomas, K. A. (1990) Proc. Natl. Acad. Sci. USA 87, 1323-1327.
6. Ferrara, N. & Henzel, W. J. (1989) Biochem. Biophys. Res. Commun. 161, 851-858.
7. Gospodarowicz, D., Abraham, J. A. & Schilling, J. (1989) Proc. Natl. Acad. Sci. USA 86, 7311-7315.
8. Thomas, K. A., Rios-Candelore, M., Gimenez-Gallego, G., DiSalvo, J., Bennett, C., Rodkey, J. & Fitzpatrick, S. (1985) Proc. Natl. Acad. Sci. USA 82, 6409-6413.
9. Houghten, R. A. & Li, C. H. (1977) Methods Enzymol. 47, 549-559.
10. Saiki, R. K., Scharf, S., Faloona, F., Mullis, K. B., Horn, G. T., Erlich, H. A. & Arnheim, N. (1985) Science 230, 1350-1354.
11. Frohman, M. A., Dush, M. K. & Martin, G. R. (1988) Proc. Natl. Acad. Sci. USA 85, 8998-9002.
12. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
13. Needleman, S. B. & Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453.
14. Dayhoff, M. O., ed. (1978) Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Found., Washington), Vol. 5, Suppl. 3, pp. 345-358.
15. Ross, R. (1987) Annu. Rev. Med. 38, 71-79.
16. Deuel, T. F. (1987) Annu. Rev. Cell Biol. 3, 443-492.
17. Hannink, M. & Donoghue, D. J. (1989) Biochim. Biophys. Acta 989, 1-10.
18. Rao, C. D., Igarashi, H., Chiu, I.-M., Robbins, K. C. & Aaronson, S. A. (1986) Proc. Natl. Acad. Sci. USA 83, 2392-2396.
19. Bonthron, D. T., Morton, C. C., Orkin, S. H. & Collins, T. (1988) Proc. Natl. Acad. Sci. USA 85, 1492-1496.
20. Sauer, M. K., Hannink, M. & Donoghue, D. J. (1986) J. Virol. 59, 292-300.
21. Tong, B. D., Auer, D. E., Jaye, M., Kaplow, J. M., Ricca, G., McConathy, E., Drohan, W. & Deuel, T. F. (1987) Nature (London) 328, 619-621.
22. Collins, T., Bonthron, D. T. & Orkin, S. H. (1987) Nature (London) 328, 621-624.
23. Lee, B. A., Maher, D. W., Hannink, M. & Donoghue, D. J. (1987) Mol. Cell. Biol. 7, 3527-3537.
24. Maher, D. W., Lee, B. A. & Donoghue, D. J. (1989) Mol. Cell. Biol. 9, 2251-2253.
25. Kozak, M. (1986) Cell 44, 283-292.
26. von Heijne, G. (1986) Nucleic Acids Res. 14, 4683-4690.
27. Bar, R. S., Boes, M., Booth, B. A., Dake, B. L., Henley, S. & Hart, M. N. (1989) Endocrinology 124, 1841-1848.
28. Leung, D. W., Cachianes, G., Kuang, W.-J., Goeddel, D. V. & Ferrara, N. (1989) Science 246, 1306-1309.
29. Keck, P. J., Hauser, S. D., Krivi, G., Sanzo, K., Warren, T., Feder, J. & Connolly, D. T. (1989) Science 246, 1309-1312.