THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1986 by The American Society of Biological Chemists, Inc.

Vol. 261, No. 14, Issue of May 15, pp. 6600-6605, 1986 Printed in U.S.A.

Human Fibroblast Collagenase

(Received for publication, November 11, 1985)

Gregory I. Goldberg, Scott M. Wilhelm, Annemarie Kronberger, Eugene A. Bauer, Gregory A. Grant, and Arthur Z. Eisen

From the Division of Dermatology, Department of Medicine, Washington University School of Medicine, St. Louis, Missouri 63110

We have determined the complete sequence of the cDNA clone representing the full size human skin collagenase mRNA. Collagenase is synthesized in preproenzyme form, Mr 54,092, with a 19 amino acid long signal peptide. The primary secretion products of the enzyme consist of a minor glycosylated form, Mr 57,000, and a major unmodified polypeptide of predicted Mr 51,929. Proteolytic activation of human skin procollagenase results in removal of 81 amino acid residues from the amino-terminal portion of the proenzyme. Both potential N-glycosylation sites are contained within the proteolytically activated form of the enzyme. The primary structure of the coding region of the presented clone is homologous to an oncogene-induced rat protein whose function is still unknown, although preliminary observations suggest that it is not rat skin collagenase.

Collagens constitute the most abundant proteins of the extracellular matrix in mammalian organisms. Though collagen turnover is generally very slow, its metabolism intensifies dramatically concomitant with processes requiring remodeling of the connective tissue, such as uterine involution (1), bone resorption (2), and wound healing (3, 4). Enhanced collagen metabolism has also been implicated in the pathogenesis of a number of diseases which include recessive dystrophic epidermolysis bullosa (5, 6), rheumatoid arthritis (7, 8), corneal (9, 10), and gingival disease (11). The initiation of the dismantling of an existing collagen network requires specific enzymes, collagenases, which catalyze the initial step in the proteolytic degradation of collagen.

Several types of collagenase can be distinguished based on their physical properties and substrate specificity for different types of collagen. Collagenases degrading the interstitial collagens, types I, II, and III, do not cleave collagen types IV and V (12), which apparently are degraded by other proteases (13-16). The structural relationship among these functionally different collagenases remains to be determined. The interstitial collagenases from human skin (17, 18), synovium (19), gingiva (20), and monocytes (21) comprise a group of metalloendoproteases which appear to be similar, if not identical. Human granulocyte collagenase, which also degrades interstitial collagens, differs from the other interstitial collagenases immunologically (22), in substrate preference (23), and in molecular weight (15, 24), indicating that tissue differences in human interstitial collagenases exist. All these enzymes catalyze a single specific cleavage in each of the three collagen polypeptide chains, rendering the collagen fiber soluble, thermally unstable, and susceptible to attack by specific gelatinases (16, 25) and perhaps by other tissue proteases.

Collagenase from human skin fibroblasts has been purified (17, 18) and characterized enzymatically (12) in this laboratory. The proenzyme is secreted as two closely related polypeptides, 57 and 52 kDa, both of which can be activated by several different mechanisms (26, 27) to produce active enzyme. Here we present the complete primary structure of the enzyme, predicted from the sequence of a cDNA clone representing the collagenase mRNA. The data confirm our preliminary observations that the appearance of the minor 57-kDa collagenase polypeptide is the result of partial N-glycosylation of the major 52-kDa species.¹ In addition, comparative analysis of the human skin collagenase cDNA sequence reveals homology to a recently reported sequence of an unidentified rat protein (28) whose transcription can be induced by treatment with epidermal growth factor as well as by oncogene transformation.

MATERIALS AND METHODS

Cell Culture, Enzyme Purification, and Preparation of Cytoplasmic RNA: Conditioned medium from adult skin fibroblasts (WUN 80547 cell strain) was collected every 48-72 h, and procollagenase was purified as described previously (17). Collagenase protein was measured by indirect enzyme-linked immunosorbent assay (29). Fresh medium was added to 70% confluent cell cultures 24 h before harvesting for RNA isolation. Total cytoplasmic RNA from the harvested fibroblast cells was isolated as recently described (30). Poly(A)+ RNA was prepared by oligo(dT)-cellulose chromatography (31) and used in a reticulocyte lysate cell-free translation system, for Northern blot analysis, primer extension reactions, and construction of a cDNA library.

Protein Sequencing: An S-carboxymethylated preparation of human skin fibroblast collagenase was subjected to cyanogen bromide cleavage or trypsin digestion as previously described (32). Cyanogen bromide peptides were separated on a Varian 5000 HPLC using a 4.6 mm × 25 cm Beckman-Altex ODS (5 µm) reverse phase column, equilibrated in 0.05% trifluoroacetic acid, and developed with linear gradients of acetonitrile (1, 0.5, or 0.25%/min) in 0.05% trifluoroacetic

* This study was supported by National Institutes of Health Grants AM-12129, AM-19537, TO-AM07284, and in part by the Monsanto Company/Washington University Biomedical Research Agreement. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

¹ Wilhelm, S. M., Eisen, A. Z., Teter, M., Clark, S. D., Kronberger, A. M., and Goldberg, G. (1986) Proc. Natl. Acad. Sci. U. S. A., in press.

acid or in a gradient of 0.5%/min isopropanol. Peptides produced by digestion with trypsin were fractionated using the same system with similar gradients and a 3.9 mm × 30 cm Waters C-18 µBondapak column. Sequence analysis of polypeptides was performed by automated Edman degradation on either a Beckman 890C spinning cup Sequencer using a standard 0.33 M Quadrol program or on an Applied Biosystems 470 A gas phase Sequencer. The phenylthiohydantoins, after conversion from the phenylthiazolinones, were identified by reverse phase high pressure liquid chromatography on a Beckman-Altex Ultrasphere ODS-PTH column (33). Alternatively, enzyme preparations (100-300 µg) were size-fractionated on NaDodSO4-PAGE.² The proteins were electroblotted onto an activated glass fiber sheet and stained with Coomassie Blue (34). The protein bands were cut out from the blot and placed directly in the cartridge of the gas phase sequenator.

Primer Extension Reaction: Five µg of mRNA and 0.1-0.5 pmol of ³²P end-labeled synthetic oligonucleotide primer in 0.1 ml of 0.1 M NaCl, 10 mM Tris-HCl, pH 7.5, 1 mM EDTA (Buffer A) were extracted with phenol/chloroform/isoamyl alcohol (49:49:2) and precipitated with ethanol. The pellet was dried, resuspended in 5 µl of 1 M NaCl, 0.1 M Pipes, pH 6.4, 2.5 mM EDTA, and the primer annealed to mRNA for 3 h at 40 °C, a temperature slightly below the Tm for the primers SO3 and SO6. At the end of the hybridization, 5 µl of buffer containing 250 mM Tris-base, 80 mM MgCl2, 4 mM dithiothreitol was added. Then 5 µl of a 10 mM solution of each dATP, dGTP, dCTP, and dTTP, and 33 µl of H2O were added and the pH of the mixture adjusted to 8.3. Reverse transcriptase (Life Sciences) was added to obtain a concentration of 10 units/µg of mRNA. The mixture was incubated for 1 h at 42 °C, and the reaction was stopped by addition of 150 µl of Buffer A and 2 µl of 10% NaDodSO4. The reaction mixture was extracted with phenol/chloroform and RNA hydrolyzed by addition of NaOH to a final concentration of 0.5 M. After hydrolysis for 0.5 h at room temperature, the reaction mixture was neutralized and precipitated with ethanol.
The dry pellet was resuspended in formamide and electrophoresed on a denaturing 8% polyacrylamide, 8.3 M urea gel. Reaction products were isolated from the gel and sequenced using the partial chemical degradation method (35).

Construction of cDNA Library: Human skin fibroblast mRNA was used to construct a cDNA library using a modification (36) of the Okayama and Berg procedure (37). The library represents 1.5 × 10⁶ original cloning events obtained from 12 µg of mRNA. The transformants were amplified on agar plates containing 100 µg/ml of ampicillin. The colonies were scraped from the plates and grown in M9 media supplemented with 0.2% casamino acids for two generation times. The total library plasmid DNA was isolated (38) and size-fractionated on a 1% agarose gel in a supercoiled form. The fractions of supercoiled DNA migrating above the supercoiled vector were extracted from the gel and used to retransform host bacteria. Transformants obtained from the fraction containing inserts of the desired size range were plated at 10⁴ colonies/100 cm² square petri dish and screened for hybridization (39) with synthetic oligonucleotides (see hybridization conditions).

Northern Blot Analysis of RNA: A total mRNA preparation (5 µg) was fractionated on a 1.2% agarose gel containing 2.2 M formaldehyde (40) and transferred to nitrocellulose filters (41).

Hybridization Conditions: ³²P end-labeled synthetic oligonucleotides (17 bases long), 1 × 10⁶ cpm/ml, were hybridized to nitrocellulose filters for 18-36 h in a solution containing 0.9 M NaCl, 0.09 M Na citrate, 0.5% NaDodSO4 or 0.5% Nonidet P-40, 30 µg/ml of poly(A), 25 µg/ml of tRNA, 0.1% Ficoll, 0.1% bovine serum albumin, 0.1% polyvinylpyrrolidone.
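The 17-base probes were annealed at 40 °C, described as slightly below their Tm. The paper does not state how that Tm was estimated. As a rough, hedged illustration only, the sketch below applies the Wallace rule (2 °C per A or T, 4 °C per G or C), a common rule of thumb for short oligonucleotides in high-salt buffer, to the SO6 sequence as printed in Fig. 3. The function name `wallace_tm` is hypothetical, and the choice of the Wallace rule is an assumption, not the authors' stated method.

```python
# Wallace rule: Tm ~ 2 x (A + T) + 4 x (G + C), in degrees Celsius.
# Assumption: this rule of thumb is used here for illustration only;
# the paper does not say how the Tm of SO3 and SO6 was estimated.

SO6 = "CGTCTAATTTTCAATCC"  # probe SO6 as printed in Fig. 3 (17 bases)


def wallace_tm(oligo: str) -> int:
    """Return the Wallace-rule melting temperature estimate in degrees C."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if at + gc != len(oligo):
        raise ValueError("sequence must contain only A, C, G, and T")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # Prints the probe length and the estimated Tm.
    print(len(SO6), wallace_tm(SO6))
```

Under this rule SO6 (11 A/T and 6 G/C bases) gives an estimate of 46 °C, which is at least consistent with the reported 40 °C annealing temperature being slightly below the Tm. The mixed probe SO3 is not included because its degenerate positions are not fully legible in Fig. 3.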
Filters were washed 3 times in a solution of 0.45 M NaCl, 0.045 M Na citrate, 0.1% NaDodSO4 at room temperature for 5 min, once in a solution of 0.9 M NaCl, 0.09 M Na citrate, 0.1% NaDodSO4 at 40 °C for 2-5 min, and twice with 0.45 M NaCl, 0.045 M Na citrate, 0.1% NaDodSO4 at room temperature. Nick-translated plasmid DNA (1 × 10⁶ cpm/ml) was hybridized to Northern blots for 18 h at 42 °C in a solution of 50% formamide (v/v), 0.75 M NaCl, 0.075 M Na citrate, 50 mM Na phosphate, pH 6.5, 0.1% Ficoll, 0.1% bovine serum albumin, 0.1% polyvinylpyrrolidone, 0.1% Nonidet P-40, 30 µg/ml of poly(A), and 50 µg/ml of denatured Escherichia coli DNA. Filters were washed 2-3 times in 0.3 M NaCl, 0.03 M Na citrate, 0.1% Nonidet P-40 for 5 min at room temperature and then 3 times for 5 min at 50 °C in 0.03 M NaCl, 0.003 M Na citrate, 0.1% Nonidet P-40.

Cell-free Translation: 0.5-1 µg of mRNA was translated in a rabbit reticulocyte lysate cell-free system (Promega Biotec) with 5 µl of [³⁵S]methionine (1070 Ci/mmol, 15 µCi/µl, Amersham) in a 50-µl reaction mix to yield a total incorporation of 2.5 × 10⁶ cpm into trichloroacetic acid-insoluble protein. Translation products were visualized after fractionation on 10% NaDodSO4-PAGE (42) and autoradiography (43). Immunoprecipitation with collagenase-specific or nonimmune rabbit IgG was performed as described previously (44).

² The abbreviations used are: NaDodSO4-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis; Pipes, 1,4-piperazinediethanesulfonic acid.

RESULTS

Identification of the Single mRNA Species Coding for Human Skin Collagenase: Human skin collagenase, as well as collagenases from a variety of other tissues, is secreted in the form of a zymogen (see above). This primary secretion product is represented by two protein species, a minor 57-kDa form and a major 52-kDa form (17, 18). Both proteins can be activated and shown to display collagenolytic activity with similar catalytic properties (17). It has been suggested (45) that the appearance of the higher molecular weight form of rabbit synovial collagenase is a result of partial glycosylation of the single proenzyme species. Our recent observations on human skin collagenase provide evidence to support this conclusion. We have shown¹ that anticollagenase antibody reacts with both forms of the proenzyme and that the synthesis of the minor 57-kDa zymogen form is selectively inhibited in the presence of tunicamycin. When a collagenase preparation containing predominantly the minor glycosylated form of the enzyme is treated with endoglycosidase F, it is quantitatively converted into a form indistinguishable on NaDodSO4-PAGE from the unmodified (52 kDa) enzyme. The in vitro translation system, using mRNA prepared from WUN 80547 cells and reticulocyte lysate, yields a single immunoprecipitable band (Fig. 1A) of 54 kDa which does not comigrate with either of the mature procollagenase species, apparently due to the presence of an uncleaved signal peptide. We have established that the amino-terminal portions of both proteolytically activated forms of the enzyme have an

FIG. 1. Cell-free translation and Northern blot analysis of mRNA derived from human skin fibroblasts. A, one µg of mRNA was translated in a rabbit reticulocyte lysate cell-free system. [³⁵S]Methionine-labeled proteins were size-fractionated on a 10% NaDodSO4-PAGE before (lane 1) and after immunoprecipitation with collagenase-specific (lane 2) or nonimmune IgG (lane 3). The migration positions of the 57- and 52-kDa procollagenase species are indicated by arrows. The numbers on the left represent molecular weight markers (Mr × 10⁻³). B, five µg of mRNA were electrophoresed through a 1.2% agarose gel containing 2.2 M formaldehyde, blotted on a nitrocellulose filter, and hybridized to the ³²P end-labeled synthetic oligonucleotide SO3 (lane 1) or to nick-translated pCol 185.2 plasmid DNA (lane 2). Lane 1 was exposed 24 times longer than lane 2.


identical amino acid sequence. Sequence analysis of the intact S-carboxymethylated human skin procollagenase indicated that the amino terminus was blocked. Both the 57- and 52-kDa proenzyme forms can be activated by limited digestion with trypsin, generating their respective 47- and 42-kDa active enzyme forms. To determine the amino-terminal protein sequence of these polypeptides, a purified preparation of procollagenase was subjected to affinity chromatography on blue Sepharose as previously described (18). The fractions enriched for the minor 57-kDa and the major 52-kDa proenzyme species were pooled separately and subjected to trypsin activation. The activation products were then separated by NaDodSO4-PAGE, electroblotted onto glass fiber filters (see "Materials and Methods"), and subjected to amino acid sequence analysis. The amino-terminal sequence of each of these polypeptides was found to be identical (Fig. 2). The identity of the amino-terminal sequence of both the minor 47-kDa and the major 42-kDa activated enzyme species obtained after NaDodSO4-PAGE separation also suggested that purified preparations of collagenase contain no major contaminating proteins. Therefore, further analysis of the primary structure of the proenzyme protein was undertaken. Six peptides resulting from cyanogen bromide cleavage and two peptides from digestion with trypsin were purified and sequenced (Fig. 2). The cyanogen bromide peptide CN2 was reverse translated and a mixture of 32 oligonucleotides, each 17 bases long, was synthesized. The sequence of this mixed probe SO3 (Fig. 3) is complementary to an mRNA coding for the 3'

proximal portion of the CN2 peptide. Northern blot analysis of the mRNA prepared from human skin fibroblast cultures showed that the synthetic oligomer probe hybridized to a 2.5-kilobase mRNA species (Fig. 1B). To establish that this mRNA species codes for the collagenase protein, we showed that the mRNA hybridizable to SO3 also codes for the amino-terminal protein sequence shared by both enzyme forms (Fig. 2). The SO3 oligomer was 5' end-labeled, annealed to a preparation of total skin fibroblast mRNA, and the hybrid subjected to the avian myeloblastosis virus reverse transcriptase-catalyzed primer extension reaction. The single major reaction product (670 bases) was isolated from a polyacrylamide-urea denaturing gel and sequenced. The sequence of the SO3 extension confirmed the protein sequence of the upstream portion of the CN2 peptide and was used to synthesize an additional 17-base-long oligonucleotide probe, SO6 (Fig. 3), complementary to the same mRNA and positioned 180 bases upstream. When this oligomer was utilized as a primer in a similar experiment, a single 435-base-long primer extension product was isolated, sequenced, and shown to contain the coding sequence for 8 amino terminus proximal amino acids found in both activated enzyme forms. These primer extension sequencing data provide evidence that the single mRNA species coding for the collagenase-derived peptide CN2 also codes for the amino-terminal sequence of both trypsin-activated enzyme forms.

Clone pCol 185.2 Represents Complete mRNA Coding Sequence for Human Skin Collagenase: Having identified the collagenase coding mRNA, we constructed a cDNA library (36, 37) from collagenase-producing, human skin fibroblast mRNA. The library represents 1.5 × 10⁶ original cloning events, with 75-80% of the clones carrying inserts. The total library plasmid DNA was isolated and size-fractionated on a 1% agarose gel in a supercoiled form. Fractions containing inserts of at least 2 kilobases were extracted from the gel and used to retransform the host bacteria. Twenty-two transformants, hybridizing to both probes SO3 and SO6, were purified. The DNA from these clones was analyzed for the size of the 5' proximal region of the insert. Clones were selected which had the longest DNA fragment between the EcoRI site in the linker of the vector and the EcoRI site positioned 270 nucleotides downstream from the 5' end of the mRNA predicted from the sequence of the SO6 primer extension product. The clone pCol 185.2 contains a 1970-bp insert excluding the oligo(G) and poly(A) tails. Northern blot analysis (Fig. 1B) showed that this clone hybridized to the same mRNA species as probe SO3. It is of interest to note that the partial cDNA clone of rabbit synovial collagenase hybridizes to a similar size of rabbit mRNA (46).

The complete sequence of the pCol 185.2 insert has been determined and confirmed on both strands (Fig. 4). The 3' end nucleotide of the probe SO6 was positioned 435 bp downstream from the 5' end of the insert. This is in good agreement with the size of the SO6 primer extension product (435 bases), indicating that the pCol 185.2 insert represents the full, or nearly full, sequence of human collagenase mRNA. The insert consists of a 68-bp 5' untranslated leader, followed by an initiating ATG Met codon, and 1,407 nucleotides coding for a 469 amino-acid-long preprocollagenase protein of Mr 54,092. The coding sequence is followed by two TGA termination codons positioned in frame. The 3' untranslated region includes 492 bp between the first termination codon and the start of the poly(A) tail, with a putative poly(A) addition signal 463 bp downstream from the end of the coding sequence. The sequence surrounding the initiating codon is in agree-

CN1 KQXRCGVPDVAQFVLTEGNPXWEQT 88-112
CN2 ISFVRCDHRDNSPFDGPCGNLAHAFQPGPGIGXDA~ -EXXTNNFR 161-208
CN3 YPSYTFSGDVQLAQDXIDGIQAIYG 237-261
CN4
CN5 XTNPFYPEVELHFISVFWPQLPNGLEAAYEFADXDEVXFFX
1 GNKYWAVQCQNVLHGYP
FFHCTRQYKFDP CN6
LTFTK TP1
S TP2
TP47,42

FIG. 2. Amino acid sequence of the peptides derived from human skin procollagenase. Peptides derived by cyanogen bromide cleavage (peptides CN 1-6) or trypsin digestion (peptides TP 1, 2) of a purified preparation of procollagenase were separated and sequenced. The protein sequence of the amino-terminal portion of the 47- and 42-kDa proteolytically activated collagenase species (peptides TP 47, 42) was obtained by sequencing the individual forms after separation by NaDodSO4-PAGE and electroblotting onto an activated glass fiber sheet (34). The numbers represent the positions of the amino- and carboxy-terminal amino acids of each peptide in the collagenase protein according to the nomenclature in Figs. 4 and 5. The unidentified amino acid residues are designated as X. The underlined amino acid sequence of peptide CN2 was reverse-translated to determine the sequence of the synthetic oligonucleotide SO3.

SO3 TKATCTTCATCAAAAT 671-655
    C CG G G
SO6 CGTCTAATTTTCAATCC 434-418

FIG. 3. Nucleotide sequence of the synthetic oligonucleotide probes. The synthetic oligonucleotides were synthesized using an Applied Biosystems DNA synthesizer. The sequence of probe SO3 was predicted by reverse translation of the amino acid sequence of peptide CN2 (Fig. 2). The sequence of probe SO6 was obtained from the sequence of the SO3 primer extension product. The sequence of each probe is complementary to the coding strand sequence of the clone pCol 185.2 at the designated positions (Fig. 4).


ATATTGGAGCAGCAAGAGGCTGGGAAGCCATCACTTACCTTGCACTGAGAAAGAAGACAAAGGCCAGT


~ T ~ C ~ C A G C T T T C C T C C A C T G C T G C ~ G C T G C T G C T G T T C T G G G G T G T G G T G T C T C A C A G C T T C C C A G C G A C T C T A G A A A C A C A A G A G C A A G A T G T G167 GACTTA M H S F P P L L L L L F W G V V S H S F P A T L E T Q E Q D V O L GTCCAGAAATACCTGGAAAAATACTACAACCTGAAGAATGATGGGAGGCAAGTTGAAAAGCGGAGAAATAGTGGCCCAGTGGTTGAAAAATTGAAGCAA V Q K V L E K V Y N L K N D G R Q V E K R R N S G P V V E K L K Q

266

ATGCAGGAATTCTTTGGGCTGAAAGTGACTGGGAAACCAGATGCTGAAACCCTGAAGGTGATGAAGCAGCCCAGATGTGGAGTGCCTGATGTGGCTCAG M Q E F F G L K v T G K P D A E T L K V M K Q P R & G V P D V A Q

365

TTTG:CCTCACTGAGGGAAACCCTCGCTGGGACCAAACACATCTGAGGTACAGGATTGAAAATTACACGCCAGATTTGCCAAGAGCAGATGTGGACCAT F V L T E G N P R W E Q T H L R V R I E ~ ~ P D L P R A D V

D

464 H

~CCATTGAGAAAGCCTTCCAACTCTGGAGTAATGTCACACCTCTGACATTCACCAAGGTCTCTGAGGGTCAAGCAGAC~TCATGATATCTTTTGTCAGG 5 6 3 A I E K A F Q L W S ~ [ P L T F T K V S E G Q A D I M I S F V R

FIG. 4. Nucleotide sequence of human skin collagenase cDNA. The sequence was determined and confirmed on both strands using the partial chemical degradation method (35). The predicted amino acid sequence of human skin collagenase is shown under the DNA sequence. The putative site of the signal peptide cleavage is shown (arrow). The amino terminus of the proteolytically activated enzyme form is indicated by a star. Potential N-glycosylation sites are designated by boxes. The 3 cysteine residues and the potential poly(A) addition signal sequence are underlined.

662

GGAGATCATCGGGACAACTCTCCTTTTGATGGACCTGGAGGAAATCTTGCTCATGCTTTTCAACCAGGCCCAGGTATTGGAGGGGATGCTCATTTTGAT G D H R D N S P F D G P G G N L A H A F Q P G P G I G G D A H F

O 761

GAAGATGAAAGGTGGACCAACAATTTCAGAGAGTACAACTTACATCGTGTTGCGGCTCATGAACTCGGCCATTCTCTTGGACTCTCCCATTCTACTGAT E O E R W T N N F R E V N L ~ R V A A H E L G H S L G L S H S T D

860 ATCGGGGCTTTGATGTACCCTAGCTACACCTTCAGTGGTGATGTTCAGCTAGCTCAGGATGACATTGATGGCATCCAAGCCATATATGGACGTTCCCAA

I G A L M Y P S V T F S G D V Q L A Q D D I D G I Q A I V G R S Q

959

AATCCTGTCCAGCCCATCGGCCCACAAACCCCAAAAGCGTGTGACAGTAAGCTAACCTTTGATGCTATAACTACGATTCGGGGAGAAGTGATGTTCTTT N P V O P I G P Q T P K A ~ D S K L T F O A I T T I R G E V M F

F

A A A G A C A G A T T C T A C A T G C G C A C A A A T C C C T T C T A C C C G G A A G T T G A G C T C A A T T T C A T T T C T G T T T T C T G G C C A C A A C T G C C A A A T G G G C T T G A A G C T1 0 5 8 K D R F V M R T N P F V P E V E L N F I S V F W P Q L P N G L E A 1151

GCTTACGAATTTGCCGACAGAGATGAAGTCCGGTTTTTCAAAGGGAATAAGTACTGGGCTGTTCAGGGACAGAATGTGCTACACGGATACCCCAAGGAC

A Y E F A D R O E V R F F K G N K V W A V Q G Q N V L H G V P K D

ATCTACAGCTCCTTTGGCTTCCCTAGAACTGTGAAGCATATCGATGCTGCTCTTTCTGAGGAAAACACTGGAAAAACCTACTTCTTTGTTGCTAACAAA I V S S F G F P R T V K H I D A A L S E E N T G K T V F F V A N K

1256

T A C T G G A G G T A T G A T G A A T A T A A A C G A T C T A T G G I T C C C A A A G T T G A T G C A G T T V W R V D E V K R S M D P S V P K M I A H D F P G I G H K V D A V

1355

TTCATGAAAGATGGATTTTTCTATTTCTTTCATGGAACAAGACAATACAAATTTGATCCTAAAACG~AGAGAATTTTGACTCTCCAG~AAGCTAATAGC F M K D G F F V F F H G T R Q V K F D P K T K R I L T L Q K A N S

1454

TGGTTCAACTGCAGGAAAAATTGAACATTACTAATTTGAATGGAAAACACATGGTGTGAGTCCAAAGAAGG~GTTTTCCTGAAGAACTGTCTATTTTCT W F N L R K N

1553

CAGTCATTTTTAACCTCTAGAGTCACTGATACACAGAATA~AATCTTATTTATACCTCAGTTTGCATATTTTTTTACTATTTAGAATGTAGCCCTTTTT

1652

GTACTGATATAATTTAGTTCCACAAATGGTGGGTACAAAAAGTCAAGTTTGTGGCTTATGGATTCATATAGGCCAGAGTTGCAAAGATCTTTTCCAGAG

1751


TATGLAACTCTGACGTTGATCCCAGAGA~CAGCTTCAGTGACAAACATATCCTTTCAAGACAGAAAGAGACAGGAGA~ATGAGTCTTTGCCGGAGGAAA

1850

AGCAGCTCAAGAACACATGTGCAGTCACTGGTGTCACCCTAGATAGGCAAGGGATAACTCTTCTAACACAAAATAAGTGTTTTATGTTTGG-GT

1949

CAACCTTGTTTCTACTGTTTT

1970

ment with the PurNNATGNPur initiation consensus sequence (47). The stretch of 19 amino acids immediately following the initiating Met constitutes a typical hydrophobic core of the signal peptide (48). Although the precise position of the amino terminus of the mature protein is unknown, the hydropathicity plot (48), in combination with signal peptide cleavage patterns (49), allows one to predict that the cleavage of the signal peptide occurs after Ser at position 19. The mature collagenase proenzyme then has a predicted Mr 51,929. When the proenzyme is subjected to limited digestion with trypsin, several inactive intermediates can be detected (27). The precise molecular structure of these intermediates remains to be determined. However, the completely activated enzyme is apparently the result of the removal of 81 amino acids from the amino terminus of the mature proenzyme. This activated form of collagenase has a predicted Mr 42,570, which is in good agreement with the experimental value (18). The proenzyme contains 3 cysteine residues at positions 92, 278, and 466. The cysteine at position 92 is located 8 amino acids upstream from the amino terminus of the trypsin-activated enzyme and is, therefore, removed upon proteolytic activation of the collagenase. Two possible N-glycosylation sites (Asn120, Asn143) are contained within the trypsin-activated Mr 42,570 enzyme species.

Human Skin Collagenase Is Related to an Unidentified Oncogene-inducible Rat Protein: The comparison of the pCol 185.2 cDNA sequence with the GenBank® nucleic acid sequence data base did not reveal any substantial homologies. A recently reported sequence of mRNA from rat skin fibroblasts (28), however, shares extensive homology with the coding sequence of the pCol 185.2 cDNA clone. The alignment of the protein sequences predicted from these clones is presented (Fig. 5). The overall amino acid homology of the two proteins

MHSFPPLLLLLFWGVVSHSFPATLETQEQDVDLVQKYLEKYYNLKNDGRQ KGL V W CT--A S Y LHGSEEDAGKEVL N G EK VK

50

VEKRRNSCPVVEKLKQMQEFFGLKVTGKPDAETLKVMKQPRCGVPDVAQF FTKKD S K IQE K L M L SNMEL HX GG

100

VLTEGNPRWEQTHLRYRIENYTPDLPRADVDHAIEKAFQLWSNVTPLTFT STFP S K RKN IS V L ES S R LKVEE S

150

200 KVSEGQADIMISFVRCDHRDNSPFDCPGGNLAHAFQPGPGIGGDAHFDED

RI

E

AVEE G FI

MY

YA

TN

D

ERWTNNFREYNLHRVAAHELGHSLGLSHSTDIGALMYPSYTFSGDVQ---247 DDVTGT FL F F AWAE V KS LARFH T LAQDDIDCIQAIYCRSQNPVQP--------- IGPQTPKACDSKLTFDAIT S V SL PPTESPDVLVVPTKSNSLD E LPM S A S VS

288

TIRGEVMFFKDRFYMRTNPFYPEVELNFISVFWPQLPNGLEAAYEFADRD LL H F W KSLRT PGFYL S S SNMD VTN

338

EVRFFKGNKYUAVQGQNVLHGYPKDIYSSFGFPRTVKHIDAALSEENTGK 388

T FIL QI

IR BEE A

S HTL- L E

QK

I LKWK

TYFFVANKYWRYDEYKRSMDPSYPKMIAHDFPGIGHKVDAVFMKDGFFYF E D F F K Q EFRK EN T EAF L

438

FHGTRQYKFDPKTKRILTLQKANswFNcRKN S SS LENAGKVTHIL S "-

469

FIG. 5. Comparison of the amino acid sequences of human skin collagenase and the oncogene transformation-induced rat protein (28). The top line represents the amino acid sequence of human skin collagenase as predicted from the cDNA sequence (Fig. 4). The bottom line represents the predicted sequence of the homologous rat protein (28). Only the amino acids differing from human collagenase are shown at the corresponding positions. Four gaps are introduced as indicated to maximize the homology. A stop codon is encountered in the rat cDNA at a position corresponding to Arg"' in human collagenase.
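The two potential N-glycosylation sites named in the text (Asn120 and Asn143) can be located by scanning the human sequence for the Asn-X-Ser/Thr sequon, where X is any residue except Pro. The following is a minimal sketch, not the authors' procedure. It uses only residues 1-150, transcribed from the top line of Fig. 5, and it assumes Gly at position 58 as in the translation printed under the DNA sequence in Fig. 4. The names `N_TERMINAL_150` and `find_sequons` are hypothetical.

```python
import re

# Residues 1-150 of the predicted preprocollagenase, transcribed from Fig. 5.
# Assumption: position 58 is taken as Gly, following the translation in Fig. 4.
N_TERMINAL_150 = (
    "MHSFPPLLLLLFWGVVSHSFPATLETQEQDVDLVQKYLEKYYNLKNDGRQ"
    "VEKRRNSGPVVEKLKQMQEFFGLKVTGKPDAETLKVMKQPRCGVPDVAQF"
    "VLTEGNPRWEQTHLRYRIENYTPDLPRADVDHAIEKAFQLWSNVTPLTFT"
)

SIGNAL_PEPTIDE = 19      # residues removed on secretion (from the text)
ACTIVATION_PEPTIDE = 81  # residues removed on trypsin activation (from the text)


def find_sequons(seq: str) -> list[int]:
    """Return 1-based positions of Asn in Asn-X-Ser/Thr sequons (X is not Pro)."""
    # The lookahead lets overlapping candidate sites be reported independently.
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]


if __name__ == "__main__":
    sites = find_sequons(N_TERMINAL_150)
    removed = SIGNAL_PEPTIDE + ACTIVATION_PEPTIDE
    for pos in sites:
        # A site survives activation if it lies beyond the removed residues.
        print(pos, "retained after activation:", pos > removed)
```

Within this 150-residue fragment the scan reports positions 120 and 143, both beyond the 100 residues removed by signal peptide cleavage and trypsin activation, in line with the statement that both sites are contained within the activated enzyme.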


is 48%. The sequence proximal to the amino termini of both proteins is poorly conserved. The longest highly conserved region (positions 90-261) has 60.8% homology. The region between amino acids 261 and 288 is significantly divergent, and includes a 9 amino acid insertion. The carboxy terminus proximal region of the proteins shares 46.4% homology over a length of 181 amino acids (positions 288-469). The rat protein contains a single potential N-glycosylation site. This site is in alignment with one of the sites (Asn120) in the collagenase protein. Three out of 4 cysteine residues in the rat protein are conserved in human collagenase. The comparison of nucleic acid sequences of these cDNA clones in the coding region (data not shown) is in good agreement with the alignment presented in Fig. 5. The 5' untranslated regions are of similar size (58 bp rat, 68 bp human) and show no significant conservation of sequence. The 3' untranslated region of human collagenase mRNA (469 bp) is longer than that of the rat (289 bp) and is significantly more divergent (39.5% homology) than the coding region.

The following additional observations on the structure of the rat protein are of interest. The analysis of the hydropathicity index of the rat protein shows that its first 17 amino acids may constitute a hydrophobic core of the signal peptide, suggesting the possibility that the protein may be secreted. The two-dimensional analysis of the secondary structures (50) of both the human and rat proteins displays considerable similarity, except that the area of the 9 amino acid insertion in the rat protein (Fig. 5) introduces two additional high probability β turns into this molecule. The function of this oncogene-induced rat protein has not been identified. In spite of the extensive homology between these two proteins, the partial primary sequence of rat skin collagenase³ indicates that the rat cDNA clone does not correspond to this enzyme.
This observation does not exclude the possibility that the rat protein is analogous to another type of human collagenase.

DISCUSSION

We have presented the primary structure of the cDNA clone pCol 185.2 representing the mRNA coding for human skin collagenase. The identification of the clone was based on its co-linearity with a single mRNA species coding for both the sequence of the peptide isolated from the purified collagenase preparation and the sequence of the amino termini of both the 47- and 42-kDa activated enzyme forms. The clone has coding capacity for all the other peptides isolated and sequenced from the purified collagenase preparation (Fig. 2).

The sequence of the clone represents the full, or nearly full, sequence of collagenase mRNA, since the sizes of the primer extension products using both probes SO3 and SO6 are in good agreement with the distance of their sequences from the 5' end of the cDNA insert. The 68 bp of the 5' untranslated region is followed by the initiating Met codon. The open translation frame extends for 1,407 bp, coding for 469 amino acids of the preprocollagenase protein with Mr 54,092. The analysis of the hydropathicity index (48), together with known signal peptide cleavage patterns, predicts that the first 19 amino acids constitute a typical signal peptide and allows us to postulate that the mature proenzyme protein begins with Phe at position 20. This proenzyme has a predicted Mr 51,929.

The following observations provide sufficient evidence to conclude that the appearance of the minor 57-kDa proenzyme species is the result of a partial post-translational glycosylation, probably through addition of N-linked oligosaccharides. The 47- and 42-kDa products of the proteolytic activation of the 57- and 52-kDa procollagenase species have an identical amino-terminal sequence of 8 amino acids. The synthesis of the 57-kDa polypeptide is completely inhibited in the presence of tunicamycin. Treatment of an enzyme preparation containing predominantly the 47-kDa activated collagenase with endoglycosidase F leads to a quantitative conversion of this molecule to the 42-kDa enzyme species. This is in good agreement with the fact that both potential N-glycosylation sites (including Asn120) are downstream from the amino terminus of the proteolytically activated enzyme form.

The procollagenase polypeptide contains 3 cysteine residues (positions 92, 278, and 466), one of which is removed during proteolytic activation of the enzyme. These residues are likely to participate in the formation of the secondary structure of the enzyme, since both proenzyme and proteolytically activated forms show a lower apparent molecular weight on NaDodSO4-PAGE in the absence of a reducing agent. In addition, the enzyme can be inactivated in the presence of dithiothreitol (data not shown). Procollagenase can be activated by a variety of pathways (17, 26, 27). Although proteolysis results in the removal of 81 amino acids from the amino terminus of the molecule, other agents may activate the proenzyme without apparent loss of molecular weight (26). The role of the formation of alternative disulfide bridges in the conformational changes accompanying this process can now be addressed.

The activity of the fibroblast type interstitial collagenase is regulated at a variety of levels including proenzyme activation (26, 27), interaction of active enzyme with specific inhibitors (51, 52), and de novo synthesis (53). Each of these mechanisms may exist in vivo and reflect a necessity for temporal regulation of enzyme activity of different amplitudes in response to alterations in the extracellular matrix.

³ J. J. Jeffrey, personal communication.
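The positional bookkeeping in the preceding paragraphs (a 1,407-bp open reading frame, a 19-residue signal peptide, removal of 81 residues on activation, and cysteines at positions 92, 278, and 466) can be checked with a short Python sketch. This is only an illustration, not the authors' procedure. It assumes that residues are numbered from the initiating Met as position 1 and that the stated reading frame length excludes the stop codon. The function names, the 7-residue hydropathy window, and the toy peptide are hypothetical choices made for the example. The hydropathy values are those of the Kyte and Doolittle scale (48).

```python
import re

# Kyte-Doolittle hydropathy values (ref. 48 in the text).
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def residues_from_orf(orf_bp):
    """Number of codons in an open reading frame of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3


def first_residue_after_removal(first_residue, n_removed):
    """1-based position of the new amino terminus after n_removed
    residues are cleaved from a chain starting at first_residue."""
    return first_residue + n_removed


def find_sequons(protein):
    """1-based positions of Asn in Asn-X-Ser/Thr sequons (X is not Pro).
    The lookahead lets overlapping sequons be reported."""
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", protein)]


def hydropathy_profile(protein, window=7):
    """Mean Kyte-Doolittle hydropathy for each full sliding window.
    The window size of 7 is an illustrative assumption."""
    scores = [KD_SCALE[aa] for aa in protein]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


if __name__ == "__main__":
    # Numbers taken from the text; numbering from Met = 1 is assumed.
    total = residues_from_orf(1407)
    pro_start = first_residue_after_removal(1, 19)       # signal peptide
    active_start = first_residue_after_removal(pro_start, 81)
    cysteines = [92, 278, 466]
    lost = [c for c in cysteines if c < active_start]
    print(total, pro_start, active_start, lost)
    # Hypothetical toy peptide, not the collagenase sequence.
    print(find_sequons("MNASQNPTANGT"))
```

Under these assumptions the sketch gives 469 residues, a proenzyme starting at position 20, and an activated enzyme starting at position 101. That places Cys92 in the removed segment and Asn120 downstream of the new amino terminus, consistent with the statements above.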
The level of the constitutive expression of the collagenase gene is tissue specific and may reflect the rate of collagen turnover in these tissues. Phorbol ester treatment of bovine endothelial cells induces collagenase production and stimulates the cells to invade three-dimensional collagen lattices (54). This in vitro phenomenon may reflect the important role of collagenase during angiogenesis, when the invasion of perivascular extracellular matrix by endothelial cells is a crucial event. Failure to regulate the activity of collagenase may lead to an alteration in the extracellular matrix and constitute a major defect in diseases such as recessive dystrophic epidermolysis bullosa and rheumatoid arthritis, and in other phenomena such as tumor invasion, angiogenesis, and wound healing.

A variety of macromolecules, including other types of collagenase, are involved in the mechanisms of interaction between the extracellular matrix and cell metabolic response. In that respect, the described homology between the oncogene-induced rat protein (28) and human skin collagenase may provide an important clue in understanding these interactions. Finally, the regulation and genomic arrangement of human skin collagenase, and its evolutionary relationship to other vertebrate collagenases and related members of this gene family, remain to be elucidated.

Acknowledgments-We thank S. Adams, M. Day, and R. Wiegand, Monsanto Corporation, for synthesis of the probe SO3 and for providing the protocol for the primer extension reaction. We also thank Dr. Richard Breathnach and his colleagues for providing us with their manuscript (Ref. 28) prior to publication, and Dr. George Stricklin and James Sacchettini for their assistance with peptide isolation for the amino acid sequence analysis. We gratefully acknowledge the excellent technical assistance of Keith Goldstein, Barry Marmer, and William Roswit. We also thank Linda Merlotti and Rosemarie Brannan for their assistance in preparation of this manuscript.

REFERENCES

1. Roswit, W. T., Halme, J., and Jeffrey, J. J. (1983) Arch. Biochem. Biophys. 225, 285-295
2. Vaes, G. (1972) Biochem. J. 126, 275-289
3. Grillo, H. C., and Gross, J. (1967) Dev. Biol. 15, 300-317
4. Eisen, A. Z. (1969) J. Invest. Dermatol. 52, 449-453
5. Bauer, E. A., and Eisen, A. Z. (1978) J. Exp. Med. 148, 1378-1387
6. Kronberger, A., Valle, K. J., Eisen, A. Z., and Bauer, E. A. (1982) J. Invest. Dermatol. 79, 208-211
7. Harris, E. D., Jr., Faulkner, C. S., and Brown, F. E. (1975) Clin. Orthop. Relat. Res. 110, 303-316
8. Dayer, J-M., Russell, R. G., and Krane, S. M. (1977) Science 195, 181-183
9. Brown, S. I., and Weller, C. A. (1971) Trans. Am. Acad. Ophthalmol. Otolaryngol. 74, 375-382
10. Gordon, J., Bauer, E. A., and Eisen, A. Z. (1980) Arch. Ophthalmol. 98, 341-345
11. Birkedal-Hansen, H. (1980) in Collagenase in Normal and Pathological Connective Tissues (Woolley, D. E., and Evanson, J. M., eds) pp. 128-134, J. Wiley and Sons, New York
12. Welgus, H. G., Jeffrey, J. J., and Eisen, A. Z. (1981) J. Biol. Chem. 256, 9511-9515
13. Salo, T., Liotta, L. A., and Tryggvason, K. (1983) J. Biol. Chem. 258, 3058-3063
14. Liotta, L. A., Lanzer, W. L., and Garbisa, S. (1982) Biochem. Biophys. Res. Commun. 98, 184-190
15. Murphy, G., Reynolds, J. J., Bretz, U., and Baggiolini, M. (1982) Biochem. J. 203, 209-221
16. Hibbs, M., Hasty, K. A., Seyer, J. M., Kang, A. H., and Mainardi, C. M. (1985) J. Biol. Chem. 260, 2493-2500
17. Stricklin, G. P., Bauer, E. A., Jeffrey, J. J., and Eisen, A. Z. (1977) Biochemistry 16, 1607-1615
18. Stricklin, G. P., Eisen, A. Z., Bauer, E. A., and Jeffrey, J. J. (1978) Biochemistry 17, 2331-2337
19. Evanson, J. M., Jeffrey, J. J., and Krane, S. M. (1968) J. Clin. Invest. 47, 2639-2681
20. Wilhelm, S. M., Javed, T., and Miller, R. L. (1984) Collagen Rel. Res. 4, 129-152
21. Welgus, H. G., Campbell, E. J., Bar-Shavit, Z., Senior, R. M., and Teitelbaum, S. L. (1985) J. Clin. Invest. 76, 219-224
22. Hasty, K. A., Hibbs, M. S., Kang, A. H., and Mainardi, C. L. (1984) J. Exp. Med. 159, 1455-1463
23. Horwitz, A. L., Hance, A. J., and Crystal, R. G. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 897-901
24. Macartney, H. W., and Tschesche, H. (1983) Eur. J. Biochem. 130, 71-78
25. Seltzer, J. L., Adams, S. A., Grant, G. A., and Eisen, A. Z. (1981) J. Biol. Chem. 256, 4662-4668
26. Tyree, B., Seltzer, J. L., Halme, J., Jeffrey, J. J., and Eisen, A. Z. (1981) Arch. Biochem. Biophys. 208, 440-443
27. Stricklin, G. P., Jeffrey, J. J., Roswit, W. T., and Eisen, A. Z. (1983) Biochemistry 22, 61-68
28. Matrisian, L. M., Glaichenhaus, N., Gesnel, M-C., and Breathnach, R. (1985) EMBO (Eur. Mol. Biol. Organ.) J. 4, 1435-1440
29. Cooper, T. W., Bauer, E. A., and Eisen, A. Z. (1982) Collagen Rel. Res. 3, 205-216
30. Sperling, R., Sperling, J., Levine, A. D., Spann, P., Stark, G. R., and Kornberg, R. D. (1985) Mol. Cell. Biol. 5, 569-575
31. Aviv, H., and Leder, P. (1972) Proc. Natl. Acad. Sci. U. S. A. 69, 1408-1412
32. Grant, G. A., Henderson, K. O., Eisen, A. Z., and Bradshaw, R. A. (1980) Biochemistry 19, 4653-4659
33. Grant, G. A., Sacchettini, J. C., and Welgus, H. G. (1983) Biochemistry 22, 354-358
34. Aebersold, R., Teplow, D., Hood, L., and Kent, S. J. (1986) J. Biol. Chem. 261, 4229-4238
35. Maxam, A., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 560-564
36. Alexander, D. C., McKnight, T. D., and Williams, B. G. (1984) Gene 31, 79-89
37. Okayama, H., and Berg, P. (1983) Mol. Cell. Biol. 3, 280-289
38. Kahn, M., Kolter, R., Thomas, C., Figurski, D., Meyer, R., Remaut, E., and Helinski, D. R. (1979) Methods Enzymol. 68, 268-280
39. Hanahan, D., and Meselson, M. (1980) Gene 10, 63-67
40. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, pp. 202-203, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
41. Thomas, P. S. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 5201-5205
42. Laemmli, U. K. (1970) Nature 227, 680-685
43. Bonner, W. M., and Laskey, R. A. (1974) Eur. J. Biochem. 46, 83-88
44. Clark, S. D., Wilhelm, S. M., Stricklin, G. P., and Welgus, H. G. (1985) Arch. Biochem. Biophys. 241, 36-44
45. Nagase, H., Brinckerhoff, C. E., Vater, C. A., and Harris, E. D., Jr. (1983) Biochem. J. 214, 281-288
46. Gross, R. H., Sheldon, L. A., Fletcher, C. F., and Brinckerhoff, C. E. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 1981-1985
47. Kozak, M. (1983) Microbiol. Rev. 47, 1-15
48. Kyte, J., and Doolittle, R. F. (1982) J. Mol. Biol. 157, 105-132
49. von Heijne, G. (1983) Eur. J. Biochem. 133, 17-21
50. Chou, P. Y., and Fasman, G. D. (1978) Annu. Rev. Biochem. 47, 251-276
51. Welgus, H. G., Stricklin, G. P., Eisen, A. Z., Bauer, E. A., Cooney, R. V., and Jeffrey, J. J. (1979) J. Biol. Chem. 254, 1938-1943
52. Welgus, H. G., and Stricklin, G. P. (1983) J. Biol. Chem. 258, 12259-12264
53. Valle, K.-J., and Bauer, E. A. (1979) J. Biol. Chem. 254, 10115-10122
54. Montesano, R., and Orci, L. (1985) Cell 42, 469-477