Identification of an Alternatively Spliced Seprase mRNA That Encodes ...

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 2000 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 275, No. 4, Issue of January 28, pp. 2554 –2559, 2000 Printed in U.S.A.

Identification of an Alternatively Spliced Seprase mRNA That Encodes a Novel Intracellular Isoform* (Received for publication, August 23, 1999, and in revised form, October 22, 1999)

Leslie A. Goldstein and Wen-Tien Chen‡ From the Department of Medicine, Division of Medical Oncology, State University of New York, Stony Brook, New York 11794-8160

Seprase is a homodimeric 170-kDa integral membrane gelatinase that is related to the ectoenzyme dipeptidyl peptidase IV. We have identified an alternatively spliced seprase messenger from the human melanoma cell line LOX that encodes a novel truncated isoform, seprase-s. The splice variant mRNA is generated by an out-of-frame deletion of a 1223-base pair exonic region that encodes part of the cytoplasmic tail, transmembrane, and the membrane proximal-central regions of the extracellular domain (Val5 through Ser412) of the seprase 97-kDa subunit (seprase-l). The seprase-s mRNA has an elongated 5′ leader (548 nucleotides) that harbors at least two upstream open reading frames that inhibit seprase-s expression from a downstream major open reading frame. Deletion mutagenesis of the wild type splice variant cDNA confirms that initiation of the seprase-s coding sequence begins with an ATG codon that corresponds to Met522 of seprase-l. The seprase-s open reading frame encodes a 239-amino acid polypeptide with an Mr ~ 27,000 that precisely overlaps the carboxyl-terminal catalytic region of seprase-l.

* This work was supported by United States Public Health Service Grant R01 CA-39077 and the Susan G. Komen Breast Cancer Foundation. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) AF007822.

‡ To whom correspondence should be addressed: Dept. of Medicine, Division of Medical Oncology, HSC T-17, Rm. 080, State University of New York, Stony Brook, NY 11794-8160. Tel.: 516-444-6948; Fax: 516-444-2493; E-mail: [email protected].

Proteolytic degradation of the extracellular matrix is a fundamental property of normal tissue remodeling and repair as well as the pathological processes of tumor invasion and metastasis. In addition to the families of proteolytic enzymes that serve as the major collagenases and gelatinases, such as the matrix metalloproteases (1), a subfamily of membrane-bound nonclassical serine proteases, including seprase and dipeptidyl peptidase IV (CD26), is implicated in matrix degradation and invasiveness of migratory cells (2–6). Seprase is a homodimeric 170-kDa integral membrane gelatinase whose expression appears to correlate with the levels of invasiveness manifested by the human melanoma cell line, LOX, in an in vitro extracellular matrix degradation/invasion assay (7). The deduced amino acid sequence of its 97-kDa subunit (seprase-l, GenBank™ accession number U76833) predicts a type II membrane topology with a short cytoplasmic tail (6 amino acids) followed by a transmembrane region (20 amino acids) and a large extracellular domain (734 amino acids) (8). Its catalytic triad residues, Ser624, Asp702, and His734, are contained within a ~200-amino acid region located in the carboxyl terminus of each subunit. However, seprase requires the dimerization of its inactive subunits for activity (8, 9). Comparisons of their deduced amino acid sequences indicate that seprase is essentially identical to human fibroblast activation protein α (FAPα¹; GenBank™ accession number U09278), which is expressed on reactive stromal fibroblasts of various carcinomas and on fibroblasts of healing wounds (10, 11). Additionally, seprase exhibits a striking sequence homology (52%) to the ectoenzyme dipeptidyl peptidase IV (GenBank™ accession number M74777), which increases to a 68% amino acid identity between their catalytic regions (8).

Alternative RNA splicing allows for the diversification of the protein products of a single gene not only in terms of their structure but possibly their function and/or cellular localization. Interestingly, several genes that encode proteases associated with tumor invasion and metastasis undergo post-transcriptional RNA splicing. For example, splice variants with altered 5′- and/or 3′-untranslated regions have been reported for cathepsins B (12) and L (13), and a variant that encodes a truncated cytoplasmic isoform of cathepsin B has also been described (14). Transcription variants have also been identified that encode meprin β′ (15) and a soluble form of membrane type 3-matrix metalloprotease (16, 17). Also, the gene that encodes the murine homolog of FAPα, mFAP (GenBank™ accession number Y10007), is reported to generate two splice variants that encode altered isoforms of the membrane-bound protease (18).

Functional eukaryotic mRNAs that have one or more AUG codons within their 5′ leader sequences are relatively rare in nature (19, 20).
However, some proto-oncogenes, genes that control cellular growth and differentiation, and viral genes give rise to mRNAs that possess one or more short upstream open reading frames (uORFs) or minicistrons in their 5′ leaders that do not overlap the downstream major ORF (20, 21). Numerous reports indicate that uORFs can function as cis-acting regulatory elements that significantly inhibit the expression of their cognate downstream major ORFs (22–43).

Here, we report the identification of an alternatively spliced seprase mRNA from LOX cells that is generated by the utilization of suboptimal exonic 5′ and 3′ splice sites in its pre-mRNA. The resulting messenger is polycistronic; it harbors at least two uORFs in its 5′ leader region that inhibit the expression from a downstream ORF of seprase-s, a truncated isoform of seprase that is identical to the catalytic region of seprase-l.

¹ The abbreviations used are: FAPα, fibroblast activation protein α; uORF, upstream open reading frame; seprase-s, seprase-short; seprase-l, seprase-long; HUVSMC, human umbilical vein smooth muscle cells; Ab, antibody; CAT, chloramphenicol acetyltransferase; RT-PCR, reverse transcription-polymerase chain reaction; nt, nucleotides; bp, base pair(s); kb, kilobase(s); PAGE, polyacrylamide gel electrophoresis.


This paper is available on line at http://www.jbc.org

Molecular Cloning of a New Seprase Isoform

EXPERIMENTAL PROCEDURES

Cells and Reagents—The human amelanotic melanoma cell line LOX was obtained from Dr. O. Fodstad, Institute for Cancer Research, The Norwegian Radium Hospital, Oslo, Norway. The human breast carcinoma cell line MDA-MB-436, the human melanotic melanoma cell line SKMEL28, the human embryonic lung fibroblast line WI-38, and the monkey kidney cell line COS-7 were all purchased from American Type Culture Collection. Human umbilical vein smooth muscle cells (HUVSMC) and total RNA were obtained from Dr. S. Steve Okada (Georgetown University). Total RNA from the melanotic melanoma cell line RPMI7951 was obtained from Dr. H. Nakahara (Georgetown University). Superscript II RNase H− reverse transcriptase and recombinant Taq polymerase were from Life Technologies, Inc. Premixed deoxynucleotides were obtained from Roche Molecular Biochemicals. The mammalian expression vector pCR3.1 was purchased from Invitrogen, and the expression plasmid pCAT3-control vector was purchased from Promega. Unconjugated rabbit anti-CAT polyclonal Ab was purchased from 5 Prime → 3 Prime, Inc. Alkaline phosphatase-conjugated anti-rabbit polyclonal Ab was from Rockland. Immun-Star chemiluminescent substrate was obtained from Bio-Rad. Amplify, Hyperfilm, and L-[4,5-³H]leucine (136 Ci/mmol) were obtained from Amersham Pharmacia Biotech. Immobilon polyvinylidene difluoride transfer membranes were from Millipore. Human glyceraldehyde-3-phosphate dehydrogenase amplimer set was from CLONTECH.

RT-PCR—Isolation of total RNA followed by reverse transcription was carried out as described previously (9). Oligonucleotide primers were synthesized that correspond to the following nucleotide (nt) positions of the FAPα cDNA sequence: FAP 1 (5′-CCACGCTCTGAAGACAGAATT-3′ (nt 161–181; sense)); FAP 3 (5′-CCAGCAATGATAGCCTCAA-3′ (nt 1055–1073; sense)); FAP 4 (5′-ACAGACCTTACACTCTGAC-3′ (nt 1863–1845; antisense)); FAP 6 (5′-TCAGATTCTGATACAGGCT-3′ (nt 2526–2508; antisense)); FAP 10 (5′-TAACACACTTCTTGCTTGGA-3′ (nt 1526–1507; antisense)); FAP 11 (5′-TTACATCTATGACCTTAGCA-3′ (nt 598–617; sense)); FAP 12 (5′-AACACTGTGTCCAAAGCAA-3′ (nt 2734–2716; antisense)); and FAP 13 (5′-GAAACTTGGCACGGTATTCAA-3′ (nt 45–65; sense)). PCR was performed using Taq polymerase following the manufacturer’s instructions. Two µl of first-strand reaction that was preheated at 94 °C for 5 min and then quick-chilled on ice was added last. A cycling profile of 94 °C for 30 s, 55 °C for 20 s, and 72 °C for 30 s (40 cycles) followed by a 15-min extension at 72 °C was routinely used. Samples were analyzed on 1% agarose gels.

DNA Cloning—cDNA amplicons that encode seprase-s were obtained by RT-PCR utilizing either the primer pairs FAP 1 + FAP 6 or FAP 13 + FAP 12 from LOX, MDA-MB-436, or HUVSMC RNA. PCR was either carried out as described above or with the Expand Long Template PCR System (Roche Molecular Biochemicals) utilizing buffer 1, an annealing temperature of 55 °C, and an elongation time of 2.5 min for 30 cycles. Amplicons were isolated from either a 1% agarose gel using a QIAquick gel extraction kit, or PCR reactions were directly purified using QIAquick spin columns (Qiagen). Purified cDNAs were ligated into the pCR 3.1 vector. Ligation, transformation, and selection of recombinant clones were carried out using the eukaryotic TA cloning kit (Bidirectional; Invitrogen).

DNA Sequencing and Analysis—The DNA sequence of seprase-s clones was obtained using the ABI prism dye terminator cycle sequencing kit and an ABI Prism 377 DNA sequencer (Perkin-Elmer).
The cDNA insert of clone pA12 was sequenced on both strands using primers that generated overlapping sequence data. Primers utilized for the sense strand were: T7 (5′-TAATACGACTCACTATAGGG-3′ (vector)); FAP 8 (5′-TCCAAGCAAGAAGTGTGTTA-3′ (nt 1507–1526)); and FAP 5 (5′-TGACAAACTCCTCTATGCAG-3′ (nt 1951–1971)). Primers utilized for the antisense strand were: RP-1 (5′-TAGAAGGCACAGTCGAGG-3′ (vector)) and FAP 7 (5′-CTGCATAGAGGAGTTTGTCA-3′ (nt 1970–1951)). Only the sense strand was sequenced for all other seprase-s clones. Sequence analysis was performed using Lasergene software (DNASTAR, Inc.).

Competitive PCR—The relative levels of seprase-l and seprase-s mRNAs for the human melanoma cell line LOX were determined by competitive PCR of seprase-l and seprase-s first-strand cDNAs generated from 3 LOX RNA preparations obtained over a 2-year period. First-strand cDNA synthesis was carried out as described above. Quantitation of seprase-l cDNA was obtained using a homologous competitive fragment that contains primer template sequences for the primers FAP 10 and 11 (see above). This fragment was produced by overlap extension using PCR (44). The oligonucleotide pair utilized to generate its 159-bp deletion (FAPα cDNA sequence from nt 908 through nt 1066) is FAP L (5′-GATACGGATATACCAGTTGCCTCAAGTGATTATTAT-3′ (sense)) and FAP M (5′-ATAATAATCACTTGAGGCAACTGGTATATCCGTATC-3′ (antisense)). The seprase-l target amplicon generated with the FAP 11+10 primers is 929 bp, whereas the mimic amplicon is 770 bp. Quantitation of seprase-s cDNA was carried out using a homologous DNA mimic that overlaps the truncated region of the seprase-s cDNA sequence and that contains primer template sequences for the primers FAP 1 and FAP 6 (see above). This competitive fragment was also generated by overlap extension using PCR (44). The oligonucleotide pair used to produce its 248-bp deletion (FAPα cDNA sequence from nt 1863 through nt 2110) is FAP N (5′-GTCAGAGTGTAAGGTCTGGCATCTGGAACTGGTCTT-3′ (sense)) and FAP O (5′-AAGACCAGTTCCAGATGCCAGACCTTACACTCTGAC-3′ (antisense)). The seprase-s target amplicon produced with the FAP 1+6 primers is 1143 bp, and its competitive mimic is 895 bp. PCR was carried out with Taq polymerase (see above) using the cycle profile described under “DNA Cloning.” A 25-µl sample of each competitive PCR reaction was resolved on a 1.2% agarose gel containing ethidium bromide. The equivalence point for target and mimic amplicon intensities was determined visually. The initial endogenous level of seprase-l and/or seprase-s cDNA (in 1 µl of the first-strand cDNA reaction) for each RNA preparation represents the average value of input mimic DNA that generates the target-mimic equivalence point in three distinct titrations. The endogenous levels (average value, value range; in attomoles (1.0 × 10⁻¹⁸ mol)) of seprase-l and -s cDNAs synthesized from each of the 3 LOX RNA preparations are: preparation A, seprase-l 0.067, 0.050–0.075 and seprase-s 0.008, 0.006–0.010; preparation B, seprase-l 0.300, 0.300 and seprase-s 0.013, 0.010–0.016; and preparation C, seprase-l 0.135, 0.125–0.150 and seprase-s 0.004, 0.003–0.005.

Deletion Mutagenesis—Deletion mutants of seprase-s cDNA were produced from the parental clone pA12 by overlap extension using PCR (44).
The oligonucleotide pair for the deletion of the upstream ATG triplet (nt 1 to 3; p11ΔM-1) is FAP F (5′-TTATGGTACAAGATTCTTCCTCCTCAATTTGAC-3′ (sense)) and FAP G (5′-AGGAGGAAGAATCTTGTACCATAAAGTAATTTC-3′ (antisense)). For deletion of the downstream ATG triplet (nt 136 to 138; p24ΔM-2) the oligonucleotide pair is FAP H (5′-AGTAAGGAAGGGGTCATTGCCTTGGTGGAT-3′ (sense)) and FAP I (5′-CAAGGCAATGACCCCTTCCTTACTTGCAAG-3′ (antisense)). The double mutant p14ΔM-1+2 was generated from p24ΔM-2 with the FAP F + FAP G primer pair. PCR was carried out as described above using FAP 1 and/or FAP 6 as the flanking primer(s) for 40 cycles without the 72 °C extension. The same PCR parameters were used for fusion amplification. Fusion amplicons were subcloned into the pCR 3.1 vector as described above. Gross deletion mutants of the seprase-s cDNA, which delete nt −388 to −266 (p8+6-3) and nt −388 to −62 (p16+6-11), were produced from pA12 by PCR using primer pairs FAP 8 (5′-TCCAAGCAAGAAGTGTGTTA-3′ (nt 1507–1526; sense)) and FAP 6 and FAP 16 (5′-CCAGCTGCCTAAAGAGGAAA-3′ (nt 1711–1730; sense)) and FAP 6, respectively. All other procedures for generating recombinant plasmids were as described above. All deletion mutations were verified by DNA sequence analysis.

In Vitro Expression—Seprase-s cDNA and its deletion mutant homologs were expressed in vitro from plasmids (0.5 µg/25 µl) using both the TNT T7-coupled rabbit reticulocyte lysate and wheat germ extract systems (Promega). Plasmids were not linearized for the wheat germ extract system. Expression was also carried out in uncoupled in vitro transcription and translation. Amplicons (250 ng) containing the T7 promoter and seprase-s cDNA were transcribed using the T7 Cap-Scribe kit (Roche Molecular Biochemicals), and RNA transcripts (1 µl) were translated using wheat germ extract (Promega) adjusted to 73 mM K⁺ and 2.1 mM Mg²⁺. In vitro translations were carried out in the presence of [³H]leucine.
The extent of [³H]leucine incorporation was determined by trichloroacetic acid precipitation on 5 µl of the reaction followed by liquid scintillation counting. Each trichloroacetic acid precipitation value is the average for a 5-µl aliquot from 3 identical reactions. Translation products were resolved by SDS-PAGE on 12% gels. The gels were impregnated with Amplify and dried down before undergoing autoradiography.

Fusion Protein Constructs—Utilizing overlap extension mutagenesis, a fusion protein construct, p14SC, was generated that linked the cDNA insert of pA12 with one that encodes CAT. This was accomplished using a primer pair that encodes the carboxyl-terminal residues (Cys234 to Asp239) of seprase-s and the amino-terminal residues (Ile5 to Thr10) of CAT. The primer pair had the following sequence: sepCAT-F (5′-TTCTCTTTGTCAGACATCACTGGATATACCACC-3′ (forward)) and sepCAT-R (5′-GGTATATCCAGTGATGTCTGACAAAGAGAAACA-3′ (reverse)). PCR was carried out as described under “Deletion Mutagenesis” using the primer pair FAP 1 and sepCAT-R with pA12 and sepCAT-F and CAT-R (5′-TGTATCTTATCATGTCTGCTC-3′ (nt 1210–1190; pCAT-3)) with the pCAT-3 control vector. Fusion-amplification using the Expand Long Template PCR System and subcloning were as described above. A deletion mutant, p33ΔM-1SC, in which an ATG triplet (nt 1 to 3) was deleted, was derived from p14SC using the primer pair FAP F + FAP G with FAP 1 + CAT-R as the flanking primers. Also, a CAT construct, pCAT, was produced by incorporating the CAT-coding region obtained by PCR using the primers CAT-F (5′-AGCTCTTAAGCGGCCGCAAGC-3′ (nt 451–471; pCAT-3)) and CAT-R into the pCR 3.1 vector. DNA sequence analysis verified all CAT constructs.

COS-7 Cell Transfection—Transient transfection of COS-7 cells was carried out by electroporation (0.3 kV; 950 microfarads). The electroporated cells were harvested after 72 h and lysed in a detergent extract buffer (9).

Immunoblotting—COS-7 detergent lysates (55 µg) were resolved by SDS-PAGE on 10% gels. Proteins were transferred to Immobilon polyvinylidene difluoride membranes. Blots were probed with a commercially available polyclonal rabbit anti-CAT Ab diluted 1:500. Primary Ab was detected with an anti-rabbit polyclonal Ab conjugated to alkaline phosphatase and diluted 1:20,000. Immunoreactive proteins were visualized using the Immun-Star substrate. Fig. 6 was obtained by exposing Hyperfilm to the Immun-Star-treated immunoblot for 15 s. This exposure time enhances the sepCAT fusion protein band relative to the CAT band.

RESULTS

Reverse transcription-PCR of LOX RNA using the primers FAP 1 and FAP 6, which correspond to nucleotide sequences within the 5′- and 3′-untranslated regions, respectively, of the seprase mRNA(s), exhibits two major amplicons at ~2.4 kb and at ~1.2 kb (Fig. 1). The ~2.4-kb amplicon was previously shown (8) to contain the entire coding sequence for the seprase 97-kDa subunit (seprase-l). DNA sequence analysis (Fig. 2) of the clone pA12 (GenBank™ accession number AF007822), whose cDNA insert contains the entire ~1.2-kb amplicon (1143 bp), revealed a 1223-bp deletion of the region extending from nt 61 through nt 1283 of the seprase-l cDNA sequence. Otherwise, it is essentially identical to the reported seprase cDNA sequence. To confirm the existence of a truncated seprase mRNA that gives rise to the ~1.2-kb amplicon, we carried out RT-PCR on LOX RNA using primer pairs that generate nested fragments along the length of the seprase mRNA(s) (Fig. 1). Those pairs that correspond to nt sequences outside the predicted deleted region exhibit 2 major bands (i.e. seprase-l and -s mRNAs) with a size differential of ~1.2 kb. However, those pairs which utilize a primer that lies within the deleted region show only one band that corresponds in length to the full-length messenger (i.e. seprase-l mRNA). An additional low intensity intermediate size band was observed with all primer pairs that generate the two major amplicons (Fig. 1B). We have isolated and sequenced the intermediate band (~1 kb) produced by the FAP 1+4 primer pair; it is an artifact of PCR (data not shown). Also, three additional truncated cDNA clones obtained by RT-PCR of LOX RNA have been sequenced, and all exhibit precisely the same deletion region as pA12. We estimated the relative abundance of the seprase-l and seprase-s mRNAs in each of three LOX RNA preparations by utilizing competitive PCR of their first strand cDNAs (“Experimental Procedures”).
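
As a cross-check on the amplicon sizes quoted above, the expected product lengths can be recomputed from the primer 5′-end positions. The following Python sketch is illustrative only. The names `PRIMER_5P_END`, `WITHIN_DELETION`, and `amplicon_sizes` are hypothetical; the positions are those listed for the primers in the FAPα cDNA numbering; and the assumption that a pair flanking the deleted region yields a second product exactly 1223 bp shorter comes from the sequence analysis, not from an independent measurement.

```python
# 5'-end positions (nt, FAP-alpha cDNA numbering) of the RT-PCR primers,
# as listed under "Experimental Procedures" and in the Fig. 1 legend.
PRIMER_5P_END = {
    "FAP 13": 45, "FAP 1": 161, "FAP 11": 598, "FAP 3": 1055,  # sense
    "FAP 10": 1526, "FAP 4": 1863, "FAP 6": 2526,              # antisense
}

# Primers that the Fig. 1 legend places inside the alternatively spliced region.
WITHIN_DELETION = {"FAP 11", "FAP 3"}

DELETION_BP = 1223  # exonic region absent from the seprase-s mRNA


def amplicon_sizes(sense, antisense):
    """Return (full-length bp, truncated bp or None) for one primer pair.

    The truncated product is None when either primer anneals inside the
    deleted region, because the seprase-s template then lacks that site.
    """
    full = PRIMER_5P_END[antisense] - PRIMER_5P_END[sense] + 1
    if sense in WITHIN_DELETION or antisense in WITHIN_DELETION:
        return full, None
    return full, full - DELETION_BP


if __name__ == "__main__":
    pairs = [("FAP 13", "FAP 4"), ("FAP 1", "FAP 4"), ("FAP 11", "FAP 4"),
             ("FAP 13", "FAP 10"), ("FAP 11", "FAP 10"),
             ("FAP 1", "FAP 6"), ("FAP 3", "FAP 6")]
    for sense, antisense in pairs:
        print(sense, "+", antisense, amplicon_sizes(sense, antisense))
```

Under these assumptions the FAP 1+6 pair gives 2366 and 1143 bp, in line with the ~2.4-kb and ~1.2-kb amplicons and with the 1143-bp insert of pA12, and the FAP 11+10 pair gives a single 929-bp product, as stated for the seprase-l competitive PCR target.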
The seprase-l/seprase-s ratios for the 3 RNA preparations are 8.9, 22.7 (this preparation was used in Fig. 1), and 34.6 (this preparation was used in Fig. 3, lanes 1–4). In addition, we found that both the seprase-l and -s mRNA levels appear to fluctuate among the three RNA preparations. The preparation that generated the intermediate mRNA ratio of ~23 has the highest levels of both seprase-l and -s (“Experimental Procedures”).

FIG. 1. Detection of an alternatively spliced seprase mRNA. A, reverse transcription-PCR was carried out using oligonucleotide primers (“Experimental Procedures”) that correspond to the FAPα cDNA sequence. The diagram shows the relative position of sense (FAP 13, 1, 11, 3; forward arrows) and antisense (FAP 10, 4, 6; reverse arrows) primers along the full-length seprase mRNA. The darkened area represents the exonic region (1223 bp) deleted in the seprase-s mRNA (Fig. 2). Vertical lines represent the relative positions of the common 5′ end, initiation codon for the seprase-l ORF (AUG(l)), initiation codon for the downstream seprase-s ORF (AUG(s)), termination codon for both ORFs (TAA), and the 3′ end, respectively, of the alternatively spliced seprase mRNAs. B, reverse transcription-PCR of LOX RNA utilizing primers in A. The primer pairs and the corresponding nt positions of their respective 5′ ends are FAP 13+4 (nt 45 and 1863), FAP 1+4 (nt 161 and 1863), FAP 11+4 (nt 598 and 1863), FAP 13+10 (nt 45 and 1526), FAP 11+10 (nt 598 and 1526), FAP 1+6 (nt 161 and 2526), and FAP 3+6 (nt 1055 and 2526). Primer pairs FAP 13+4, 1+4, 13+10, and 1+6 correspond to nt sequences outside the alternatively spliced region of the seprase mRNAs, whereas FAP 11 and FAP 3 are within it. The 1-kb minor band generated with the primer pair FAP 1+4 is an artifact of PCR (data not shown). Amplicons were resolved on a 1% agarose gel.

The existence of the truncated seprase mRNA is not unique to LOX cells. Reverse transcription-PCR analyses using the primer pair FAP 1+6 of RNAs from the cell lines RPMI7951 (melanoma), WI-38 (fibroblast) and MDA-MB-436 (carcinoma), and HUVSMC all exhibit amplicons corresponding to both the seprase-l and the seprase-s mRNAs (Fig. 3). The noninvasive melanoma cell line SKMEL28, which does not express seprase, was negative for the presence of the seprase mRNAs (Fig. 3). Additionally, two truncated seprase cDNA clones were sequenced: one from the breast carcinoma line MDA-MB-436 and the other from HUVSMC. Both are essentially identical to pA12 (“Experimental Procedures”; data not shown).

FIG. 3. Seprase mRNA profiles of melanoma, carcinoma, and fibroblast cell lines and HUVSMC. Reverse transcription-PCR was performed on total RNA from the human cell lines LOX (amelanotic melanoma), SKMEL28 (melanotic melanoma), RPMI7951 (melanotic melanoma), WI-38 (lung embryonic fibroblast) and MDA-MB-436 (breast carcinoma), and HUVSMC utilizing the seprase/FAPα primers FAP 1+6 (lanes 1, 3, 5, 7, 9, 11, and 13), which correspond to nucleotide sequences within the 5′- and 3′-untranslated regions, respectively, of the seprase mRNAs. Reverse transcription-PCR was also carried out using a glyceraldehyde-3-phosphate dehydrogenase amplimer set (lanes 2, 4, 6, 8, 10, 12, and 14). The ~2.4-kb and ~1.2-kb amplicons (indicated by arrows) generated in lanes 3, 7, 9, 11, and 13 correspond to seprase-l and seprase-s mRNAs, respectively. Lanes 1 and 2 represent RT-PCR of LOX RNA in the absence of reverse transcriptase. Lanes 5 and 6 represent RT-PCR of a noninvasive cell line (SKMEL28) that is negative for seprase expression. Lane 15 contains size markers.

Analysis of the pA12 cDNA sequence (Fig. 2) predicts that the 1223-bp deletion between nt −329 and −328 is out of phase with respect to the seprase-l ORF, which begins 4 codons upstream at the ATG triplet represented by nt −340 to −338. The exonic deletion produces a uORF or minicistron encoding the pentapeptide MKTWQ followed by an in-frame TGA codon at nt −325 to −323 (distal uORF). Downstream, the cDNA sequence predicts the existence of a second uORF, which encompasses nt −179 to −72 (proximal uORF) and encodes a 36-amino acid polypeptide that is not homologous to other reported uORF-encoded proteins (23, 25–29, 31, 34, 38, 43, 45–47). A potential third uORF extends from nt −326 through −237 and would encode a 30-amino acid polypeptide. The initiation ATG triplet (nt −326 to −324) for this centrally located uORF is overlapped by the termination codon (nt −325 to −323) for the distal uORF (nt −340 to −326), and therefore it would be expected to initiate or reinitiate protein synthesis poorly (Ref. 48; see inhibition by uORFs below). Nevertheless, functional uORFs with this structural organization have been reported (31–33). The pA12 cDNA lacks 160 nt that are present at the 5′ end of the seprase/FAPα mRNA (10). We analyzed the 5′-untranslated region of the FAPα cDNA sequence for uORFs; none were found.

FIG. 2. Nucleotide sequence of the pA12 cDNA and the deduced amino acid sequences of its uORFs and its major ORF. Nucleotide and amino acid sequence numbers are to the left. The number 1 nucleotide and the first amino acid residue correspond to the major ORF. The deduced amino acid sequences of the distal (nt −340 through −326) and proximal (nt −179 through −72) uORFs are shown. Initiation and putative initiation ATG codons and the nucleotides in the −3 and +4 positions relative to these ATG codons (A = +1) are represented by bold characters. The alternative splice junction between nt −329 and −328 is separated by 24 underlined nucleotides that represent the extreme 5′ (6 nucleotides) and 3′ (18 nucleotides) ends, respectively, of the deleted 1223-bp exonic region present in the full-length seprase cDNA. Putative exonic splicing enhancer-like motifs are represented by bold italicized characters. Initiation methionine residues are denoted by the bold character M, whereas amino acid residues in the seprase-s ORF that correspond to the catalytic triad (Ser103, Asp181, His213) and the serine protease consensus motif (Gly101, Trp102, Ser103, Tyr104, Gly105) of seprase-l are represented by bold underlined characters. Arrows (↓) denote nt positions at which the uORF deletion mutants p8+6-3 (nt −265) and p16+6-11 (nt −61) begin their 5′ leader regions (Fig. 5).

The scanning model for initiation of protein synthesis (49) predicts that the first ATG triplet (nt 1 to 3) in adequate sequence context downstream of the proximal (nt −179 to −72) uORF can initiate polypeptide synthesis (Fig. 2). This ATG triplet corresponds to Met522 in full-length seprase-l (8). It thus delimits an ORF that encodes a polypeptide of 239 amino acids with an Mr of 26,956 (seprase-s). To determine if this ATG codon initiates protein synthesis we carried out in vitro transcription and translation of the pA12 cDNA in both coupled and uncoupled systems using rabbit reticulocyte lysate and wheat germ extract (“Experimental Procedures”). In Fig. 4, lanes 2 and 7 show translation products generated by pA12 in the coupled rabbit reticulocyte and wheat germ systems, respectively. Both lanes exhibit a single major band under the 30-kDa marker.
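
The numbering conventions used in this section (seprase-s residues counted from the methionine that corresponds to Met522 of seprase-l, and 5′ leader positions given as negative nucleotide numbers) can be checked with a little arithmetic. The sketch below is an illustration under stated assumptions, not part of the original analysis. The helper names `l_to_s` and `codon_count` are hypothetical, and the seprase-l length of 760 residues is taken simply as the sum of the domain sizes quoted for the 97-kDa subunit (6 + 20 + 734).

```python
# Assumption: seprase-l length is the sum of the quoted domain sizes
# (cytoplasmic tail + transmembrane region + extracellular domain).
SEPRASE_L_LENGTH = 6 + 20 + 734

# Met522 of seprase-l corresponds to residue 1 of seprase-s.
SEPRASE_S_START_IN_L = 522


def l_to_s(residue_l):
    """Map a seprase-l residue number onto seprase-s numbering."""
    if not SEPRASE_S_START_IN_L <= residue_l <= SEPRASE_L_LENGTH:
        raise ValueError("residue lies outside the region shared with seprase-s")
    return residue_l - SEPRASE_S_START_IN_L + 1


def codon_count(first_nt, last_nt):
    """Number of codons in an inclusive nucleotide interval.

    Only valid when both coordinates lie on the same side of the +1
    position, since the numbering used here has no nucleotide 0.
    """
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError("interval is not a whole number of codons")
    return length // 3


if __name__ == "__main__":
    print("seprase-s length:", l_to_s(SEPRASE_L_LENGTH))
    print("catalytic triad:", [l_to_s(r) for r in (624, 702, 734)])
    print("distal uORF codons:", codon_count(-340, -326))
    print("proximal uORF codons:", codon_count(-179, -72))
    print("central uORF codons:", codon_count(-326, -237))
```

With these inputs the mapping reproduces the 239-residue length of seprase-s and the Ser103, Asp181, and His213 positions given in the Fig. 2 legend, and the three leader intervals correspond to 5, 36, and 30 codons, matching the pentapeptide, the 36-amino acid, and the 30-amino acid products described above.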

FIG. 4. In vitro expression of seprase-s and its deletion mutant homologs. Parental plasmid pA12 that encodes seprase-s and the ATG codon deletion mutant constructs of its downstream ORF: p11ΔM-1 (nt 1 to 3), p24ΔM-2 (nt 136 to 138), and p14ΔM-1+2 (nt 1 to 3 and nt 136 to 138) were expressed in coupled in vitro transcription and translation systems (Fig. 2). Lanes 1 to 5 represent the rabbit reticulocyte lysate system, whereas lanes 6 to 9 utilize the wheat germ extract system. Plasmid pA11 is the vector control. The ~17-kDa translation product present in lanes 2–5 and lanes 7 (weak), 8, and 9 initiates from the ATG codon at nt 265 to 267 (Fig. 2; data not shown). Translation products were labeled by [³H]leucine incorporation and resolved by SDS-PAGE on a 12% gel followed by fluorography.
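
The sizes of the translation products described in the Fig. 4 legend can be roughly anticipated from the positions of the three ATG codons. The following sketch is only an estimate. It assumes an average residue mass of 110 Da, which is a common rule of thumb and not a value from this study, and the names `ORF_CODONS`, `AVG_RESIDUE_DA`, and `product_from_atg` are hypothetical. It relies on the observation that nt 136 and nt 265 are both offset from nt 1 by a multiple of three, so initiation at either site would read the same frame as the major ORF.

```python
ORF_CODONS = 239       # length of the seprase-s ORF in amino acids
AVG_RESIDUE_DA = 110   # assumed average residue mass (rule of thumb)


def product_from_atg(atg_first_nt):
    """Estimate (residues, approximate kDa) for initiation at an ATG.

    atg_first_nt is the position of the A of the ATG, numbered so that
    the A of the major initiation codon is nt 1.
    """
    offset = atg_first_nt - 1
    if offset < 0 or offset % 3 != 0:
        raise ValueError("ATG is not in frame with the major ORF")
    residues = ORF_CODONS - offset // 3
    return residues, residues * AVG_RESIDUE_DA / 1000


if __name__ == "__main__":
    for position in (1, 136, 265):
        residues, kda = product_from_atg(position)
        print(f"ATG at nt {position}: {residues} residues, about {kda:.1f} kDa")
```

On these assumptions, initiation at nt 1, 136, and 265 would give products of 239, 194, and 151 residues, or very roughly 26, 21, and 17 kDa. These figures are broadly consistent with a major band under the 30-kDa marker and with the ~17-kDa product attributed to the ATG at nt 265 to 267, but gel mobility need not follow calculated mass exactly.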

Uncoupled in vitro transcription followed by in vitro translation of capped RNA transcripts in wheat germ extract duplicated the results in lane 7 (“Experimental Procedures”; data not shown). To determine if the major band in lanes 2 and 7 initiates at the ATG codon corresponding to nt 1 to 3, we constructed a deletion mutant p11ΔM-1 (“Experimental Procedures”) in which nt 1 to 3 are deleted from the parental plasmid pA12. Lanes 3 and 8 show that indeed the major translation product generated by pA12 initiates at this ATG triplet. The next potential initiation ATG triplet is located at nt 136 to 138 (Fig. 2). In Fig. 4, lanes 4 and 9, a deletion mutant p24ΔM-2 in which nt 136 to 138 have been deleted expresses only the major translation product seen in lanes 2 and 7. This confirms that the upstream ATG codon (nt 1 to 3) is the primary site of initiation. In lane 5 we utilized a double mutant construct p14ΔM-1+2 in which both ATG triplets (nt 1 to 3 and nt 136 to 138) have been deleted. The translation products (between 21 and 30 kDa) generated by wild type pA12 (lane 2) and the two deletion mutants p11ΔM-1 (lane 3) and p24ΔM-2 (lane 4) are not detected. The same result was obtained with wheat germ extract (data not shown). These results indicate that the downstream ATG codon at nt 136 to 138 is initiation-capable (p11ΔM-1 initiates at this site), but it is the upstream ATG triplet (nt 1 to 3) that serves as the major seprase-s initiation site.

The pA12 cDNA sequence predicts that there are two unambiguous uORFs located at nt −340 to −326 (distal) and at nt −179 to −72 (proximal) and a putative third uORF (central), which extends from nt −326 to −237. We analyzed the contribution the uORFs made to the translational efficiency of the downstream ORF by generating 5′ leader deletion mutants: p8+6-3, which lacks the distal uORF and a large portion (nt −326 through −266) of the putative central uORF, and p16+6-11, which deletes all uORFs (“Experimental Procedures”; Fig. 2). As can be seen in Fig. 5, there is a marked difference in the translation levels of the downstream ORF between the wild type pA12 and the deletion mutant p16+6-11. Removal of the distal uORF and a majority (76%) of the potential central uORF results in a small but noticeable increase in downstream ORF expression. Quantitation of [³H]leucine incorporation revealed that p8+6-3 increased [³H]leucine incorporation by 1.3-fold and p16+6-11 by 3.7-fold over parental pA12 (“Experimental Procedures”).

FIG. 5. Influence of uORFs on seprase-s expression. The effect of uORFs located within the 5′ leader region of the seprase splice variant on the expression of the seprase-s downstream ORF was determined. Equimolar amounts of parental plasmid pA12 (0.29 pmol/25 µl) and the deletion mutant constructs p8+6-3 (0.29 pmol; proximal uORF at nt −179 to −72 remains intact; see Fig. 2) and p16+6-11 (0.30 pmol; lacks all uORFs; see Fig. 2) were expressed in a coupled in vitro transcription and translation rabbit reticulocyte lysate system. Plasmid pA11 (0.30 pmol) is the vector control. Translation products were labeled by [³H]leucine incorporation. An equal volume (2 µl) of each coupled reaction was resolved by SDS-PAGE on a 10% gel followed by fluorography.

Our panel of anti-seprase monoclonal Abs (9) does not recognize the pA12 primary translation product generated in the coupled in vitro transcription and translation system. To confirm that the downstream ORF of pA12 is expressed in vivo, we carried out transient transfection of COS-7 cells with a fusion protein construct, p14SC, that links the cDNA insert of pA12 to one that encodes CAT (“Experimental Procedures”).
We also made a deletion mutant of this construct, p33ΔM-1SC, which deletes the initiation ATG triplet (nt 1 to 3) of the downstream ORF. The CAT ORF, which begins at Ile5, is in the same reading frame as seprase-s. Thus, detection of the fusion protein (sepCAT) by an anti-CAT polyclonal Ab can only occur if seprase-s is expressed. Also, neither fusion protein construct can express native full-length CAT, since the initiation ATG triplet for the CAT ORF (and the three succeeding codons) is deleted. Fig. 6 shows the results of an immunoblot that was carried out on detergent extracts of transiently transfected COS-7 cells using the positive control construct pCAT (encodes full-length CAT), p14SC, p33ΔM-1SC, and pA11 (vector control). The p14SC lane shows the expression of the sepCAT fusion protein band co-migrating with the 46-kDa marker (predicted Mr ~53 kDa) that is not present in the other lanes.

FIG. 6. In vivo expression of seprase-s. A fusion protein construct, p14SC, that links the cDNA insert of plasmid pA12 (encodes seprase-s from its downstream ORF) to one that encodes CAT and a p14SC deletion mutant construct, p33ΔM-1SC, in which the initiation ATG codon (nt 1 to 3) of the downstream fusion protein (sepCAT) ORF is deleted, and two additional plasmids, pCAT, which encodes native CAT, and pA11, the vector control, were transiently transfected into COS-7 cells. Detergent extracts (~55 µg) of the transfected COS-7 cells were resolved by SDS-PAGE on a 10% gel, transferred to an Immobilon polyvinylidene difluoride membrane, and immunoblotted with a polyclonal anti-CAT Ab. Arrows indicate the positions of the CAT and sepCAT bands in their respective lanes.

DISCUSSION

Our results confirm the existence of an alternatively spliced seprase mRNA that encodes a novel truncated isoform, seprase-s. The pA12 cDNA sequence (Fig. 2) indicates that deletion of an exonic 1223-bp region (it encodes part of the cytoplasmic tail, transmembrane, membrane proximal (N-glycosylation), and the central (cysteine-rich) regions of the extracellular domain of seprase-l (8)) from the seprase pre-mRNA is the result of alternative exon splicing, which obeys the GT-AG rule (50) but utilizes suboptimal exonic donor and acceptor splice sites (51, 52). Interestingly, within the nucleotide sequence just downstream of the splice junction (Fig. 2; nt −329 to −328) is a purine-rich region (nt −318 to −297) 5′-GAAGAATACCCTGGAAGAAGAAA-3′, which resembles exonic splicing enhancer motifs that facilitate the removal of proximal upstream introns (with weak 5′ and/or 3′ splice sites) from pre-mRNA (53–57). Based on the genomic organization of the human FAP gene (58), the splice sites utilized in the alternative splicing of the seprase pre-mRNA are located within exons 2 and 15, respectively. Alternative processing of human dipeptidyl peptidase IV pre-mRNA has not been reported (59, 60). Determination of the relative abundance of the seprase-l and -s mRNAs in LOX cells (“Results”) suggests that the seprase mRNA ratio is in a dynamic state and that a complex set of cellular factors (SR proteins, transcription factors, etc.) affects the relative abundance of the seprase mRNAs.

Deletion mutagenesis of the pA12 cDNA 5′ leader sequence (Fig. 5) confirms that this uORF(s)-containing region inhibits the translation of the seprase-s downstream ORF. The 5′ leader (388 nt) of the pA12 cDNA as well as the projected 5′ leader (548 nt) of its cognate messenger has a G+C content of ~40%. This observation suggests that it is the uORFs and not extensive secondary structure of the 5′ leader that inhibit the expression of seprase-s.
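The composition arguments above (a purine-rich motif, a leader with moderate G+C content) rest on simple base counting. The sketch below shows that calculation applied to the 23-nt motif quoted in the text; the helper name `base_fraction` is a hypothetical illustration, and the full pA12 leader is not reproduced here, so its ~40% G+C value is not recomputed.

```python
# Purine-rich motif exactly as quoted in the text (5' to 3').
MOTIF = "GAAGAATACCCTGGAAGAAGAAA"


def base_fraction(seq: str, bases: str) -> float:
    """Return the fraction of positions in seq occupied by any of bases."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # set() guards against counting a base twice if it is listed twice.
    hits = sum(seq.count(base) for base in set(bases.upper()))
    return hits / len(seq)


if __name__ == "__main__":
    purine = base_fraction(MOTIF, "AG")  # purines are A and G
    gc = base_fraction(MOTIF, "GC")
    print(f"length: {len(MOTIF)} nt")
    print(f"purine fraction: {purine:.3f}")
    print(f"G+C fraction: {gc:.3f}")
```

For the quoted motif, 18 of 23 positions are purines (about 78%) and 9 of 23 are G or C (about 39%). The same function could be applied to a complete leader sequence to obtain its G+C content.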
Because the initiation ATG codons for the distal (nt −340 to −326) and proximal (nt −179 to −72) uORFs have an adequate sequence context (49), and the proximal uORF also possesses an in-frame ATG triplet (nt −131 to −129) in a strong context (A at −134; G at −128), it is probable that 40 S subunits reaching the downstream ORF must translate at least one of these uORFs. This strongly suggests that the expression of the downstream coding sequence (seprase-s) involves 40 S subunit reinitiation, which is consistent with the decreased translational efficiency exhibited by the pA12 transcripts (49, 61).

In vitro transcription and translation coupled with deletion mutagenesis of clone pA12 (Fig. 4) confirms that the seprase-s ORF encodes only the carboxyl-terminal region (Fig. 2) of the integral membrane isoform, seprase-l. It is the carboxyl-terminal region of seprase-l that is responsible for its proteolytic activity (8, 9). However, whether seprase-s retains proteolytic activity is still not known. Dimerization of seprase-l (9) and dipeptidyl peptidase IV (62) monomeric subunits is required for their proteolytic activity. Additionally, it has been reported that the dimerization of dipeptidyl peptidase IV subunits occurs in the Golgi apparatus (63). Analysis of the seprase-s sequence by the PSORT II program for prediction of protein subcellular localization sites (Swiss Institute of Bioinformatics) indicates a 70% probability that seprase-s is a cytoplasmic protein with no ability to insert itself into an organelle or plasma membrane or be targeted to an organelle. Recently, it was reported that invasive ductal carcinoma cells of human breast cancers exhibit polyclonal Ab staining against seprase not only on the cell surface but also throughout the cytoplasm (4). Whether the cytoplasmic staining of vesicle-associated intracellular seprase in these cells was in some part due to seprase-s remains to be determined. Clearly, the structure of seprase-s dictates that its role in the biology of seprase is going to be quite different from its integral membrane counterpart, seprase-l.

Acknowledgments: We acknowledge Dr.
Giulio Ghersi for his helpful discussions and assistance throughout the course of this work. We are also grateful to Dr. Jaw-Yuan Wang for his assistance in the preparation of this manuscript and Yunyun Yeh for her excellent technical assistance with cell cultures.

REFERENCES

1. Chen, W.-T. (1996) Enzyme Protein 49, 59–71
2. Aoyama, A., and Chen, W.-T. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 8296–8300
3. Kelly, T., Mueller, S. C., Yeh, Y., and Chen, W.-T. (1994) J. Cell. Physiol. 158, 299–308
4. Kelly, T., Kechelava, S., Rozypal, T. L., West, K. W., and Korourian, S. (1998) Mod. Pathol. 11, 855–863
5. Van den Oord, J. J. (1998) Br. J. Dermatol. 138, 615–621
6. Bermpohl, F., Löster, K., Reutter, W., and Baum, O. (1998) FEBS Lett. 428, 152–156
7. Monsky, W. L., Lin, C.-Y., Aoyama, A., Kelly, T., Mueller, S. C., Akiyama, S. K., and Chen, W.-T. (1994) Cancer Res. 54, 5702–5710
8. Goldstein, L. A., Ghersi, G., Piñeiro-Sánchez, M. L., Salamone, M., Yeh, Y. Y., Flessate, D., and Chen, W.-T. (1997) Biochim. Biophys. Acta 1361, 11–19
9. Piñeiro-Sánchez, M. L., Goldstein, L. A., Dodt, J., Howard, L., Yeh, Y., Tran, H., Argraves, W. S., and Chen, W.-T. (1997) J. Biol. Chem. 272, 7595–7601; Correction (1998) J. Biol. Chem. 273, 13366
10. Scanlan, M. J., Raj, B. K., Calvo, B., Garin-Chesa, P., Sanz-Moncasi, M. P., Healey, J. H., Old, L. J., and Rettig, W. J. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 5657–5661
11. Garin-Chesa, P., Old, L. J., and Rettig, W. J. (1990) Proc. Natl. Acad. Sci. U. S. A. 87, 7235–7239
12. Berquin, I. M., and Sloane, B. F. (1996) in Intracellular Protein Catabolism (Suzuki, K., and Bond, J., eds) pp. 281–294, Plenum Publishing Corp., New York
13. Rescheleit, D. K., Rommerskirch, W. J., and Wiederanders, B. (1996) FEBS Lett. 394, 345–348
14. Mehtani, S., Gong, Q., Panella, J., Subbiah, S., Peffley, D. M., and Frankfater, A. (1998) J. Biol. Chem. 273, 13236–13244
15. Dietrich, J. M., Jiang, W., and Bond, J. S. (1996) J. Biol. Chem. 271, 2271–2278
16. Matsumoto, S. I., Katoh, M., Saito, S., Watanabe, T., and Masuho, Y. (1997) Biochim. Biophys. Acta 1354, 159–170
17. Shofuda, K., Yasumitsu, H., Nishihashi, A., Miki, K., and Miyazaki, K. (1997) J. Biol. Chem. 272, 9749–9754
18. Niedermeyer, J., Scanlan, M. J., Garin-Chesa, P., Daiber, C., Fiebig, H. H., Old, L. J., and Rettig, W. J. (1997) Int. J. Cancer 71, 383–389
19. Kozak, M. (1987) Nucleic Acids Res. 15, 8125–8148
20. Kozak, M. (1991) J. Cell Biol. 115, 887–903
21. Kozak, M. (1986) Cell 47, 481–483
22. Werner, M., Feller, A., Messenguy, F., and Pierard, A. (1987) Cell 49, 805–813
23. Delbecq, P., Werner, M., Feller, A., Filipkowski, R. K., Messenguy, F., and Pierard, A. (1994) Mol. Cell. Biol. 14, 2378–2390
24. Abastado, J. P., Miller, P. F., Jackson, B. M., and Hinnebusch, A. G. (1991) Mol. Cell. Biol. 11, 486–496
25. Bergenhem, N. C., Venta, P. J., Hopkins, P. J., Kim, H. J., and Tashian, R. E. (1992) Proc. Natl. Acad. Sci. U. S. A. 89, 8798–8802
26. Damiani, R. D. J., and Wessler, S. R. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 8244–8248
27. Han, S., Navarro, J., Greve, R. A., and Adams, T. H. (1993) EMBO J. 12, 2449–2457
28. Hofmann, M. A., Senanayake, S. D., and Brian, D. A. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 11733–11737
29. Parola, A. L., and Kobilka, B. K. (1994) J. Biol. Chem. 269, 4497–4505
30. McGraw, D. W., Forbes, S. L., Kramer, L. A., and Liggett, S. B. (1998) J. Clin. Invest. 102, 1927–1932
31. Zimmer, A., Zimmer, A. M., and Reynolds, K. (1994) J. Cell Biol. 127, 1111–1119
32. Reynolds, K., Zimmer, A. M., and Zimmer, A. (1996) J. Cell Biol. 134, 827–835
33. Cao, J., and Geballe, A. P. (1995) J. Virol. 69, 1030–1036
34. Cao, J., and Geballe, A. P. (1996) Mol. Cell. Biol. 16, 7109–7114
35. Garnier, G., Circolo, A., and Colten, H. R. (1995) J. Immunol. 154, 3275–3282
36. Donze, O., Damay, P., and Spahr, P. F. (1995) Nucleic Acids Res. 23, 861–868
37. Harigai, M., Miyashita, T., Hanada, M., and Reed, J. C. (1996) Oncogene 12, 1369–1374
38. Luo, Z., and Sachs, M. S. (1996) J. Bacteriol. 178, 2172–2177
39. Wang, Z., and Sachs, M. S. (1997) Mol. Cell. Biol. 17, 4904–4913
40. Ruan, H., Shantz, L. M., Pegg, A. E., and Morris, D. R. (1996) J. Biol. Chem. 271, 29576–29582
41. Mize, G. J., Ruan, H., Low, J. J., and Morris, D. R. (1998) J. Biol. Chem. 273, 32500–32505
42. Bergamini, G., Reschke, M., Battista, M. C., Boccuni, M. C., Campanini, F., Ripalti, A., and Landini, M. P. (1998) J. Virol. 72, 8425–8429
43. Lincoln, A. J., Monczak, Y., Williams, S. C., and Johnson, P. F. (1998) J. Biol. Chem. 273, 9552–9560
44. Ho, S. N., Hunt, H. D., Horton, R. M., Pullen, J. K., and Pease, L. R. (1989) Gene 77, 51–59
45. Donze, O., and Spahr, P. F. (1992) EMBO J. 11, 3747–3757
46. Hill, J. R., and Morris, D. R. (1992) J. Biol. Chem. 267, 21886–21893
47. Mori, Y., Matsubara, H., Murasawa, S., Kijima, K., Maruyama, K., Tsukaguchi, H., Okubo, N., Hamakubo, T., Inagami, T., Iwasaka, T., and Inada, M. (1996) Hypertension 28, 810–817
48. Kozak, M. (1987) Mol. Cell. Biol. 7, 3438–3445
49. Kozak, M. (1989) J. Cell Biol. 108, 229–241
50. Breathnach, R., Benoist, C., O’Hare, K., Gannon, F., and Chambon, P. (1978) Proc. Natl. Acad. Sci. U. S. A. 75, 4853–4857
51. Mount, S. M. (1982) Nucleic Acids Res. 10, 459–472
52. Ohshima, Y., and Gotoh, Y. (1987) J. Mol. Biol. 195, 247–259
53. Xu, R., Teng, J., and Cooper, T. A. (1993) Mol. Cell. Biol. 13, 3660–3674
54. Watakabe, A., Tanaka, K., and Shimura, Y. (1993) Genes Dev. 7, 407–418
55. Ramchatesingh, J., Zahler, A. M., Neugebauer, K. M., Roth, M. B., and Cooper, T. A. (1995) Mol. Cell. Biol. 15, 4898–4907
56. Dirksen, W. P., Hampson, R. K., Sun, Q., and Rottman, F. M. (1994) J. Biol. Chem. 269, 6431–6436
57. Yeakley, J. M., Morfin, J. P., Rosenfeld, M. G., and Fu, X. D. (1996) Proc. Natl. Acad. Sci. U. S. A. 93, 7582–7587
58. Niedermeyer, J., Enenkel, B., Park, J. E., Lenter, M., Rettig, W. J., Damm, K., and Schnapp, A. (1998) Eur. J. Biochem. 254, 650–654
59. Abbott, C. A., Baker, E., Sutherland, G. R., and McCaughan, G. W. (1994) Immunogenetics 40, 331–338
60. Abbott, C. A., Baker, E., Sutherland, G. R., and McCaughan, G. W. (1995) Immunogenetics 42, 76
61. Kozak, M. (1994) Biochimie (Paris) 76, 815–821
62. Walborg, E. F. J., Tsuchida, S., Weeden, D. S., Thomas, M. W., Barrick, A., McEntire, K. D., Allison, J. P., and Hixson, D. C. (1985) Exp. Cell Res. 158, 509–518
63. Jascur, T., Matter, K., and Hauri, H. P. (1991) Biochemistry 30, 1908–1915