THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1994 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 269, No. 9, Issue of March 4, pp. 6431-6436, 1994 Printed in U.S.A.

A Purine-rich Exon Sequence Enhances Alternative Splicing of Bovine Growth Hormone Pre-mRNA*

(Received for publication, September 16, 1993)

Wessel P. Dirksen, Robert K. Hampson, Qiang Sun, and Fritz M. Rottman‡

From the Department of Molecular Biology and Microbiology, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106-4960

A previous study has demonstrated that deletion of a region within the last exon of bovine growth hormone (bGH) pre-mRNA results in almost complete retention of the upstream intron (Hampson, R. K., LaFollette, L., and Rottman, F. M. (1989) Mol. Cell. Biol. 9, 1604-1610). We now demonstrate that insertion of a simple purine-rich element (GGAAG), which is present within the deleted region, activates intron splicing upon expression in transfected cells. Moreover, several repeats of the GGAA(G) sequence restore splicing to near wild-type levels and direct the binding of a factor present in HeLa cell nuclear extracts. Mutation of the 5'-splice site toward U1 small nuclear RNA complementarity eliminates dependence on the downstream exon sequence for splicing. These results support a model for alternative intron retention in which purine-rich sequences function as part of an "exonic splicing enhancer" to complement a weak 5'-splice site and thereby facilitate intron removal. As a result, the majority of bGH mRNA is processed to remove intron D while still allowing a fraction of bGH mRNA containing the intact intron to reach the cytoplasm.

Alternative pre-mRNA splicing is an important mechanism in gene regulation and can include the use of alternative 5'- and/or 3'-splice sites, exon skipping, mutual exon exclusion, and/or intron retention (1-4). Although intron retention is common in viral pre-mRNAs, it appears to be a relatively rare form of alternative splicing in vertebrates (5). It is believed that intron-containing mRNAs normally are prevented from being transported to the cytoplasm due to the formation of the spliceosome complex, which commits the pre-mRNA to the splicing pathway (1, 6-8). However, there are examples of alternative intron retention in vertebrates that result in novel proteins due to translation of the intron sequences (9-12). Therefore, there must be a mechanism by which a fraction of the mRNA molecules, containing an intact intron, escape the splicing pathway. Because both the 5'- and 3'-splice sites are required for spliceosome commitment complex formation (13), it is possible that suboptimal splice sites may prevent complex formation, thereby allowing introns to be retained and intron-containing mRNAs to be transported to the cytoplasm.

Bovine growth hormone (bGH)¹ pre-mRNA undergoes alternative splicing in which the last intron (intron D) is retained in a small fraction of the cytosolic mRNA (11). An open reading frame is maintained through the intron and into the last exon, leading to production of a bGH-related protein² differing from normal bGH at the carboxyl terminus. The human growth hormone V gene, which contains an open reading frame in intron D with a high degree of sequence similarity to bGH, also retains intron D in a fraction of the human growth hormone V mRNA in the placenta (14), suggesting a possible physiological role for the protein resulting from intron retention.

In previous studies (15), we identified a 115-nucleotide FspI-PvuII region (FP sequence) within the last exon (exon 5) of bGH pre-mRNA that is required for efficient splicing of intron D upon expression in transfected cells. Interestingly, this 115-nucleotide fragment enhances splicing in either orientation, which led us to focus on a 10-base pair palindromic sequence (CTTCCGGAAG) present within the FP sequence. Furthermore, when bovine prolactin (bPRL) intron D is placed immediately upstream of bGH exon 5, splicing occurs irrespective of the presence of the downstream FP sequence, suggesting that a specific component(s) in bGH intron D necessitates the presence of the FP sequence (15).

* This work was supported by United States Public Health Service Grant DK32770 (to F. M. R.) from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed. Tel.: 216-368-3420; Fax: 216-368-3055.
¹ The abbreviations used are: bGH, bovine growth hormone; FP, FspI-PvuII; bPRL, bovine prolactin; CHO, Chinese hamster ovary; snRNA, small nuclear RNA; ESE, exonic splicing enhancer.
² R. K. Hampson, unpublished data.

In this report, we identify the purine-rich portion of the 10-base pair palindrome as a critical sequence required for stimulation of intron D splicing and show that a protein present in nuclear extracts binds specifically to multiple copies of this sequence. We show that a suboptimal 5'-splice site leads to the requirement for the FP sequence in efficient splicing. This simple purine sequence (GGAAG), perhaps in conjunction with other purine sequences found in exon 5, compensates for the weak 5'-splice site in intron D while allowing a fraction of the bGH mRNA to retain intron D.

MATERIALS AND METHODS

DNA Transfection of CHO Cells and S1 Nuclease Mapping of RNA. CHO cells were maintained in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum, nonessential amino acids, and antibiotics (penicillin and streptomycin). Cells were transfected using Lipofectin (Life Technologies, Inc.) under the conditions recommended by the manufacturer. Cells were harvested, and polysomal RNA was prepared 48 h after transfection as described (15). Polyadenylated RNA was prepared by oligo(dT)-cellulose chromatography (16). Probes for S1 nuclease protection experiments were 3'-end-labeled with T4 DNA polymerase and [α-³²P]dCTP. S1 nuclease analysis of poly(A)+ RNA was carried out as described earlier (15). Scanning of S1 nuclease-protected fragments was carried out using a PhosphorImager (Molecular Dynamics Inc.), and the resulting peaks were integrated by area.

RNA Synthesis and UV Cross-linking. Capped RNAs were synthesized from vectors E5/FP, E5/FPD (where FPD is FspI-PvuII deletion), E5/Pal, E5/G7A6, and E5/C7T6 as described previously using T3 and T7 RNA polymerases and [α-³²P]GTP to label the RNAs (17). These labeled RNAs were then incubated in HeLa cell nuclear extract and UV-cross-linked as previously described (17), with the exception that the reactions were treated with a combination of RNases A and T1 after cross-linking.

Plasmid Construction. The parent expression plasmid pSVB3Ba contains the entire structural region of the bGH gene driven by the SV40 late promoter (18). Plasmid derivatives of pSVB3Ba, pFPD (previously ΔB), pPal (previously ΔBPal), pPrl-GHE5, and pPrl-GHE5/ΔB are described elsewhere (15). Plasmids pPalGGCC, pTriPal+, and pTriPal- were constructed by restriction digestion of pPal with BspEI and end-filling with Klenow fragment, followed by insertion of either no oligonucleotide or annealed oligonucleotides PAL1 (5'-AAGCGTCTCGTGCAGCTTCCGGAAGAATGCTGTTAACTT-3') and PAL2 (5'-AAGTTAACAGCATTCTTCCGGAAGCTGCACGAGACGCTT-3') in each orientation, respectively (see Fig. 1B). Plasmids pU1(4/9) and pU1(8/9) were constructed by subcloning polymerase chain reaction-amplified fragments of bGH between a Tth111I site in exon 4 and the PvuII site in exon 5 using oligonucleotide 1 (5'-TGCTTGCGGGCATGTTTGTG-3') for pU1(4/9) and oligonucleotide 2 (5'-ACTTTCCTGGCATGTTTGTG-3') for pU1(8/9) as downstream primers and oligonucleotide 223 (5'-CCTCGGACCGTGTCTATGAG-3') as the upstream primer (see Fig. 2). Plasmids pPalGA, pC3T2, pG3A2, pG4A3, pG5A4, pG7A6, and pC7T6 were constructed in an identical manner, using oligonucleotide 5 (5'-TCCTTCCGGAAGGCATGTTTGTG-3'), oligonucleotide 10 (5'-GGAAGGCATGTTTGTG-3'), oligonucleotide 12 (5'-CTTCCGCATGTTTGTG-3'), oligonucleotide 6 (5'-TCCTTCCGCATGTTTGTG-3'), oligonucleotide 13 (5'-CTTCCTTCCGCATGTTTGTG-3'), oligonucleotide 8 (5'-CCTTCCTTCCTTCCGCATGTTTGTG-3'), and oligonucleotide 11 (5'-GAAGGAAGGAAGGGCATGTTTGTG-3'), respectively, as the downstream primers (see Table I), with the exception that a BsmI site in intron D was used for the upstream cloning site in place of Tth111I. The structures of plasmids pPSA and pPSA/FPD are as indicated in Fig. 4A and were constructed by subcloning an MboII fragment of pPrl-GHE5 and pPrl-GHE5/ΔB (15) into the Bsu36I-EcoRI site of pSVB3Ba. Plasmids plPSD and plPSD/FPD (indicated in Fig. 4A) were constructed by subcloning a TaqI-EcoRI fragment of bPRL in place of a Tth111I-Bsu36I fragment of bGH. Using oligonucleotide 20 (5'-TCCCTTCCATGCTGGGGGCC-3') and bGH3'19 (5'-TGCGATGCAATTTCCTCAT-3') as the upstream and downstream primers, respectively, a polymerase chain reaction-derived fragment of the bGH gene was subcloned as a blunt end-XmaI fragment in place of a HindIII-XmaI fragment of plPSD and plPSD/FPD to generate psPSD and psPSD/FPD, respectively. Plasmids pΔEB and pΔEB/FPD were constructed from pSVB3Ba and pFPD, respectively, by removing a 14-base pair EspI-BsmI fragment. Plasmids containing the consensus splice donor site mutation (pCSD and pCSD/FPD) were constructed by creating a polymerase chain reaction-derived fragment using oligonucleotide mutdonor (5'-TGTCTATGAGAAGCTGAAGGACCTGGAGGAAGGCATCCTGGCCCTGATGC(G/A)GGTG(G/A)G(G/T)ATGGCG-3') and bGH3'19 as upstream and downstream primers, respectively. This fragment was then subcloned into the Tth111I and XmaI sites of pSVB3Ba and pFPD, respectively (see Fig. 4B). Plasmid E5/FP (see Fig. 3A) contains a 115-nucleotide FspI-PvuII restriction fragment from bGH exon 5, derived from plasmid pbGH-4D5 (17), which was subcloned into the EcoRI-SmaI sites of pBS-M13(+) (Stratagene). Plasmid E5/FPD (see Fig. 3A) contains a 135-nucleotide EcoRI-SmaI restriction fragment from bGH exon 5, derived from plasmid pbGH-4D5/FPD (17), which was subcloned into the EcoRI-SmaI sites of pBS-M13(+). Plasmids pbGH-4D5/Pal, pbGH-4D5/G7A6, and pbGH-4D5/C7T6 were constructed by subcloning polymerase chain reaction-amplified fragments of pPal, pG7A6, and pC7T6, respectively, into the PstI and EcoRI sites of pBS-M13(+) using oligonucleotides ML1 (5'-CCGGAATTCTGGGAGTGGCACCTTC-3') and ML2 (5'-GAGCTGCTTCGCATCTC-3'). These plasmids were then used to create plasmids E5/Pal, E5/G7A6, and E5/C7T6, respectively, by deletion of the PstI-SmaI fragment in each vector (see Fig. 3A). All plasmids were confirmed by sequencing.

RESULTS

Initial Characterization of 10-Base Pair Palindrome That Stimulates Splicing of bGH Intron D. Earlier studies (15) demonstrated that deletion of a 115-nucleotide FP sequence within the last exon (exon 5) of bGH pre-mRNA results in marked inhibition of splicing of the upstream intron (intron D) of bGH mRNA. A 10-base pair palindromic sequence (CTTCCGGAAG) within the FP exon sequence was identified, which when inserted into the deleted region of pFPD, restored splicing to levels that were intermediate between pFPD and wild-type bGH mRNA (15). However, this palindrome is not the exclusive activating element contained in the FP sequence because deletion of just this 10-nucleotide sequence resulted in only partial diminution of splicing relative to that of the entire FP deletion. Only when other sequences within the FP element that resemble the palindrome were also deleted did intron D retention approach the levels observed with pFPD (15). Therefore, we sought to define the critical component(s) of this palindromic sequence responsible for the positive influence on splicing of intron D.

To determine whether the palindromic sequence must be uninterrupted for splicing stimulation to occur, the palindrome was disrupted by insertion of GGCC in the center of this sequence (CTTCCGGCCGGAAG; see "Materials and Methods"). Transient expression revealed that this extended palindrome gives rise to splicing levels nearly identical to those observed with the wild-type palindromic sequence (Fig. 1B, lanes 3 and 4), suggesting that the precise sequence of the entire palindrome is not critical. There is another sequence within the 115-nucleotide FP element that is identical to the palindrome at 8 out of 10 residues. Therefore, we reasoned that multiple copies of the activating sequence may be necessary to obtain wild-type levels of splicing. To test this hypothesis, a DNA fragment containing three copies of the 10-base pair palindrome was inserted in place of the FP region. These copies were separated by 11 and 12 base pairs of "nonspecific spacer" sequences designed to be nonpalindromic in nature and to minimize potential secondary structure using an RNA secondary structure prediction algorithm (19-21). To minimize further any influence of the spacer sequences, the triple repeat-containing fragment was inserted in both orientations. The results of this experiment (lanes 5 and 6) indicate that splicing of intron D was not improved significantly by insertion of three copies of the palindrome relative to insertion of a single copy (lane 3). These results suggest that the precise sequence of the palindrome is not required and that simple repetition of the entire palindrome cannot account for wild-type splicing levels, although multiple copies of the activating sequence(s) may still be required (see "Discussion").

Palindrome Does Not Stimulate Intron D Splicing through U1 snRNA Complementarity. There are reports indicating that the binding of U1 small nuclear ribonucleoprotein to exon sequences can influence splicing in either a positive (22) or negative (23) manner. These exon sequences exhibit complementarity to U1 snRNA, and inspection of the bGH palindromic sequence in its normal context revealed complementarity to the U1 snRNA consensus sequence-binding site ((C/A)AGGU(A/G)AGU) (24) at 6 out of 9 nucleotides. This level of U1 complementarity is comparable to the examples cited above, suggesting that U1 small nuclear ribonucleoprotein may bind to the 10-base pair palindrome. To assess the importance of the U1 snRNA complementarity in bGH exon 5 to intron D splicing stimulation, the palindrome was mutated to either enhance or disrupt this potential interaction. The palindromic sequence plus the adjacent 3' 2 nucleotides in its normal context (CTTCCGGAAGGA) were mutated away from the U1 consensus sequence (CCGCAAGCA) or toward the consensus sequence (CAGGAAAGT) and inserted into the deleted region of pFPD (Fig. 2). As predicted by the above hypothesis, mutation away from the U1 binding consensus sequence (lane 4) attenuated splicing relative to the palindromic sequence (lane 3). However, mutation toward the U1 binding consensus sequence also decreased splicing efficiency in comparison to the wild-type palindromic sequence (lane 5), instead of stimulating splicing as predicted. Thus, it appears that the influence of the bGH palindromic sequence on intron D splicing is not mediated through binding of U1 small nuclear ribonucleoprotein to the palindromic sequence, although we cannot rule out its involvement through a noncanonical interaction(s).

FIG. 1. Multiple copies of 10-base pair palindrome (CTTCCGGAAG) do not improve splicing over that observed with single copy, and disruption of palindromic sequence by insertion of central GGCC does not attenuate intron D splicing. A, S1 nuclease mapping strategy to determine intron D splicing (see "Materials and Methods"). With the exception of vectors pPSA and pPSA/FPD, every S1 nuclease probe used contained the same mutation as the construct being analyzed. The shaded boxes, heavy line, and thin line represent exon, intron, and vector sequences, respectively. B, S1 nuclease mapping of polysomal mRNA isolated from CHO cells transiently expressing the depicted constructs. Species of mRNA corresponding to fully spliced and intron D-containing mRNAs are indicated by the bands at 274 and 572 nucleotides, respectively. Probe lane, untreated probe; tRNA lane, probe mock-hybridized with 10 µg of tRNA; Mock lane, probe hybridized to RNA from mock-transfected CHO cells; lane 1, pSVB3Ba; lane 2, pFPD; lane 3, pPal; lane 4, pPalGGCC; lane 5, pTriPal+; lane 6, pTriPal-. Lane numbers correspond to the schematic diagrams beneath the autoradiogram. nt, nucleotides.

FIG. 2. Palindrome does not stimulate intron D splicing through U1 snRNA complementarity. Shown is the S1 nuclease mapping of polysomal mRNA isolated from CHO cells transiently expressing the constructs depicted. Species of mRNA corresponding to fully spliced and intron D-containing mRNAs are indicated by the bands at 274 and 572 nucleotides, respectively. Lane 1, pSVB3Ba; lane 2, pFPD; lane 3, pPalGA; lane 4, pU1(4/9); lane 5, pU1(8/9). In the diagrams below (corresponding to lanes 3-5), the nucleotides shown in large bold-face letters represent nucleotides that match the U1 snRNA binding consensus sequence. Lane numbers correspond to the schematic diagrams beneath the autoradiogram.

TABLE I
Quantification of S1 nuclease mapping analysis of pyrimidine- and purine-rich insertions into pFPD

Insert replacing FP sequences    Spliced (%)a
Wild-type FP sequence            96 ± 1
None                             17 ± 3
CTTCCGGAAG                       56 ± 5
CTTCC                            15 ± 4
GGAAG                            46 ± 8
GGAAGGA                          47 ± 5
GGAAGGAAG                        67 ± 4
GGAAGGAAGGAAG                    81 ± 4
CCTTCCTTCCTTC                    11 ± 4

a Values represent the mean ± S.D. from at least three transfection experiments.

Purine-rich Exon Sequence Is Responsible for Positive Influence on Splicing of bGH Intron D. The palindromic sequence (CTTCCGGAAG) contains separate pyrimidine- and purine-rich domains. To determine if both of these domains are necessary for the splicing activation, they were inserted separately into the deleted region of pFPD (Table I). Insertion of the pyrimidine-rich domain (CTTCC) had little effect on splicing of bGH intron D compared to pFPD. However, insertion of the purine-rich half (GGAAG) of the palindrome into pFPD markedly stimulated intron D splicing compared to pFPD. Moreover, when the purine sequence was extended to contain multiple tandem copies of GGAA(G) (Table I), addition of each copy

resulted in further stimulation of splicing. In contrast, extending the pyrimidine-rich sequence in an analogous manner, if anything, reduced splicing relative to pFPD. These data suggest that only the GGAAG half of the palindrome is involved in stimulation of splicing and that multiple GGAAG-like sequences within the FP element may be required to produce wild-type levels of intron D splicing.

Protein(s) in Nuclear Extracts Cross-links Specifically to GGAA(G) Repeat. We have previously shown that UV cross-linking FP RNA in splicing extracts results in the labeling of a 35-kDa protein (17). To determine if the multiple GGAA(G) repeat interacts with the same 35-kDa factor that binds to the

FIG. 3. Protein factor(s) cross-links specifically to GGAA(G) repeat. A, RNA substrates used for UV cross-linking assays. Darkly shaded boxes denote the FP element. Lightly shaded boxes denote other exon 5 sequences excluding the FP element. B, UV cross-linking of ³²P-labeled E5/FP (lanes 1 and 2), E5/Pal (lanes 3 and 4), E5/G7A6 (lanes 5 and 6), E5/C7T6 (lanes 7 and 8), and E5/FPD (lanes 9 and 10) RNAs in HeLa cell nuclear extract. Additions of ATP and Mg2+ (ATP/Mg) are indicated (+ and -). The cross-linked material was analyzed by SDS-polyacrylamide gel electrophoresis after RNase A and T1 treatment as described under "Materials and Methods."

entire FP sequence, we synthesized various ³²P-labeled RNAs that contained the FP sequence, FP-deleted exon 5, or FP-deleted exon 5 into which was inserted G7A6, C7T6, or the 10-base pair palindrome (Fig. 3A). These RNAs were incubated in a HeLa cell nuclear extract in the presence or absence of Mg2+-ATP, UV-irradiated, and treated with RNases A and T1 (Fig. 3B). As observed previously, the FP sequence cross-linked to a 55-kDa and a 35-kDa protein (lanes 1 and 2), whereas FP-deleted exon 5 cross-linked only to the 55-kDa protein (lanes 9 and 10). Since the 35-kDa protein bound to the FP sequence and not the FP-deleted sequence, insertion of the palindromic, G7A6, or C7T6 sequence into FP-deleted exon 5 was designed to determine whether the 35-kDa protein binds specifically to the purine-rich sequences. Somewhat surprisingly, the palindromic sequence did not cross-link specifically to any protein (lanes 3 and 4). This may be due to the fact that the palindrome has only one copy of the GGAA(G) sequence, which may not allow the protein to bind with high affinity in vitro. Interestingly, the G7A6 sequence (lanes 5 and 6), but not the C7T6 sequence (lanes 7 and 8), cross-linked strongly to a protein doublet. This doublet is larger than the 35-kDa protein that cross-linked to the FP sequence (lanes 1 and 2) and may or may not be the same protein(s) (see "Discussion").

Weak 5'-Splice Site Is Required for bGH Alternative Intron Retention. We previously reported (15) that the FP sequence of exon 5 is required for bGH intron D splicing, but not for splicing of another constitutively spliced intron. Deletion of the FP sequence, which results in a marked diminution of bGH intron D splicing, has no effect on splicing when bGH exon 5 is placed downstream of heterologous bPRL intron D (15). This suggests that a specific component(s) in bGH intron D necessitates the presence of the FP element in exon 5. To identify this component, the 5'- and 3'-portions of bGH intron D were replaced with corresponding regions of bPRL intron D, and splicing was examined in the presence and absence of the downstream FP sequence (Fig. 4). The goal was to define a small region of bPRL that could replace a component in bGH intron D and cause splicing of intron D to become independent of the FP sequence. Pre-mRNA in which the 3'-portion of bGH intron D was substituted with the corresponding bPRL sequences (including both the branch point and splice acceptor site sequences) still required the presence of the downstream FP sequence for splicing (Fig. 4A, lanes 3 and 4). In contrast, when the 5'-portion of bGH intron D was replaced with bPRL intron D sequences, splicing no longer required the FP element (lanes 5 and 6).

Inspection of the 5'-region of bGH intron D revealed that the 5'-splice site deviates from the consensus sequence and contains an intriguing 21-nucleotide palindromic sequence located 49 nucleotides downstream of the 5'-splice site that is capable of forming a perfect 9-base pair stem-loop structure. The stem loop does not appear to be involved in splicing of bGH intron D because disruption of this putative structure does not result in splicing that is independent of the FP sequence (Fig. 4B, lanes 3 and 4). In addition, inclusion of the stem loop in the 5'-bGH/bPRL intron D chimera did not result in restoration of dependence on the FP element (lanes 1 and 2). Mutation of the splice donor site to the perfect consensus sequence, however, resulted in splicing irrespective of the presence of the FP sequence. The splice donor sequence of bGH intron D (CGG/GUGGGG) matches the mammalian consensus sequence ((C/A)AG/GU(A/G)AGU) (24) at only 6 out of 9 positions, while the bPRL donor site (CAG/GUGAGC) matches the consensus site at 8 out of 9 nucleotides. We reasoned that bGH intron D may possess a suboptimal donor site, lowering the efficiency of splicing and thereby requiring the additional positive acting signals in the FP element. To test this hypothesis, three point mutations were introduced into the bGH splice donor site, making it identical to the consensus sequence. This mutation resulted in constitutive splicing of bGH intron D, independent of the presence of the FP sequence (Fig. 4B, lanes 5 and 6).
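The donor-site comparison in the preceding paragraph can be checked with a short script. This is a minimal sketch and not part of the original study: the function names `count_consensus_matches` and `mismatch_positions`, and the encoding of the consensus with IUPAC codes (M = A or C, R = A or G), are illustrative assumptions. The sequences are those quoted in the text.

```python
from __future__ import annotations

# Illustrative sketch (not from the original paper): score a 9-nucleotide
# splice donor site against the mammalian consensus (C/A)AG/GU(A/G)AGU
# quoted in the text (ref. 24).

# Consensus written with IUPAC codes (assumed encoding): M = A or C, R = A or G.
CONSENSUS = "MAGGURAGU"
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "M": "AC", "R": "AG"}


def _normalize(site: str) -> str:
    """Upper-case, convert DNA to RNA, and drop the exon/intron slash."""
    return site.upper().replace("T", "U").replace("/", "")


def count_consensus_matches(site: str, consensus: str = CONSENSUS) -> int:
    """Return the number of positions at which `site` matches `consensus`."""
    site = _normalize(site)
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base in IUPAC[code] for base, code in zip(site, consensus))


def mismatch_positions(site: str, consensus: str = CONSENSUS) -> list[int]:
    """Return the 1-based positions at which `site` deviates from `consensus`."""
    site = _normalize(site)
    return [i + 1 for i, (base, code) in enumerate(zip(site, consensus))
            if base not in IUPAC[code]]


if __name__ == "__main__":
    donor_sites = {
        "bGH intron D": "CGG/GUGGGG",
        "bPRL intron D": "CAG/GUGAGC",
        "consensus mutant": "CAG/GTGAGT",
    }
    for name, site in donor_sites.items():
        print(name, count_consensus_matches(site), mismatch_positions(site))
```

Under these assumptions the bGH site deviates from the consensus at three positions, which agrees with the three point mutations described for the consensus donor constructs. A simple match count of this kind says nothing about the relative importance of individual positions.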

Purine-rich Sequences Compensate for a Weak 5'-Splice Site


DISCUSSION


There is a growing body of evidence to suggest that sequences other than those previously defined at the splice donor, lariat branch point, polypyrimidine tract, and splice acceptor sites play a critical role in the selection of splice sites (reviewed in Refs. 5 and 22). Most of these auxiliary sequences are found within exons, and with few exceptions, the exact nature and role of these sequences in splicing remain unclear. In this report, we provide evidence that a purine-rich sequence within bGH exon 5, even as short as GGAAG, is capable of stimulating intron D splicing. Repetition of this simple sequence restores splicing to near wild-type levels. Furthermore, these results suggest that this GGAAG sequence is part of an "exonic splicing enhancer" (ESE) that functions by compensating for a suboptimal (i.e. weak) splice donor site in intron D in vivo. In this context, we define the ESE as being contained within the 115-nucleotide FP fragment and the GGAAG sequence as a core element of the ESE. Mutation of the splice donor site toward the consensus sequence completely eliminates dependence of intron D splicing on the presence of the ESE in exon 5.

Stimulation of bGH intron D splicing by the GGAAG sequence offers an explanation for several other results in this study. Disruption of the 10-base pair palindrome by insertion of a GGCC fragment (Fig. 1B, lane 4) did not alter the GGAAG sequence and therefore did not affect splicing. Mutation of the palindrome away from the U1 consensus sequence inhibited splicing (Fig. 2, lane 4), perhaps because the GGAAGGA sequence was disrupted by insertion of 2 C residues. In contrast, the pre-mRNA containing the mutation of the palindrome toward the U1 consensus sequence was spliced more efficiently than pFPD RNA (Fig. 2, lane 5), presumably due to the fortuitous introduction of a 7-nucleotide GGAAG-like sequence in this mutant.
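The sequence properties invoked in this explanation can be illustrated with a short script. This is a minimal sketch and not part of the original study: the helper names `reverse_complement`, `is_palindromic`, and `has_core` are hypothetical, and the insert sequences are those quoted in the text. It tests only for an exact GGAAG match, so it does not capture the looser "GGAAG-like" similarity discussed above.

```python
# Illustrative sketch (not from the original paper): for each insert quoted in
# the text, report whether it is palindromic (equal to its own reverse
# complement) and whether it contains an exact copy of the GGAAG core.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence reads the same on both strands."""
    return seq.upper() == reverse_complement(seq)


def has_core(seq: str, core: str = "GGAAG") -> bool:
    """True if the sequence contains an exact copy of the core motif."""
    return core in seq.upper()


# Insert sequences as given in the text (DNA sense strand).
INSERTS = {
    "wild-type palindrome": "CTTCCGGAAG",
    "GGCC-disrupted palindrome": "CTTCCGGCCGGAAG",
    "mutated away from U1 consensus": "CCGCAAGCA",
    "mutated toward U1 consensus": "CAGGAAAGT",
}

if __name__ == "__main__":
    for name, seq in INSERTS.items():
        print(f"{name}: palindromic={is_palindromic(seq)}, "
              f"GGAAG core={has_core(seq)}")
```

In this sketch the GGCC insertion leaves both the palindromic symmetry and the GGAAG core intact, whereas neither U1 mutant retains an exact GGAAG.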
We cannot explain why multiple copies of the complete 10-base pair palindrome did not improve splicing over the stimulation observed with a single copy of this sequence (Fig. 1B), whereas repetition of the GGAAG portion of the palindrome improved splicing over a single GGAAG sequence (Table I). One possibility is that repetition of the entire palindrome also included the pyrimidine-rich portion, which may be inhibitory. Alternatively, two of the three copies of the palindrome may base-pair with one another to form a strong secondary structure, which prevents stimulation of splicing above that observed with a single copy.

Involvement of purine-rich exon sequences has been suggested in efficient and/or alternative splicing of other pre-mRNAs (22, 25). In these reports, it is argued that the purine-rich sequences are required for the recognition and/or selection of a weak 3'-splice site (22, 25) and do not compensate for a weak downstream 5'-splice site in exon definition (25). However, splicing of bGH intron D is unique in that the influence of the ESE appears to involve the upstream splice donor site. The results presented here suggest a mechanism of action for an ESE sequence in which a protein enhances spliceosome complex formation by binding to the ESE and interacting in some fashion with the 5'-splice site rather than exclusively

FIG. 4. FP element in bGH exon 5 compensates for weak splice donor site in intron D. A, S1 nuclease mapping analysis of bGH mRNA isolated from CHO cells transiently expressing the bGH/bPRL chimeras depicted. Species of mRNA corresponding to the fully spliced and intron D-containing mRNAs are indicated by the bands at 274/290 and 510/572/588 nucleotides, respectively. Lane 1, pSVB3Ba; lane 2, pFPD; lane 3, pPSA; lane 4, pPSA/FPD; lane 5, plPSD; lane 6, plPSD/FPD. B, S1 nuclease mapping analysis of CHO mRNA from cells containing mutation of the bGH intron D splice donor site (CGG/GTGGGG) toward the perfect consensus sequence (CAG/GTGAGT). Species of mRNA corresponding to fully spliced and intron D-containing mRNA are indicated by the bands at 274/290 and 558/588 nucleotides, respectively. Lane 1, psPSD; lane 2, psPSD/FPD; lane 3, pΔEB; lane 4, pΔEB/FPD; lane 5, pCSD; lane 6, pCSD/FPD. Lane numbers correspond to the schematic diagrams beneath the autoradiogram. nt, nucleotides.

with the 3'-splice site, as appears to be the case in other systems (22, 25).

Formation of a spliceosome complex is believed to commit an intron to the splicing pathway and thereby prevent transport of intron-containing mRNAs to the cytoplasm (1, 6-8). Attenuation of 5'- and/or 3'-splice sites may provide a mechanism whereby assembly of early spliceosome commitment complexes (13) is rendered inefficient, allowing transport of intron-containing RNAs to the cytoplasm. An enhancer sequence that increases the efficiency of complex formation could provide an effective means of modulating this process. The dependence of spliceosome complex formation in vitro (17) on the presence of the FP element containing an ESE sequence is consistent with this hypothesis.

Presumably, the ESE sequence functions through binding of a trans-acting factor. We previously demonstrated that a 35-kDa protein(s) specifically cross-links to the FP fragment, which contains an ESE and is required for efficient in vitro splicing of intron D (17). In the present study, we show that cross-linking of a protein doublet is dependent on the GGAA(G) repeat. Furthermore, the cross-linking of this doublet is greatly diminished in S100 fractions (data not shown), indicative of serine/arginine-rich splicing factors. This protein doublet appears to be larger than the 35-kDa protein that cross-links to the FP fragment (Fig. 3B). Two possible explanations may account for this increased size. The 35-kDa factor and the doublet protein(s) may be the same protein, but with altered gel mobilities due to varying lengths of RNA cross-linked to the protein following RNase digestion. If the UV-cross-linked material is treated with RNase A alone (data not shown), the apparent difference in mobility between the 35-kDa protein and the protein doublet is even greater (compared to digestion with RNases A and T1; Fig. 3B), suggesting that the RNase A-resistant, purine-rich sequence in E5/G7A6 RNA is causing the protein doublet to migrate more slowly. Alternatively, different proteins could bind to the FP element and the GGAA(G) repeat. The exact nature of these proteins awaits future characterization. In either case, these results are consistent with a model in which a protein factor(s) binds to the ESE sequence in exon 5 and compensates for the weak bGH 5'-splice site in spliceosome complex formation.

In conclusion, we demonstrate here that the alternative splicing (i.e. intron retention) of bGH intron D results from a balanced interplay between a weak 5'-splice donor site and a downstream exonic splicing enhancer sequence. A GGAA(G) motif, which represents the core of this ESE, can by itself substantially enhance intron D splicing. Multiple copies of this short motif are probably necessary to provide the full ESE effect. A similar balanced interplay between a weak splice site and an ESE may be generally employed to control other alternative pre-mRNA splicing events.

Acknowledgments. We thank Joseph A. Bokar, Edward C. Goodwin, and Ashraf El-Meanawy for critically reading the manuscript and Kathryn J. Dirksen for typing the manuscript.

REFERENCES

1. Green, M. R. (1991) Annu. Rev. Cell Biol. 7, 559-599
2. Maniatis, T. (1991) Science 251, 33-34
3. McKeown, M. (1990) Genet. Eng. 12, 139-181
4. Smith, C. W. J., Patton, J. G., and Nadal-Ginard, B. (1989) Annu. Rev. Genet. 23, 527-577
5. Sterner, D. A., and Berget, S. M. (1993) Mol. Cell. Biol. 13, 2677-2687
6. Chang, D. D., and Sharp, P. A. (1989) Cell 59, 789-795
7. Hamm, J., and Mattaj, I. W. (1990) Cell 63, 109-118
8. Legrain, P., and Rosbash, M. (1989) Cell 57, 573-583
9. Cooke, N. E., Ray, J., Emery, J. G., and Liebhaber, S. A. (1988) J. Biol. Chem. 263, 9001-9006
10. Crabtree, G. R., and Kant, J. A. (1982) Cell 31, 159-166
11. Hampson, R. K., and Rottman, F. M. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 2673-2677
12. Weil, D., Brosset, S., and Dautry, F. (1990) Mol. Cell. Biol. 10, 5865-5875
13. Michaud, S., and Reed, R. (1993) Genes & Dev. 7, 1008-1020
14. Liebhaber, S. A., Urbanek, M., Ray, J., Tuan, R. S., and Cooke, N. E. (1989) J. Clin. Invest. 83, 1985-1991
15. Hampson, R. K., LaFollette, L., and Rottman, F. M. (1989) Mol. Cell. Biol. 9, 1604-1610
16. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
17. Sun, Q., Hampson, R. K., and Rottman, F. M. (1993) J. Biol. Chem. 268, 15659-15666
18. Woychik, R. P., Lyons, R. H., Post, L., and Rottman, F. M. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 3944-3948
19. Jaeger, J. A., Turner, D. H., and Zuker, M. (1989) Proc. Natl. Acad. Sci. U. S. A. 86, 7706-7710
20. Jaeger, J. A., Turner, D. H., and Zuker, M. (1990) Methods Enzymol. 183, 281-306
21. Zuker, M. (1989) Science 244, 48-52
22. Watakabe, A., Tanaka, K., and Shimura, Y. (1993) Genes & Dev. 7, 407-418
23. Kakizuka, A., Ingi, T., Murai, T., and Nakanishi, S. (1990) J. Biol. Chem. 265, 10102-10108
24. Shapiro, M. B., and Senapathy, P. (1987) Nucleic Acids Res. 15, 7155-7174
25. Xu, R., Teng, J., and Cooper, T. A. (1993) Mol. Cell. Biol. 13, 3660-3674