THE JOURNAL OF BIOLOGICAL CHEMISTRY
© 1984 by The American Society of Biological Chemists, Inc.

Vol. 259, No. 19, Issue of October 10, pp. 11798-11803, 1984
Printed in U.S.A.

Isolation, Characterization, and DNA Sequence of the Rat Somatostatin Gene*

(Received for publication, April 12, 1984)

Marie A. Tavianini, Timothy E. Hayes‡, Marilyn D. Magazin, Carolyn D. Minth§, and Jack E. Dixon¶

From the Department of Biochemistry, Purdue University, West Lafayette, Indiana 47907

The gene encoding rat somatostatin has been isolated from a λ phage gene library. Phage harboring the gene were identified by plaque hybridization using a nick-translated fragment derived from the cDNA for rat somatostatin. The transcriptional unit includes exons of 238 and 367 base pairs (bp) separated by one intron of 621 bp. The intron is located between the codons for Gln (-57) and Glu (-56) of prosomatostatin. Analysis of the nucleotide sequence 5' to the start of transcription reveals a number of sequences which may be involved in the expression of somatostatin. A variant of the “TATA” box, TTTAAA, lies 26 bp upstream from the start of transcription, and a sequence homologous to the “CAAT” box (GGCTAAT) is 92 bp upstream from the transcription start. A long alternating purine-pyrimidine stretch, (GT)25, which is similar to Z DNA-forming sequences in other genes, lies 628 bp 5' to the transcription start and is flanked by small repeats. Hybridization analysis shows that this region is highly repeated in the genome and that homologous sequences are located approximately 2 kilobase pairs downstream from the poly(A) addition site. Southern hybridization of the λ clone with probes derived from brain or liver poly(A+) RNA demonstrates that another transcribed sequence lies about 7 kilobase pairs downstream from the poly(A) addition site of the rat somatostatin gene. Analysis of rat DNA suggests that there may be restriction-site polymorphisms in or near the gene or that additional somatostatin-hybridizing sequences may exist in the genome.

Somatostatin, a cyclic tetradecapeptide hormone, was first discovered as a growth hormone-release inhibitory activity from ovine hypothalamus (1). Since its discovery in the hypothalamus, it has been found in the digestive tract (2, 3), the thyroid (4), and other parts of the nervous system (5, 6). Through varied mechanisms, it can inhibit the secretion of a number of peptide hormones (7, 8). Somatostatin is also abundantly distributed throughout the brain (9) and has been identified in primary sensory neurons (10) and parasympathetic neurons (11), leading to the suggestion that it may function as a neurotransmitter (12, 13).

Studies of somatostatin at the nucleotide level have helped to clarify its biosynthesis. cDNAs for somatostatin-14 have been sequenced from anglerfish (14), catfish (15), rat (16), and human (17). These sequences reveal that somatostatin is processed from a preprohormone of about 115 residues, including a signal sequence and a long “connecting peptide.” Theoretical secondary-structure predictions suggest that although there are a considerable number of amino acid substitutions between the somatostatin-14 precursors, there is a high degree of structural relatedness among them (18). cDNAs for variant somatostatins have also been isolated, including a precursor to a second somatostatin-14 in the anglerfish (19) and a precursor to a 22-residue somatostatin in the catfish (20), suggesting that there is a somatostatin gene family. We present here the isolation and DNA sequence of the gene from the rat and the characterization of sequences flanking the gene. These studies pave the way for a better understanding of the regulation of somatostatin gene expression.

* This work was supported in part by Grant AM 18024 from the National Institutes of Health. This is Journal Paper 9854 from the Purdue University Agricultural Experiment Station. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ Predoctoral trainee supported by National Institutes of Health Grant GM 07211.
§ Supported by Postdoctoral Training Grant AM0734021.
¶ To whom reprint requests should be addressed.
¹ The abbreviations used are: bp, base pair; kb, kilobase pair; AMV, avian myeloblastosis virus.

EXPERIMENTAL PROCEDURES

Materials: Oligo(dT)-cellulose, oligo(dT), the synthetic 17-base primer, and the four dideoxynucleotide triphosphates for dideoxy sequencing were purchased from P-L Biochemicals. Nitrocellulose (BA 85) was obtained from Schleicher & Schuell. Avian myeloblastosis virus reverse transcriptase was purchased from Life Sciences, St. Petersburg, FL. The various DNA-modifying and restriction enzymes were used according to the manufacturers' specifications.

Hybridization Probes and Screening of the Rat DNA Library: The rat chromosomal DNA library provided by T. Sargent, R. B. Wallace, and J. Bonner (Phytogen, Pasadena, CA) was composed of a partial HaeIII digest of rat liver DNA cloned into bacteriophage λ Charon 4A. Screening of the library was carried out as described by Benton and Davis (21). The 428-bp¹ XbaI-Sau3AI fragment derived from the rat somatostatin cDNA plasmid pRT B1-63 (16) was used as a hybridization probe after nick translation with [α-³²P]dCTP and DNA polymerase I. Positive plaques were purified by four further cycles of plating at lower densities. Phage DNA was isolated as described by Maniatis et al. (22).

Analysis of Clones: Restriction digests of recombinant phage DNA were analyzed by blot hybridization of fragments after electrophoresis on 0.7 or 1.5% agarose gels and transfer to nitrocellulose (23). Hybridizations were performed at 65 °C as previously described (16) using 1 × 10⁶ cpm of the XbaI-Sau3AI fragment of rat somatostatin cDNA as the hybridization probe.

Construction of Recombinant Plasmids: A 4.5-kb SalI fragment of the rat somatostatin λ clone was subcloned into the unique SalI site of the vector pBR322. DNA from one colony (pRSλ3-35) which hybridized to the XbaI-Sau3AI fragment of rat somatostatin cDNA was selected for further analysis. In addition, a 1.1-kb HindIII-EcoRI fragment of the λ clone overlapping the 5' end of the 4.5-kb SalI fragment was identified by hybridization to the nick-translated 180-bp SalI-BglII fragment at the 5' end of the SalI fragment and cloned






into pUC8. DNA was prepared from the various colonies by an alkaline lysis method (24) and analyzed by restriction mapping. One colony (pRSλHE5) was selected for further analysis.

DNA-sequence Determinations: DNA-sequence analysis was carried out by either the chemical-degradation method (25) or the dideoxy chain-termination method (26, 27). Fragments for chemical sequencing were labeled either with polynucleotide kinase and [γ-³²P]ATP or the appropriate [α-³²P]dNTP and AMV reverse transcriptase. Fragments sequenced by the chain-termination method were cloned into the SmaI site of bacteriophage M13 mp11 and analyzed using a synthetic 17-base primer. DNA sequence data were analyzed using the modified computer programs of Sege et al. (28).

Primer-extension Analysis: Primer-extension analysis of the 5' end of somatostatin mRNA was performed according to Hernandez and Keller (29). The primer was prepared from the 125-bp BglII-XbaI fragment of pRSλ3-35. This fragment was labeled at its 5' ends with [γ-³²P]ATP and polynucleotide kinase, and subsequently cleaved with HinfI to give a 30-bp HinfI-XbaI primer fragment. Total RNA was isolated according to the guanidinium thiocyanate procedure of Chirgwin et al. (30). 6 × 10[?] cpm of primer were mixed with 40 µg of rat medullary thyroid carcinoma RNA, hybridized, and transcribed with AMV reverse transcriptase as described (29). The coding strand of the BglII-XbaI fragment was sequenced according to Maxam and Gilbert (25) and used as size standards. Aliquots of the sequencing and primer-extension experiments were analyzed by electrophoresis on 12% polyacrylamide, 7 M urea gels according to Maxam and Gilbert (25).

Southern Blotting of Rat Genomic DNA and Analysis for Repeated Sequences: Rat genomic DNA was isolated from different tissues by the method of Blin and Stafford (31). DNA was digested overnight with various restriction enzymes (10 units/µg). Electrophoretic analysis of 5 µg of digested DNA on 1.5% agarose gels followed by transfer to nitrocellulose was carried out as described (15, 23). The hybridization probe used was 1-2 × 10⁷ cpm of a nick-translated mixture of the 1.23-kb SalI-AvaI and 1.65-kb XbaI fragments of pRSλ3-35. As a comparison, phage DNA was treated in the same manner. For analysis of repeated sequences, restriction digests of either plasmid or phage DNA were subjected to electrophoretic analysis, transferred to nitrocellulose, hybridized, and washed as described under "Analysis of Clones," using 1 × 10[?] cpm of nick-translated rat liver DNA as a hybridization probe.

RNA Isolation and Hybridization of Additional Transcribed Sequences in the Rat Somatostatin λ Clone: Total RNA was isolated from rat brain or liver by the guanidinium thiocyanate procedure of Chirgwin et al. (30). RNA was enriched for poly(A)-containing sequences by two passages over oligo(dT)-cellulose (32). One µg of poly(A+) RNA was used as a template for oligo(dT)-primed [³²P]cDNA synthesis by AMV reverse transcriptase (15). Southern hybridization of digested phage DNA was performed according to Thomas (33), using 5 × 10⁶ cpm single-stranded [³²P]cDNA as the hybridization probe.

RESULTS

FIG. 1. Map of the rat somatostatin gene. Restriction sites in the λb clone were determined by a combination of Southern hybridization of digested DNA and partial digestion of end-labeled fragments. The 5'-to-3' orientation of the somatostatin gene within the clone is indicated. The two exons of the gene are indicated by solid boxes. The scale of the map runs from 0 to 14 kilobase pairs.
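The segment sizes reported for the sequenced region (two exons, one intron, and the 5' and 3' flanks) can be laid out as coordinates and checked against each other. The sketch below is only an illustration and not part of the original analysis: it assumes that +1 is the first transcribed nucleotide, takes the sizes, the 100-nucleotide leader before the AUG, and the residue numbers (-102 for the first residue of the signal peptide, -57 for the last codon before the intron) from the text, and derives the rest. All function names are hypothetical.

```python
# Segment sizes in bp, as reported in the text.
SEGMENTS = [
    ("5' flank", 748),
    ("exon 1", 238),
    ("intron", 621),
    ("exon 2", 367),
    ("3' flank", 47),
]


def total_sequenced(segments=SEGMENTS):
    """Sum of all segment sizes; the text reports 2021 bp."""
    return sum(size for _, size in segments)


def transcript_coordinates(segments=SEGMENTS):
    """Derived (name, first, last) positions of the transcribed segments.

    Assumes +1 is the first transcribed nucleotide; flanks are skipped.
    """
    coords = []
    position = 1
    for name, size in segments:
        if "flank" in name:
            continue
        coords.append((name, position, position + size - 1))
        position += size
    return coords


def exon1_length_from_codons(leader_nt=100, first_residue=-102, last_residue=-57):
    """Exon 1 length implied by the leader plus the codons that precede the intron.

    Both residue numbers are negative, so the missing residue 0 does not matter here.
    """
    codons = last_residue - first_residue + 1
    return leader_nt + 3 * codons


if __name__ == "__main__":
    print(total_sequenced())            # 2021
    print(transcript_coordinates())     # exon 1: 1-238, intron: 239-859, exon 2: 860-1226
    print(exon1_length_from_codons())   # 238
```

Under these assumptions the parts add up to the 2021 bp reported, and a 100-nucleotide leader plus 46 codons gives exactly the 238 bp stated for the first exon, which is consistent with an intron lying between the codons for Gln (-57) and Glu (-56).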

Isolation and Characterization of the Rat Somatostatin Gene: A 428-bp XbaI-Sau3AI fragment of the rat somatostatin cDNA (16), which contains the sequence coding for rat preprosomatostatin (including 47 bp 5' to the initiator methionine and 30 bp 3' to the translation-termination codon), was nick-translated and used to screen the rat genomic DNA library constructed by Sargent et al. (34). The library was prepared from a partial HaeIII digest of Sprague-Dawley rat DNA cloned into λ Charon 4A. 2 × 10⁶ plaques, representing six genomes, were screened with the ³²P-labeled probe, and 48 positive plaques were identified. Six of these plaques were selected for further purification and phage DNA isolation. Each phage was analyzed by restriction digestion with EcoRI and by Southern hybridization of the EcoRI digests with the nick-translated probe. Five of the plaque isolates contained 15-kb inserts which hybridized with the probe and had identical EcoRI restriction patterns; the sixth clone was a truncated form of the other five. All isolates appeared to contain the same gene, and one clone (λb) was analyzed in further detail. It was determined that the rat somatostatin gene was contained within a single 4.5-kb SalI fragment (Fig. 1). To facilitate sequence analysis of the gene, this fragment was cloned into the SalI site of pBR322, giving rise to the plasmid pRSλ3-35. Additional mapping showed that a 1.1-kb HindIII-EcoRI fragment, which overlapped the 5' end of the SalI fragment, harbored the first exon of the gene and 750 bp of sequence 5' to the transcriptional start site, including 500 bp which were not present in the SalI fragment. This HindIII-EcoRI fragment was cloned into pUC8, giving rise to the plasmid pRSλHE5.

Primary Structure of the Rat Somatostatin Gene: We have sequenced 2021 bp of DNA from the rat somatostatin gene containing the two exons (238 and 367 bp), the intron of 621 bp, 748 bp 5' to the start of transcription, and 47 bp 3' to the site of poly(A) addition (Fig. 2). The rat somatostatin mRNA sequence determined by this laboratory (16) and by Goodman et al. (35) differs in only two nucleotides from the genomic sequence. The amino acid sequence is not changed by these differences. The start of transcription as determined by primer-extension experiments occurs 100 nucleotides 5' to the AUG translation-initiation codon of rat preprosomatostatin (Fig. 3). When the 1.0-1.5 nucleotide adjustment is made to correct the difference in migration between the products of reverse transcriptase-mediated cDNA synthesis and Maxam-Gilbert chemical cleavage (36), the transcription-initiation site is the second adenine residue located within the sequence ATAGC (nucleotide +1; see Fig. 2). The fact that transcription initiates at an adenine is consistent with the finding described by Breathnach and Chambon (37) that the majority of RNA polymerase II-dependent transcription initiates at an adenine. There is an additional band of lower intensity which is one nucleotide larger than the major reverse transcript. This may arise as a result of the structural features of the cap site at the 5' terminus of the mRNA.

The intron divides the gene between the codons for Gln (-57) and Glu (-56) of the prohormone. The sequence of the donor and acceptor junctions of the intron closely resembles the consensus sequence described by Chambon (37):

    5' (C/A)AG|GU . . . . . AG|G 3'     consensus sequence
            Gln               Glu
           CAG|GTA . . . . XAG|GAA      somatostatin gene

The consensus sequence AATAAA is found approximately 17 bp upstream from the poly(A) addition site at the 3' end of the gene. The sequence TTTAAA, which is a variant of the Goldberg-Hogness or "TATA" box, is found between -26 and -31 bp from the transcriptional start. The sequence GGCTAAT is found between -92 and -98 bp from the putative transcriptional start site; this sequence is homologous to the "CAAT" box, which has been shown to be involved

[Figure 2: nucleotide sequence of the gene with the encoded amino acid sequence shown above the exons; the sequence itself is not recoverable from the extracted text.]

FIG. 2. DNA sequence of the rat somatostatin gene. Nucleotides found in mature messenger RNA are capitalized; nucleotides in flanking and intervening sequences are lower case. Nucleotides are numbered in italics with the start of transcription designated as +1. Amino acids are numbered with the first amino acid of somatostatin-14 designated as +1. The TATA and CAAT boxes and the poly(A) addition site are underlined by solid lines. Alternating purine-pyrimidine sequences (potential Z DNA sequences) are underlined by dotted lines. Two changes in sequence from somatostatin cDNA are boxed.
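Features of the kind marked in Fig. 2 (short boxes such as TTTAAA and tracts of alternating purines and pyrimidines) can be located in a sequence string with a simple scan. The following is a minimal sketch, not the authors' procedure (the text states that the programs of Sege et al. were used): the function names are hypothetical, the demonstration sequence is synthetic rather than taken from the gene, and the scan is strict, whereas the Discussion tolerates one base out of alternation in the shorter blocks.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def _alternates(a, b):
    """True if one base is a purine and the other a pyrimidine."""
    return (a in PURINES and b in PYRIMIDINES) or (a in PYRIMIDINES and b in PURINES)


def alternating_runs(seq, min_len=8):
    """Maximal runs of strictly alternating purine/pyrimidine bases.

    Returns (start, end, subsequence) tuples with 0-based, end-exclusive indices.
    """
    seq = seq.upper()
    runs = []
    start = 0
    for i in range(1, len(seq) + 1):
        # Close the current run at the end of the sequence or where alternation breaks.
        if i == len(seq) or not _alternates(seq[i - 1], seq[i]):
            if i - start >= min_len:
                runs.append((start, i, seq[start:i]))
            start = i
    return runs


def find_motif(seq, motif):
    """All 0-based start positions of an exact motif, overlapping hits included."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


if __name__ == "__main__":
    # Synthetic example: a TATA-box variant followed by a (GT)25 tract.
    demo = "TTTAAA" + "GG" + "GT" * 25 + "TT"
    print(find_motif(demo, "TTTAAA"))                         # [0]
    print([(s, e) for s, e, _ in alternating_runs(demo)])     # [(8, 58)]
```

Converting the 0-based indices to the paper's numbering would require the offset of the transcription start within the sequence string, which is left out here because the sequence is not reproduced.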

in regulating the level of transcription of many genes (37).

The size and number of somatostatin-hybridizing sequences within the rat genome were analyzed by Southern hybridization of genomic DNA (Fig. 4, left). Comparison of the restriction digests of genomic DNA with those of the λb recombinant (Fig. 4, right) indicates that all hybridizing fragments are identical in all of the restriction-enzyme digests except EcoRI and XbaI. In these cases, there is an additional hybridizing fragment in genomic DNA (a 1.5-kb EcoRI fragment and a 0.3-kb XbaI fragment) which is not present in the λb recombinant. These additional restriction fragments are present in DNA obtained from liver, kidney, pancreas, and medullary thyroid carcinoma tissues.

Repetitive Sequences Flanking the Rat Somatostatin Gene: The location of highly repetitive sequences within a clone may be found by using total genomic DNA as a nick-translated probe (38). Three discrete regions of the λb clone hybridize with this probe: a 420-bp HindIII-RsaI fragment at the 5' end of the gene, a 1-kb BglII-SalI fragment 2 kb downstream from the poly(A) addition site (Figs. 5 and 6), and a 5.5-kb SalI-EcoRI fragment downstream from the gene (data not shown). The HindIII-RsaI fragment contains the oligomer (GT)25 and the sequences (CCT)aC(CCT)2 and (CTGT)&TAT(CTGT), flanking, respectively, the 5' and 3' ends of the (GT)25 oligomer (Fig. 2). Because of recent speculation that alternating purine-pyrimidine sequences can form Z DNA and can play a role in gene regulation, we have examined the rat somatostatin gene for other GT-rich sequences by Southern hybridization

FIG. 3. Primer-extension analysis of rat somatostatin mRNA. Primer-extension and sequencing reactions were performed as described under "Experimental Procedures." The sequence of the noncoding (RNA) strand is shown next to the sequencing ladder of the complementary strand (from which the primer was derived), with the major start of transcription indicated by the arrow. Lanes: G, G+A, C, C+T, and RNA.

FIG. 4. Southern hybridization of total genomic DNA and λb DNA. Left, Southern blot of genomic DNA. Five µg of Wistar rat medullary thyroid carcinoma DNA were digested with the appropriate enzyme, electrophoresed on a 1.5% agarose gel, transferred to nitrocellulose, and hybridized to a nick-translated probe derived from the somatostatin gene (see "Experimental Procedures"). Right, Southern blot of λb DNA. One µg of λb DNA was digested with the appropriate enzyme, electrophoresed, and hybridized under the same conditions as the genomic DNA. Enzymes used in both cases were: A, BamHI; B, EcoRI; C, HindIII; D, PstI; E, SstI; F, XbaI. The HindIII digest of λ DNA was used as a standard.

FIG. 5. Schematic comparison and structural features. Selected restriction sites are indicated on the composite map to facilitate alignment of the cDNA and gene. The extent of the two clones is indicated above the map. Locations of introns, exons, and pertinent structural features are as shown.
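The structural features of the precursor are described with a residue numbering in which the first amino acid of somatostatin-14 is +1 and the residue before it is -1, so there is no residue 0. The sketch below illustrates, under that assumption, how the domain boundaries quoted in the Discussion translate into lengths; the helper name and the dictionary are hypothetical, and only the boundaries themselves come from the text.

```python
def span_length(first, last):
    """Number of residues from first to last inclusive when no residue 0 exists."""
    if first == 0 or last == 0 or first > last:
        raise ValueError("residue numbers must be non-zero with first <= last")
    length = last - first + 1
    if first < 0 < last:
        length -= 1  # the numbering jumps from -1 straight to +1
    return length


# Domain boundaries as given in the Discussion.
DOMAINS = {
    "signal peptide": (-102, -78),
    "connecting peptide": (-77, -15),
    "somatostatin-28": (-14, 14),
    "somatostatin-14": (1, 14),
}

if __name__ == "__main__":
    for name, (first, last) in DOMAINS.items():
        print(name, span_length(first, last))
    # signal peptide 25, connecting peptide 63, somatostatin-28 28, somatostatin-14 14
    print("preprohormone", span_length(-102, 14))  # 116
```

With this convention the 28-residue form comes out at 28 residues, the whole precursor at 116 (in line with "about 115 residues"), and the intron position between residues -57 and -56 falls inside the connecting peptide (-77 to -15).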

FIG. 6. Analysis of repeated sequences in the rat somatostatin gene. A, 5% acrylamide gel of pRSλHE5 restriction fragments; B, 5% acrylamide gel of pRSλ3-35 restriction fragments. The restriction fragments depicted in the left portions of A and B were transferred to nitrocellulose and hybridized to nick-translated rat genomic DNA as shown in the right portions of A and B. The sizes of a HinfI digest of pBR322 are indicated. Restriction enzymes used are as follows: H-R, HindIII-RsaI; S-R, SalI-RsaI; S-B, SalI-BglII.

with the smallest restriction fragment which contains the (GT)n oligomer (a 125-bp BanI fragment within the HindIII-RsaI fragment). The 1-kb BglII-SalI fragment, which was shown above to contain a repetitive element, hybridizes to the BanI probe (data not shown). Thus, sequences homologous to those found within the BanI fragment are located both 5' and 3' to the somatostatin gene. We have shown by S1 nuclease treatment and by topological analysis that the (GT)25 sequence in the plasmid pRSλHE5 forms Z DNA under approximately physiological conditions of ionic strength and superhelical density.² These data suggest that sequences with the potential of forming Z DNA structures closely flank the rat somatostatin gene.

The converse experiment for identifying repetitive sequences within a piece of genomic DNA involves hybridizing a labeled restriction fragment to genomic DNA digested with a restriction enzyme. When the 5.5-kb SalI-EcoRI fragment of the λb recombinant (which is downstream from the poly(A) addition site of the gene) is hybridized to genomic DNA digested with EcoRI, the probe hybridizes to many different size classes of DNA (data not shown). This indicates that a sequence dispersed through the genome is hybridizing to this restriction fragment. However, the BanI probe does not hybridize to this fragment, which implies that this region does not contain the same repetitive sequence which more closely flanks the gene (data not shown).

Additional Transcribed Sequences in the Rat Somatostatin λ Clone: Sutcliffe et al. (39) have described specific identifier sequences which are present in RNAs in the brain. Because somatostatin is synthesized in several regions of the brain, we have examined the λb clone for the presence of additional transcribed sequences in proximity to the rat somatostatin gene. A single-stranded cDNA probe was prepared from poly(A+) RNA obtained from both brain and liver and hybridized to digests of the λ clone; in both cases, only the 5.5-kb SalI-EcoRI fragment gave a positive result (Fig. 7). This fragment is not part of the processed somatostatin transcript and may represent an additional transcription unit downstream from the poly(A) addition site of the rat somatostatin gene. It may be recalled that this fragment also contains a sequence which is highly repeated in the rat genome. In order to determine if this sequence is similar to the "identifier sequences" reported by Sutcliffe, a PstI insert from the plasmid p2A120 (kindly provided by J. G. Sutcliffe, Scripps Clinic, La Jolla, CA) was nick-translated and hybridized to the rat somatostatin λ clone. Under the conditions employed (see "Experimental Procedures"), no hybridization was observed. This suggests that the transcribed element described above is not identical to that reported by Sutcliffe et al. (39) and that other regions of the rat somatostatin λ clone do not contain identifier sequences as described by Sutcliffe.

² T. Hayes and J. Dixon, manuscript in preparation.

DISCUSSION

These studies describe the isolation, characterization, and sequence of the rat somatostatin gene. A comparison of the rat cDNA and corresponding DNA sequences within the gene reveals two nucleotide differences. These differences are both located in the second exon, and neither leads to a difference in the resulting amino acid sequence from that reported in the cDNA (Fig. 2). These changes may have arisen as cloning


artifacts or may be due to intraspecies variations. There is a single intron dividing the gene between the codons for Gln (-57) and Glu (-56) of the prohormone (Fig. 2). Some genes appear to be divided by their introns into structural domains (40); this does not appear to be the case with somatostatin. The parts of the preprohormone with known functions are the signal peptide (residues -102 to -78), the 28-residue somatostatin precursor (residues -14 to +14), and the 14-residue hormone (residues +1 to +14). The connecting peptide (residues -77 to -15), in which the intron falls, has no known function, and predictions of protein secondary structure (18) do not suggest that it forms domains. The intron falls in exactly the same place in the human somatostatin gene.³

FIG. 7. Additional transcribed regions in the λb clone. Top: A, 1.5% agarose gel of λb restriction fragments; B, Southern blot of restriction fragments. The agarose gel in A was transferred to nitrocellulose and hybridized to a reverse-transcribed cDNA probe derived from poly(A+) RNA from rat brain. The HindIII digest of λ DNA was used as a standard. Restriction enzymes are as follows: S, SalI; S-E, SalI-EcoRI. Bottom: restriction map of λb, indicating the positions of the rat somatostatin gene and the additional transcribed sequence.

Although our efforts have only identified one somatostatin gene in the Sprague-Dawley λ library, hybridization of genomic DNA gave evidence of somatostatin-hybridizing fragments which are inconsistent with the λb recombinant (Fig. 4). These additional fragments are found in genomic digests of DNA from both Wistar and Sprague-Dawley strains. These may be due to allelic restriction-site heterogeneities similar to those seen by Shen and Rutter³ in the human somatostatin gene. The presence in the rat of additional members of the somatostatin gene family could also explain the above differences. Other species have shown evidence of multiple somatostatin genes. Anglerfish islet cells contain two distinct messenger RNAs for somatostatin, one which encodes a somatostatin-14 identical to that found in mammals and another which encodes a related but different somatostatin-14 (14, 19). Catfish islet cells also contain two distinct mRNAs, one for somatostatin-14 and one for a related 22-residue somatostatin (15, 20). These data suggest that there is a family of somatostatin genes, although only one member of the family has been identified so far in mammals.

³ L. P. Shen and W. J. Rutter, personal communication.

We have identified several sequences in the upstream flanking region that may be involved in the expression of the somatostatin gene. These include sequences homologous to the TATA box and to the CAAT box, which are thought to regulate initiation of transcription (37). Additionally, there are other sequences in this region which may be involved in transcription regulation. Cochet et al. (41) have pointed out homologous sequences of about 20 bp which are upstream from several mammalian genes which respond to glucocorticoids in vivo and in vitro. Similarly, it has been noted that a number of steroid-induced chicken egg-white proteins share a common 9-nucleotide sequence upstream from the start of transcription (42); the progesterone receptor binds selectively to this region of the ovalbumin gene in vitro (43). We have found three sequences upstream from the start of transcription of the rat somatostatin gene which show considerable homology to these proposed regulatory sites: two 13-bp sequences located between -484 to -472 and -39 to -27 bp have, respectively, 9 and 10 bp identical to the 13 bp at the 3' end of the putative glucocorticoid-regulatory region of the human pro-opiomelanocortin gene, and a 9-bp sequence between -260 and -268 bp has 7 bp identical to the 9-bp steroid-regulatory sequences of chicken egg-white protein genes (see Fig. 2). Very little is known about the regulation of somatostatin transcription, including the possible role of hormonal control, so the significance of these sequences awaits experimental clarification.

Another intriguing feature of the 5'-flanking region is the sequence (GT)25 which is found between -677 and -628 bp upstream from the start of transcription and is flanked by additional small repeats. Others have shown that a (GT)n oligomer has the potential of forming a Z DNA conformation under physiological conditions and that a (GT)n probe hybridizes to sequences dispersed through many eukaryotic genomes (44, 45). We have verified by nuclease sensitivity patterns and two-dimensional electrophoresis that the (GT)25 sequence upstream from the rat somatostatin gene can form Z DNA under approximately physiological conditions of ionic strength and superhelical density.² Experiments with synthetic oligonucleotides suggest that Z DNA does not form a typical nucleosomal structure (46). Beyond this, the reversed winding of a 50-bp sequence could have a significant effect on local superhelical density, since the B-to-Z transition relaxes two supercoils for each 12-bp turn of Z DNA. It is therefore probable that a sequence of such large size could be important as a chromatin structural element as well as a potential binding site for Z DNA-binding proteins. The large number of copies of this sequence in the genome also suggests that its significance may be structural rather than as a regulatory sequence for specific genes. Regulatory functions such as transcriptional enhancement have been suggested, not for the large (GT)n Z DNA-forming sequences, but for smaller Z DNA sequences which occur in pairs separated by about 50-80 bp in a number of enhancer and transcriptional control sequences from DNA and RNA viruses (47). Analysis of the somatostatin gene sequence for other alternating purine-pyrimidine sequences reveals a 150-bp segment from -363 to -213 bp which contains five potential Z DNA-forming sequences (see Fig. 2). These sequences are the same length as the viral sequences and have no more than one base pair out of the alternating purine-pyrimidine pattern, also like the viral sequences. The first two blocks of sequence are separated from the last three blocks by 44 base pairs, which suggests by analogy that this region may have regulatory significance, possibly mediated by sequence-specific Z DNA-binding proteins (47).

Besides the highly repeated Z DNA-forming (GT)25 sequence at the 5' end, the rat somatostatin gene also has a highly repetitive region at its 3' end which by hybridization analysis is homologous to the (GT)25-containing fragment at the 5' end. Together, these may define a transcriptional unit in the genome. These features are shown in Fig. 5.

As well as being closely flanked by highly repetitive sequences, the rat somatostatin gene appears to be flanked by another transcribed sequence. Single-stranded cDNAs derived from rat brain or liver poly(A+) RNA hybridize to the 5.5-kb SalI-EcoRI fragment 3' to the rat somatostatin gene. Because probes derived from both brain and liver hybridize to this fragment, it is improbable that this region represents a tissue-specific transcript (39). Besides hybridizing to the cDNA probes, this fragment also hybridizes to nick-translated genomic DNA. It is possible that the transcribed sequence and the repeated sequence are the same. This 5.5-kb SalI-EcoRI fragment does not appear to represent an Alu-type sequence since it does not hybridize with the nick-translated insert of a plasmid (BLUR8) containing human Alu-type sequences (48) (see "Experimental Procedures" for details of hybridization). The exact nature of the repeated sequence and the transcribed sequence within the λ recombinant and their relationship to the rat somatostatin gene are being investigated.
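The argument about local superhelical density rests on two numbers from the text: the (GT)25 tract spans -677 to -628, and the B-to-Z transition is said to relax two supercoils per 12-bp turn of Z DNA. A rough back-of-the-envelope sketch follows; it takes the stated ratio at face value and assumes, for illustration only, that the whole tract flips. The function names are hypothetical.

```python
def tract_length(first, last):
    """Inclusive length in bp of a tract between two upstream coordinates."""
    return abs(last - first) + 1


def supercoils_relaxed(tract_bp, bp_per_z_turn=12.0, supercoils_per_turn=2.0):
    """Supercoils relaxed if the whole tract flips, using the ratio quoted in the text."""
    return supercoils_per_turn * tract_bp / bp_per_z_turn


if __name__ == "__main__":
    length = tract_length(-677, -628)
    print(length)                                 # 50
    print(length // 2)                            # 25 GT repeats
    print(round(supercoils_relaxed(length), 2))   # 8.33
```

On these figures a complete transition of the 50-bp tract would absorb roughly eight supercoils, which gives a sense of scale for why the tract could matter for local superhelical density.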

REFERENCES 1. Brazeau, P.,Vale, W., Burgus, R., Ling, N., Butcher, M., Rivier, J., and Guillemin, R. (1973) Science (Wash. D.C.) 1 7 9 , 77-79 2. Luft, R., Efendic, S., Hokfelt, T., Johansson, O., and Arimura, A. (1974) Med. Biol. (Helsinki) 5 2 , 428-430 3. Polak, J. M., Grimelius, L., Pearse, A. G. E., Bloom, S. R., and Arimura, A. (1975) Lancet I, 1220-1222 4. Parsons, J. A., Erlandsen, S. L., Hegre, 0. D., McEvoy, R. C., and Elde, R. P. (1976) J. Histochem. Cytochem. 24,872-882 5. Brownstein, M., Arimura, A., Sato, H., Schally, A. V., and Kizer, J. S. (1975) Endocrinology 9 6 , 1456-1461 6. Hokfelt, T., Johansson, O., Ljungdahl, A., Lundberg, J. M., and Schultzberg, M. (1980) Nature ( L o n d . ) 284,515-521 7. Reichlin, S. (1983) N . Engl. J. Med. 3 0 9 , 1495-1501 8. Vale, W., Rivier, C., and Brown, M. (1977) Annu. Reu. Physiol. 39,473-527 9. Patel, Y. C., and Reichlin, S. (1978) Endocrinology 102,523-530 10. Hokfelt, T., Elde, R., Johansson, O., Luft, R., Nilsson, G., and Arimura, A. (1976) Neuroscience 1, 131-136 11. Lundberg, J. M., Hokfelt, T., Nilsson, G., Terenius, L., Rehfeld, J., Elde, R., and Said, S. (1978) Acta Physiol. S c a d . 104,499501 12. Rorstad, 0.P., Epelbaum, J., Brazeau, P., and Martin, J. B. (1979) Endocrinology 1 0 5 , 1083-1092 13. Dodd, J., and Kelly, J. S. (1978) Nature (Lond.) 273,674-675


14. Hobart, P., Crawford, R., Shen, L., Pictet, R., and Rutter, W. J. (1980) Nature (Lond.) 288, 137-141
15. Minth, C. D., Taylor, W. L., Magazin, M., Tavianini, M. A., Collier, K., Weith, H. L., and Dixon, J. E. (1982) J. Biol. Chem. 257, 10372-10377
16. Funckes, C. L., Minth, C. D., Deschenes, R., Magazin, M., Tavianini, M. A., Sheets, M., Collier, K., Weith, H. L., Aron, D. C., Roos, B. A., and Dixon, J. E. (1983) J. Biol. Chem. 258, 8781-8787
17. Shen, L.-P., Pictet, R. L., and Rutter, W. J. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 4575-4579
18. Argos, P., Taylor, W. L., Minth, C. D., and Dixon, J. E. (1983) J. Biol. Chem. 258, 8788-8793
19. Goodman, R. H., Jacobs, J. W., Chin, W. W., Lund, P. K., Dee, P. C., and Habener, J. F. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 5869-5873
20. Magazin, M., Minth, C. D., Funckes, C. L., Deschenes, R., Tavianini, M. A., and Dixon, J. E. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 5152-5156
21. Benton, W. D., and Davis, R. (1977) Science (Wash. D.C.) 196, 180-182
22. Maniatis, T., Hardison, R. C., Lacy, E., Lauer, J., O’Connell, C., Quon, D., Sim, G. K., and Efstratiatis, A. (1978) Cell 16, 687-701
23. Bitner, M., Kupferer, P., and Moris, C. F. (1980) Anal. Biochem. 102, 459-471
24. Birnboim, H. C., and Doly, J. (1979) Nucleic Acids Res. 7, 1513-1523
25. Maxam, A., and Gilbert, W. (1980) Methods Enzymol. 66, 499-560
26. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
27. Messing, J. (1983) Methods Enzymol. 101, 20-78
28. Sege, R. D., Soll, D., Ruddle, F. H., and Queen, C. (1981) Nucleic Acids Res. 9, 437-444
29. Hernandez, N., and Keller, W. (1983) Cell 35, 89-99
30. Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J., and Rutter, W. J. (1979) Biochemistry 18, 5294-5299
31. Blin, N., and Stafford, D. W. (1976) Nucleic Acids Res. 3, 2303-2308
32. Aviv, H., and Leder, P. (1972) Proc. Natl. Acad. Sci. U. S. A. 69.

35. Goodman, R. H., Jacobs, J. W., Dee, P. C., and Habener, J. F. (1982) J. Biol. Chem. 257, 1156-1159
36. Sollner-Webb, B., and Reeder, R. H. (1979) Cell 18, 485-499
37. Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
38. Bell, G. I., Pictet, R., and Rutter, W. J. (1980) Nucleic Acids Res. 8, 4091-4109
39. Sutcliffe, J. G., Milner, R. J., Gottesfeld, J. M., and Lerner, R. A. (1984) Nature (Lond.) 308, 237-241
40. Gilbert, W. (1978) Nature (Lond.) 271, 501
41. Cochet, M., Chang, A. C. Y., and Cohen, S. N. (1982) Nature (Lond.) 297, 335-339
42. Grez, M., Land, H., Giesecke, K., Schutz, G., Jung, A., and Sippel, A. E. (1981) Cell 25, 743-752
43. Compton, J. G., Schrader, W. T., and O’Malley, B. W. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 16-20
44. Hamada, H., Petrino, M. G., and Kakunga, T. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 6465-6469
45. Nordheim, A., and Rich, A. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 1821-1825
46. Nichol, J., Behe, M., and Felsenfeld, G. (1982) Proc. Natl. Acad. Sci. U. S. A. 79, 1771-1775
47. Nordheim, A., and Rich, A. (1983) Nature (Lond.) 303, 674-679
48. Jelinek, W. R., Toomey, T. P., Leinwand, L., Duncan, C. H., Biro, P. A., Choudary, P. V., Weissman, S. M., Rubin, C. M., Houck, C. M., Deininger, P. L., and Schmidt, C. W. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 1398-1402