THE JOURNAL OF BIOLOGICAL CHEMISTRY

Vol. 264, No. 10, Issue of April 5, pp. 5791-5798, 1989 Printed in U.S.A.

© 1989 by The American Society for Biochemistry and Molecular Biology, Inc.

Isolation and Characterization of the Human Chromosomal Gene for Polypeptide Chain Elongation Factor-1α*

(Received for publication, September 20, 1988)

Taichi Uetsuki‡§, Ayako Naito‡¶, Shigekazu Nagata‡‖**, and Yoshito Kaziro‡

From the ‡Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108, Japan and the ‖Osaka Bioscience Institute, 6-2-4 Furuedai, Suita-shi, Osaka 565, Japan

The cDNA for human elongation factor-1α was isolated from a cDNA library of human fibroblast cells. Using the cDNA as a probe, a number of chromosomal genes encoding the human elongation factor-1α were isolated. Characterization of the clones by restriction enzyme mapping and nucleotide sequence analysis has revealed that only one of them is an active gene, whereas all of the other genes are processed pseudogenes. The active gene consists of 8 exons and 7 introns spanning about 3.5 kilobases, and the sequence of its exons is completely identical to that of the human elongation factor-1α cDNA. The first non-coding exon of 33 base pairs is separated by a 943-base pair intron from the coding exons. The primer extension of human elongation factor-1α mRNA has indicated that the transcription of human elongation factor-1α gene starts from a C residue, and a “TATA” box was found 24 base pairs upstream of the initiation site. Three and five Sp1 binding sites are present on the 5'-flanking region and the 1st intron, respectively. Furthermore, one Ap-1 binding site is located in the 1st intron. By using nuclear extracts from HeLa cells, the promoter of human elongation factor-1α gene could stimulate in vitro transcription better than the adenovirus major late promoter.

Prokaryotic polypeptide chain elongation factor Tu (EF-Tu)¹ and its eukaryotic counterpart EF-1α promote the GTP-dependent binding of an aminoacyl-tRNA to ribosomes (1). EF-Tu was purified from Escherichia coli (E. coli) as a protein of Mr 43,000 (2), and the primary structure comprised of 393 amino acid residues was determined (3). Moreover, the tertiary structure has been analyzed by x-ray diffraction (4, 5).

In eukaryotic cells, two independent translational machineries are present, i.e. one in the cytosolic fraction and the other in the mitochondria. As for the elongation factors, EF-1α functions in the cytosolic fraction with 80 S ribosomes, while mitochondrial EF-Tu (mtEF-Tu) functions in mitochondria with 70 S ribosomes. Cytosolic EF-1α was purified from pig liver (6), Artemia salina (A. salina) (7), and yeast (8) and is shown to consist of a single polypeptide chain with Mr 53,000. On the other hand, mtEF-Tu, which is closer to prokaryotic EF-Tu than cytosolic EF-1α, was purified from yeast mitochondria as a protein of Mr 48,000 (9). More recently, several groups (10, 11) have found that thesaurin a, the major protein of Xenopus laevis previtellogenic oocytes, is homologous to EF-1α and suggested that it may be a stage-specific EF-1α.

To study the gene structure and regulation of its expression, the genes for EF-Tu and EF-1α were isolated from several species. In E. coli, EF-Tu is coded by two nearly identical but unlinked genes (tufA and tufB) (12). Saccharomyces cerevisiae (S. cerevisiae) cells have two genes for EF-1α (13-16) and one gene for mtEF-Tu (17). The two genes for yeast EF-1α code for a protein of 458 amino acids with an identical amino acid sequence (14). The amino acid sequence of mtEF-Tu coded by the nuclear gene tufM is more homologous to E. coli EF-Tu than yeast cytosolic EF-1α (17). More recently, EF-1α cDNA was isolated from higher organisms such as A. salina (18) and human (19), and the gene organizations were reported for A. salina (20) and Drosophila melanogaster (D. melanogaster) (21) EF-1α.

In this report, we describe isolation and characterization of the human chromosomal gene for EF-1α. Although there are many sequences related to EF-1α cDNA in the human genome, most of them are found to be processed pseudogenes. One expressed chromosomal gene for human EF-1α was isolated and the nucleotide sequence analysis revealed its intron/exon organization. Furthermore, the promoter of the EF-1α chromosomal gene could stimulate in vitro transcription better than the adenovirus major late promoter.

* This work was supported in part by grants-in-aid from the Ministry of Education, Science and Culture of Japan. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J04616 and J04617.

§ Present address: National Institute of Neuroscience, NCNP, 4-1-1 Ogawa-Higashi-machi, Kodaira, Tokyo 187, Japan.

¶ Present address: Dept. of Microbiology, Faculty of Medicine, University of Tokyo, Tokyo, 103, Japan.

** To whom correspondence should be addressed: Osaka Bioscience Institute, 6-2-4 Furuedai, Suita-shi, Osaka 565, Japan.

¹ The abbreviations used are: EF-Tu, prokaryotic elongation factor Tu; mtEF-Tu, mitochondrial EF-Tu; EF-1α, eukaryotic cytosolic elongation factor-1α; kb, kilobase(s); bp, base pair(s); Hepes, 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid.

EXPERIMENTAL PROCEDURES

Materials-Restriction endonucleases were purchased from Takara Shuzo or Toyobo and were used essentially as recommended by the supplier except for the use of 200 µg/ml gelatin instead of bovine serum albumin. Klenow fragment of E. coli DNA polymerase I and T4 DNA ligase were the products of Takara Shuzo, and T4 polynucleotide kinase was obtained from Toyobo. Avian myeloblastosis virus reverse transcriptase and S1 nuclease were purchased from Seikagaku Kogyo and Pharmacia LKB Biotechnology Inc., respectively. [γ-³²P]ATP was prepared from carrier-free H₃³²PO₄ (Du Pont-New England Nuclear) as described by Walseth and Johnson (22). [α-³²P]dCTP (3000 Ci/mmol) and [α-³²P]UTP (3000 Ci/mmol) were purchased from Amersham Corp. The 18- and 24-mer oligonucleotides used for hybridization and primer extension were synthesized using a DNA synthesizer (Applied Biosystems model 380A).

Isolation of Human EF-1α cDNA-Plasmid pNK1 containing yeast


Chromosomal Gene Structure for Human Elongation Factor-1α

TTTTCTATAGGCATTAAAAAGATAAAAAAACTAGTTAAAAATTGTATCTAATAAATTATGTAATTTATGT


M e t Gly LYS Glu LYS Thr H i s I l e Asn Ile TTTTTCGCAACGGGTTTGCCGCCAGAACACAGGTGTCGTGAAAACTACCCCTAAAAGCCAAA A.rG GGA AAG GAA AAG ACT CAT ATC AAC ATT AAATGAACAAGTTAGCG__--__A_____---_-__T__-______-__--____________________--______ --A 20


Val Val I l e Gly H i s Val Asp Ser Gly Lys S e r Thr Thr Thr Gly H i s Leu I l e Tyr LYS Cys Gly Gly Ile ASP LYS A r g Thr I l e Glu GTC GTC ATT GGA CAC GTA GAT TCG GGC AAG TCC ACC ACT ACT GGC CAT CTG ATC TAT AAA TGC GGT GGC ATC GAC AAA AGA ACC ATT GAA T-_ _-C -_A


Lys Phe Glu Lys Glu Ala Ala Glu Met Gly LYS Gly S e r Phe LYS Tyr Ala Trp Val Leu ASP Lys Leu Lys Ala Glu Arg Glu A r g Gly AAA TTT GAG AAG GAG GCT GCT GAG ATG GGA AAG GGC TCC TTC AAG TAT GCC TGG GTC TTG GAT AAA CTG AAA GCT GAG CGT GAA CGT GGT __G C" -CA __A


ile Thr I l e Asp Ile S e r Leu Trp LYS Phe Glu Thr Ser LYS Tyr Tyr Val T h r I I e 1 1 Asp ~ Ala Pro GIY H i s Arg Asp Phe I l e Lys ATC ACC ATT GAT ATC TCC TTG TGG AAA TTT GAG ACC AGC AAG TAC TAT GTG ACT ATC ATT GAT GCC CCA GGA CAC AGA GAC TTT ATC AAA C" "C

Asn Met I l e Thr Gly Thr Ser Gln A l a Asp Cys Ala Val Leu I l e Val Ala Ala Gly Val Gly Glu Phe Glu A l a Gly lle S e r LYS Asn AAC ATG ATT ACA GGG ACA TCT CAG GCT GACTGT GC'T GTC CTG ATT GTT GCT GCT GGT GTT GGT GAA TTT GAA GCT GGTATC TCC AAG AAT _GT ""A

Gly Gln Thr A r g Glu H i s Ala Leu Leu Ala Tyr Thr Leu Gly Val LYS Gln Leu I l e Val Gly Val Asn Lys Met ASP S e r Thr Glu Pro GGG CAG ACC CGA GAG CAT GCC CTT CTG GCT TAC ACA CTG GGT GTG AAA CAA CTA ATT GTC GGT GTT AAC AAA ATG GAT TCC ACT GAG CCA __c __A __T __T _A-


Pro Tyr S e r Gln Lys A r g Tyr Glu Glu I l e Val LYS Glu Val Ser Thr Tyr I l e Lys Lys I l e Gly Tyr Asn Pro ASP Thr Val Ala Phe CCC TAC AGC CAG AAG AGA TAT GAG GAA ATT GTT AAG GAA GTC AGC ACT TAC ATT AAG AAA ATT GGC TAC AAC CCC GAC ACA GTA GCA TTT


Val Pro Ile S e r Gly Trp Asn Gly Asp Asn Met Leu Glu Pro S e r Ala Asn Met Pro Trp Phe Lys Gly Trp Lys Val T h r Arg Lys Asp GTG CCA ATT TCT GGT TGG AAT GGT GAC AAC ATG CTG GAG CCA AGT GCT AAC ATG CCT TGG TTC AAG GGA TGG AAA GTC ACC CGT AAG GAT _A_ ___A-


Ala LEU Asp Cys I l e LEU Pro Pro Thr Arg Pro Thr Asp L y s Pro LEU A r g Leu Pro L e u Gly Asn Ala S e r Gly Thr Thr Leu Leu Glu GCT GAC TGC ATC CTA CCA CCA ACT CGT CCA ACT GAC AAG TTG CCC CGC CTG CCT CTC GGC AAT GCC AGT GGAACC ACG CTG CTT GAG CTG -7""-

Gln ASP Val Tyr Lys Ile Gly Gly I l e Gly Thr Val Pro Val Gly A r g Val Glu Thr Gly Val Leu Lys Pro Gly Met Val Val Thr Phe CAG GAT GTC TAC AAA ATT GGT GGT ATT GGT ACT GTTCCT GTT GGC CGA GTG GAG ACT GGTGTT CTC AAA CCC GGT ATG GTG GTCACC TTT --- --- --- "- --- --- -__-__-__ -__-T_ A" _-T


Ala Pro Val Asn Val Thr Thr Glu Val Lys S e r Val G l u M e t His His Glu Ala Leu S e r G l u Ala Leu Pro Gly Asp Asn Val Gly Phe GCT CCAGTC AAC GTT ACA ACG GAA GTA AAA TCT GTC GAA ATG CAC CAT GAA GCT TTG AGT GAA GCT CTTCCT GGGGAC &AT GTG GGC TTC -__ --- -G-_A _AA__


Asn Val LYS Asn Val Ser Val Lys Asp Val A r g A r g Gly Asn Val Ala Gly Asp S e r Lys Asn Asp Pro Pro Met Glu Ala Ala Gly Phe AAT GTC AAG AAT GTG TCT GTC AAG GAT GTT CGT CGT GGC AAC GTT GCT GGT GAC AGC AAA AAT GAC CCA CCA ATG GAA GCA GCT GGC TTC


1100 1120 1140 1160 Thr Ala Gln Val I l e I l e Leu Asn H i s Pro Gly Gln I l e S e r A l a Gly Tyr Ala Pro Val Leu Asp Cys H i s Thr A l a H i s I l e Ala Cys ACT GCT CAG GTG ATT ATC CTG AAC CAT CCA GGC CAA ATA AGC GCC GGC TAT GCC CCT GTA TTG GAT TGC CAC ACG ""G -f"T "A _" "A

GCT


LYS Phe Ala Glu Leu LYS Glu LYS I l e Asp A r g A r g S e r Gly LYS LYS Leu Glu ASP Gly Pro LYS Phe Leu LYS S e r Gly ASP Ala Ala AAG TTT GCT GAG CTG AAG GAA AAG ATT GAT CGC CGT TCT GGT AAA AAG CTG GAA GAT GGC CCT AAA TTC TTG AAG TCT GGT GAT GCT GCC

Val Glu Ser Phe S e r Asp Tyr Pro Pro Leu Gly Arg Phe A l a ~ V a lArg Asp Met GTT GAG AGC TTC TCA GAC TAT CCA CCT TTG GGT CGC TTT GCT GTTCGT GAT ATG

Ile Val Asp Met Val Pro Gly Lys Pro Met CYS ATT GTT GAT ATG GTT CCT GGC AAG CCC ATG TGT

1360 1400 1380 1420 Arg Gln Thr Val Ala Val Gly Val I l e Lys Ala Val Asp LYS LYS Ala Ala Gly Ala GIY Lys Val Thr Lys S e r Ala Gln LYS Ala Gln


AGA CAG ACA GTT GCG GTG GGT GTC ATC AAA GCA GTG GAC AAG AAG GCT GCT GGA GCT GGC AAG GTC ACC AAG TCT GCC CAG AAA GCT CAG -" C" "_ -" "_

1540 LYS Ala LYS End 1480 1460 1500 1520 AAG GCT AAA TGA ATATTATCCCTAATACCTGCCACCCCACTCTTAATCAGTGGTGGAAGAACGGTCTCAGAACTGTTTGTTTCAATTGGCCATTTAAGTTTAGTAGTAAAAGACT


GGTTAA~ATAACAATGCATCGTAA]AACCTTCAGAAGGAAAGGAGAATGTTTTGTGGACCACTTTGGTTTTCTTTTTTGCGTGTGGCAGTTTTAAGTTATTAGTTTTTAAAATCAGTAC ~~_~______~_T______A_A______~________________~_______~__~~_______~~__~_~__~__~-~~_~~~---------"--""-""""""""


TTTTTAA TGGAAACAACTTGACCAAAAATTTGTCACAGAATTTTGAGACCCATTAAAAAAGTTAAATGAGAAAAAAAAAA

T------AAAAAAAAAAAAGAAArGAACAAGTTACTTGAAAATATAATTTATAAA

AATGATACAAAAATAAGTAGGAACCACCTGAATTGTCTAATTCTATTTAAGAAATTTATTTTATACTGGAAAAATCTTCCCGCAAGGATCTCCAAGCCA

FIG. 1. Nucleotide sequences of human EF-1α cDNA and one of pseudogenes (λEF8). The dashed line indicates identity between the two sequences. The amino acid sequence of human EF-1α was deduced from the nucleotide sequence of EF-1α cDNA. The 18-mer oligonucleotide used for the identification of the expressed chromosomal gene is boxed. The 15-nucleotide direct repeats framing the λEF8 gene are underlined.
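The deduction of the amino acid sequence from the cDNA, mentioned in the legend above, can be illustrated with a short sketch. It is not the authors' procedure: the helper name `translate` and the compact codon-table construction are assumptions introduced here, the standard genetic code is assumed, and the two codon strings are the first ten codons and the last four codons of the open reading frame as read from Fig. 1.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (spaces allowed) until a stop codon."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon ("End" in the figure)
            break
        protein.append(amino_acid)
    return "".join(protein)


# First ten codons of the reading frame, transcribed from Fig. 1.
first_codons = "ATG GGA AAG GAA AAG ACT CAT ATC AAC ATT"
# Last three sense codons plus the stop codon, transcribed from Fig. 1.
last_codons = "AAG GCT AAA TGA"

print(translate(first_codons))
print(translate(last_codons))
```

The one-letter output corresponds to the three-letter labels printed above the codons in the figure (Met Gly Lys Glu Lys Thr His Ile Asn Ile at the start, Lys Ala Lys End at the finish).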

CAC


EF-1α chromosomal gene (14) was digested with ClaI and HindIII, and the 1-kb ClaI-HindIII fragment was isolated by agarose gel electrophoresis. The ClaI-HindIII fragment was labeled with ³²P by nick translation using [α-³²P]dCTP (23).

A human cDNA library constructed with mRNA from human fibroblast GM 637 cells using pCD vector system (24) was kindly provided by Dr. H. Okayama (National Institutes of Health, Bethesda, MD). About 40,000 colonies were screened by colony hybridization under low stringency (17) using the ³²P-labeled ClaI-HindIII fragment of pNK1. Plasmid DNAs were prepared from positive clones and characterized by restriction enzyme mapping and Southern hybridization analysis. One of the full length cDNA clones (1.8 kb long) was designated as pAN7, and the nucleotide sequence was determined as described (25).

Southern Hybridization Analysis-Human chromosomal DNA was extracted from human leukocytes according to the method of Gross-Bellard et al. (26). Each 10 µg of chromosomal DNA was digested with various restriction enzymes, and subjected to 0.8% agarose gel electrophoresis. DNA fragments were then transferred to a nitrocellulose filter (Schleicher & Schuell) as described (23) and hybridized with a ³²P-labeled BamHI fragment of pAN7 containing human EF-1α cDNA.

Cloning of Human EF-1α Chromosomal Gene-Human gene libraries constructed with human fetal liver DNA (27) and human placenta DNA (28) were provided by Dr. T. Maniatis (Harvard University) and Dr. M. Shibuya (Institute of Medical Science, University of Tokyo), respectively. The libraries were first screened by plaque hybridization (23) using the ³²P-labeled 2-kb BamHI fragment containing human EF-1α cDNA. In some cases, the positive clones were rescreened using the 18-mer oligonucleotide which was labeled at the 5' end using T4 polynucleotide kinase and [γ-³²P]ATP. DNAs were prepared as described (23) and characterized by restriction enzyme mapping and Southern hybridization analysis using ³²P-labeled pAN7 cDNA. DNA fragments hybridizing with pAN7 cDNA were subcloned from the λ DNA into the appropriate site of pBR327 or pUC119. The nucleotide sequence of the EF-1α gene region was determined by the dideoxynucleotide chain termination procedure after subcloning into M13 mp8 or mp9 (25).

Primer Extension-Total cellular RNA was extracted from human HL-60 cells by the guanidine thiocyanate/cesium chloride method as described (23). For the primer extension, the ³²P-labeled 24-mer synthetic oligonucleotide (5 pmol) and 5 µg of poly(A) mRNA were denatured together at 90 °C for 3 min and gradually chilled (14). The mixture was then adjusted to contain, in 50 µl, 50 mM Tris-HCl (pH 8.3), 50 mM KCl, 8 mM MgCl₂, 0.5 mM each of dNTPs, and 40 units of RNase inhibitor (Takara Shuzo). The cDNA was synthesized by incubation at 42 °C for 90 min with 25 units of avian myeloblastosis virus reverse transcriptase, and analyzed by electrophoresis through 8% polyacrylamide gel containing 7 M urea. As a size marker, M13 single-stranded DNA was sequenced by the chain termination procedure and electrophoresed in parallel.

In Vitro Transcription-Plasmid pEFg1 (see "Results") was digested with ApaI or PstI and used as templates for in vitro transcription. The nuclear extract prepared from HeLa cells according to Dignam et al. (29) was kindly provided by J. Mizushima-Sugano (Institute of Medical Science, University of Tokyo). The in vitro transcription was carried out essentially as described (29). The reaction mixture contained, in a volume of 25 µl, 20 mM Hepes-KOH (pH 7.9), 12 mM Tris-HCl (pH 7.9), 0.12 mM EDTA, 0.3 mM dithiothreitol, 60 mM KCl, 11 mM MgCl₂, 12% glycerol, 600 µM each of ATP, CTP, and GTP, 25 µM unlabeled UTP, and 5 µCi of [α-³²P]UTP, 0.5 µg of DNA templates, and 90 µg of nuclear extract protein. Incubation was at 30 °C for 60 min, and the reaction was stopped by the addition of 175 µl of 2 M ammonium acetate containing 0.2% sodium dodecyl sulfate and 5 µg of carrier tRNA. After phenol extraction and ethanol precipitation, the RNA products were heated at 90 °C for 3 min and analyzed on 4% polyacrylamide gel containing 8 M urea. As a size marker, ³²P-labeled HaeIII-digested pBR322 was electrophoresed in parallel.

RESULTS

Isolation of Human EF-1α cDNA and Southern Hybridization Analysis of Human Chromosomal DNA-By using yeast EF-1α cDNA as a probe, we isolated human EF-1α cDNA from the cDNA library constructed with mRNA from human fibroblast GM 637 cells. One of the full length cDNA was designated as pAN7, and the complete nucleotide sequence was determined (Fig. 1). The coding sequence of pAN7 cDNA

FIG. 2. Southern hybridization analysis of human EF-1α gene. Genomic DNA from human leukocytes was digested with BglII (lane 1), EcoRI (lane 2), and HindIII (lane 3) and electrophoresed on a 0.8% agarose gel. DNA was transferred to a nitrocellulose filter and hybridized as described under "Experimental Procedures." A ³²P-labeled BamHI fragment of human EF-1α cDNA was used as the probe. The DNA size marker was electrophoresed in parallel and is indicated in kilobases. ori, origin of electrophoresis.


FIG. 3. The organization of human EF-1α chromosomal gene. Boxes and lines between them represent 8 exons and 7 introns, respectively. The size scale is indicated by a bar of 500 bp length on the upper right side of the figure. Coding sequences are indicated by solid boxes, while non-coding regions are represented by open boxes. The locations of the major recognition sites for restriction enzymes are given above the gene. The sequencing strategy is shown under the restriction map, and arrows represent the direction and length of sequence determined by each independent experiment.
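The text reports that all 7 introns conform to the GT....AG rule. A minimal sketch of that check follows. The function name `follows_gt_ag` is hypothetical, and the boundary excerpts are short stretches transcribed here from the Fig. 4 sequence for four of the introns; they are assumed to be free of scanning errors and are meant only to illustrate the test, not to reproduce the authors' analysis.

```python
# Each entry: (first bases of the intron, last bases of the intron).
# Excerpts were transcribed from Fig. 4 for illustration (assumption:
# the transcription is accurate); only four of the seven introns are shown.
INTRON_BOUNDARIES = {
    "intron 1": ("GTAAGTGCCG", "CTTCCATTTCAG"),
    "intron 5": ("GTAAGTTGGC", "ATATTTTTAG"),
    "intron 6": ("GTAACAATTT", "TTTTTCCAAG"),
    "intron 7": ("GTAAGGATGA", "TTTTTAATAG"),
}


def follows_gt_ag(donor_side, acceptor_side):
    """Return True if an intron starts with GT and ends with AG."""
    return (donor_side.upper().startswith("GT")
            and acceptor_side.upper().endswith("AG"))


results = {
    name: follows_gt_ag(donor, acceptor)
    for name, (donor, acceptor) in INTRON_BOUNDARIES.items()
}

for name, ok in results.items():
    print(name, "conforms" if ok else "does not conform")
```

Only the two bases at each intron end are tested, which is all the rule itself requires; the longer excerpts are kept so the boundaries can be located in the figure.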

CCCGGGCTGGGCTGAGACCCGCAGAGGAAGACGCTCTAGGGATTTGTCCCGGACTAGCGAGATGGCAAGGCTGAGGACGGGAGGCTGATTGAGAGGCGAAGGTACACCCTAATCTCAAT
ACAACCTTTGGAGCTAAGCCAGCAATGGTAGAGGGAAGATTCTGCACGTCCCTTCCAGGCGGCCTCCCCGTCACCACCCCCCCCA~~CCGGAGCTGAGAGTAATTCATAC

458200 500 560 * EXON 1 CGCGGGGTAAACTGGGAAAGTGATGTCGTGTACTGGCTCCGCCTTTTTCCCGAGGGTGGGGGAGAACCGTATATAAGTGCAGTAGTCGCCGTGAACGTTCTTTTTCGCAACGGGTTTGC


INTRON6 610 640 600 680 CGCCAGAACACAG GTAAGTGCCGTGTGTGGTTCCCGCGGGCCTGGCCTCTTTACGGGTTATGGCCCTTGCGTGCCTTGAATTACTTCCACGCCCCTGGCTGCAGTACGTGATTCTTGA


TCCCGAGCTTCGGGTTGGAAGTGGGTGGGAGAGTTCGACGCCTTGCGCTTAAGGAGCCCCTTCGCCTCGTGCTTGAGTTGAGGCCTGGCCTGGGCGCTGGGGCCGCCGCGTGCGAATCT 840

GGTGGCACCTTCGCGCCTGTCTCGCTGCTTTCGATAAGTCTCTAGCCATTTAAAATTTTTGATGACCTGCTGCGACGCTTTTTTTCTGGCAAGATAGTCTTGTAAATGCGGGCCAAGAT 1060


CTGCACACTGGTATTTCGCTTTTTGGGGCC~GG~C~~GGGGCCCGTGCGTCCCAGCGCACATGTTCGG~AGGCGGG~CTGCGAGCGCGGCCACCGAGAATCGGACGGGGGTA SPl SUI

1200 CCGGCCCTGCTGCAGGGAGCTCAAAATGGAGGACGCGGCGCTCGGGAGA~GGGCGGGT~GTCACCCACACAAAGGAAAAGGGCCTTTCCGTCCTCAGCCGTCGCTTCAT~~


GACTGAAGTTAGGCCAGCTTGGCACTTGATGTAATTCTCCTTGGAATTTGCCCTTTTTGAGTTTGGATCTTGGTTCATTCTCAAGCCTCAGACAGTGGTTCAAAGTTTTTTTCTTCCAT

EXON 2 1580 TTCAG GTGTCGTGAAAACTACCCCTAAAAGCCAAA ATG GGA AAC GAA AAG ACT CAT ATC AAC ATT GTC GTC ATT GGA CAC GTA GAT TCG GGC AAG Met Gly LYS Glu Lys Thr His Ile Asn Ile Val Val Ile Gly His Val Asp Sei- Gly Lys 1 10 20


TCC ACC ACT ACT GGC CAT CTG TAT ATC AAA TGC GGT GGC ATC GAC A A A AGA ACC ATT GAAAAA TTT GAG AAG GAG GCT GCT GAG GTATGTT Ser Thr Thr Thr GlyHis Leu Ile Tyr LYs Cys GlYG l y Ile Asp Lys Arg Thr Ile Glu Lys Phe Glu LYS GIu Ala Ala Glu 30


1780 INTRON 2 1760 TAATACCAGAAAGGGAAAGATCAACTAAAATGAGTTTTACCAGCAGAATCATTAGGTGATTTCCCCAGAACTAGTGAGTGGTTTAGATCTGAATGCTAATAGTTAAGACCTTACTTATG 1860


AAATAATTTTGCTTTTGGTGACTTCTGTAATCGTATTGCTAGTGAGTAGATTTGGATGTTAATAGTTAAGATCCTACTTATAAAAGTTTGATTTTTGGTTGCTTCTGTAACCCAAAGTG 2020


ACCAAAATCACTTTGGACTTGGAGTTGTAAAGTGGAAACTGCCAATTAAGGGCTGGGGACAAGGAAATTGAAGCTGGAGTTTGTGTTTTAGTAACCAAGTAACGACTCTTAAT~~TTAC

EXON 3 2120 ATG GGA AAG GGCTCC TTC AAG TAT GCC TGG GTC TTG GAT AAA CTG A A A GCT GAG CGT GAA CGT GGT ATC ACC ATT GAT ATC TCC TTG Met Gly Lys Gly Ser Phe LYs Tyr Ala Trp Val Leu ASP LYS Leu LYS Ala Glu Arg Glu Arg Gly Ile Thr I l e Asp Ile S e r Leu

2140 AG


TTTAACATC ATG ATT ACAGGG ACA TCT TGG AAA TTT GAG ACC AGC AAG TAC TAT GTG ACT ATC ATT GAT GCC CCA GGA CAC AGA GACAAA Trp LYS Phe Glu Thr Ser LYS Tyr Tyr Val Thr I l e Ile Asp Ala Pro Gly His Arg Asp Phe Ile Lys Asn Met Ile Thr Gly ThrSer 80


2300 INTRON 3 GTTGGGATTAATAATTCTAGGTTTCTTTATCCCAAAAGGCTTGCTTTGTACACTGGTTTTGTCATTTGGAGAGTTGACAGGGATATGTCTTTGCTTTCTTTAAAG GCT GAC Ala Asp


GAG GI n


EXON 4 2400 TGT GCT GTC CTG ATT GTT GCT GCT GGT GTT GGT GAA TTT GAA GCT GGT ATC TCC GGG AAG CAG AAT ACC CGA GAG CAT GCC CTT CTG GCT H i sGlu Ala Leu Leu Ala CYS Ala Val Leu Ile Val Ala Ala Gly Val Gly Glu Phe Glu Ala Gly Ile Ser Lys Asn Gly Gln Thr Arg 120


AAA ATG GAT TCC ACT GAG CCACCC TAC AGC CAG AAG AGA TAT GAG GAA ATT TAC ACA CTG GGT GTGAAA CAA CTA ATT GTC GGT GTT AAC TYr Thr Leu Gly Val Lys Gln Leu Ile ValGIY Val Asn Lys Met Asp Ser Thr Glu P r o Pro Tyr Ser Gln LYs Arg Tyr Glu Glu Ile 150


AAA ATT GGC TAC AAC CCC GAC ACA GTA GCA TTT GTG CCA ATT TGG TCT AAT GGT GGT GAC AAC GTT AAG GAA GTC AGC ACT TAC ATT AAG LYS Lys Ile GIY Tyr Asn Pro Asp Thr Val Ala Phe Val Pro Ile Ser Gly Trp Asn Gly Asp Asn Val Lys Glu Val Ser Thr Tyr Ile 180 2760

Glu 2800


2740 27 INTRON 20 4 2700 2660 GTAAGTGGCTTTCAAGACCATTGTTAAAAAGCTCTGGGAATGGCGATTTCATGCTTACACAAATTGGCATGCTTGTGTTTCAG ATG CCT ATG CTG GAG CCA ACT GCT AAC Asn Leu Pro Ser Ala Met

EXON 6 2780 TGG TTC AAG GGA TGGAAA GTC ACC CGT AAG GAT GGC AAT GCC ACT GGA ACC ACG CTG CTT GAG GCT CTG GAC TGC ATC CTA CCA CCA ACT Phe Lys Gly Trp Lys Val Thr Arg LysA s p Gly Asn Ala Ser Gly Thr Thr Leu LeuGlu Ala Leu Asp CYS Ile Leu Pro Pro Thr TrP 220


INTRON 5 2940 CGT CCA ACT GAC AAG CCC TTG CGC CTG CCT CTC CAG GAT GTC TAC AAA ATT GGT G GTAAGTTGGCTGTAAACAAAGTTGAATTTGAGTTGATAGAGTACT Arg Pro Thr Asp Lys Pro Leu Arg Leu Pro Leu Gln Asp Val Tyr Lys Ile Gly G 2860 2900


FIG. 4. Nucleotide sequence of human EF-1α chromosomal gene. The coding sequence of exons is translated and numbered from the ATG initiation codon. The TATA box on the 5'-flanking region and "ATTAAA" polyadenylation signal are underlined. The initiation site for transcription and the polyadenylation site are marked by *. The putative Sp1 binding sites and Ap1 binding site are boxed in the 5'-flanking region and the first intron.


2980 3 0 4EXON 0 6 3020 2960 GTCTGCCTTCATAGGTATTTAGTATGCTGTAAATATTTTTAG GT ATT GGT ACT GTT CCT GTT GGC CGA GTG GAG ACT GCT GTT CTC AAA CCC GGT ATG ly Ile Gly Thr Val Pro Val Gly ArgVal Glu Thr GlY Val Leu LYS Pro GlY Met 260 3080


GTG GTC ACC TTT GCT CCA GTC AAC GTT ACA ACG GAA A A A GTA TCT GTC GAA ATG CAC CAT GAA GCT TTG ACT GAA GCT CTT CCT GGG GAC Val Val Thr Phe Ala Pro Val Asn Val Thr Thr Glu Val Lys Ser Val Glu Met H i s H i s Glu Ala Leu Ser GIu Ala Leu Pro GlY ASP 280


AAT GTG GGC TTC AAT GTC AAG AAT GTG TCT GTC AAG GAT GTT CGT CGT GGC AAC GTT GCT GGT GAC AGC AAA AAT GAC CCA CCA ATG GAA Val Ala Gly Asp Ser Lys Asn ASP Pro Pro Met Glu Y Asn Val Gly Phe Asn Val Lys Asn Val Ser Val LYS Asp Val Arg Arg G ~ Asn 310


INTRON 6 3280 3300 3340 3320 GCA GCT GGC TTC ACT GCT GTAACAATTTAAAGTAACATTAACTTATTGCAGAGGCTAAAGTCATTTGAGACTTTGGATTTGCACTGAATGCAAATCTTTTTTCCAAG CAG Ala Ala Gly Phe ThrAla Gln 3240


3360 3380 EXON3 470 0 GTG ATT ATC CTG AAC CAT CCA GGC CAA ATA AGC GCC GGC TAT GCC CCT GTA TTG GAT TGC CAC ACG GCT CAC ATT GCA TGC AAG TTT GCT Val Ile Ile Leu Asn H i s Pro Gly Gln Ile Ser Ala Gly Tyr Ala Pro Val Leu Asp Cys His Thr Ala H i s Ile Ala Cys Lys Phe Ala 350 3480


GAG CTG AAG GAA AAG ATT GAT CGC CGT TCT GGT A A A AAG CTG CAA GAT GGC CCT AAA TTC TTG AAG TCT GGT GAT GCT GCC ATT GTT GAT Glu Leu Lys Glu Lys Ile Asp Arg Arg Ser Gly Lys Lys Leu Glu Asp Gly Pro Lys Phe Leu Lys Ser Gly Asp Ala Ala Ile Val As 380


INTRON 7 3600 ATG GTT CCT GGC AAG CCC ATG TGT GTT GAG AGC TTC TCA GAC TAT CCAG GTAAGGATGACTACTTAAATGTAAAAAAGTTGTGTTAAAGATGAA CCT TTG Met Val Pro Gly Lys Pro Met Cys ValGlu Ser Phe Ser Asp Tyr Pro Pro Leu G


3660 EXON 3 7 2 08 3700 3640 AAATACAACTGAACAGTACTTTGGGTAATAATTAACTTTTTTTTTAATAG GT CCC TTT GCT GTT CGT GAT ATG AGA CAG ACA GTT GCC GTG GGT GTC ATC ly Arg Phe Ala Val Arg Asp Met Arg Gln Thr Val Ala Val Gly Val Ile 430 3780


AAA GCA GTG GAC AAG AAG GCT GCT GGA GCT GGC GTC ACC AAG AAG TCTGCC CAG AAA GCT CAG AAG GCT AAA TGA ATATTATCCCTAATACCTG Lys Ala Val Asp Lys Lys AlaAla Gly Ala Gly Lys Val Thr LYS Ser Ala Gln LysAla Gln Lys Ala LYS End 440


CCACCCCACTCTTAATCAGTGGTGCAAGAACGGTCTCAGAACTGTTTGTTTCAATTGGCCATTTAAGTTTAGTAGTAAAAGACTGGTTAATGATAACAATGCATCGTAAAACCTTCAGA 3940

AGGAAAGGAGAATGTTTTGTGGACCACTTTGGTTTTCTTTTTTGCGTGTGGCAGTTTTAAGTTATTAGTTTTTAAAATCAGTACTTTTTAATGGAAACAACTTGACCAAAAATTTGTCA 4 144006 0


CAGAATTTTGAGACCC~AAAAGTTAAATGAGAAACCTGTGTGTTCCTTTGGTCAACACCGAGACATTTAGGTGAAAGACATCTAATTCTGGTTTTACGAATCTGGAAACTTCTTG 4220


AAAATGTAATTCTTGAGTTAACACTTCTGGGTGGAGAATAGGGTTGTTTTCCCCCCACATAATTGGAAGGGGAAGGAATATCATTTAAAGCTATGGGAGGCTTTCTTTGATTACAACAC 4340


TGGAGAGAAATGCAGCATGTTGCTGATTGCCTGTCACTAAAACAGGCCAAAAACTGAGTCCTTGGGTTGCATAGAAAGCTTCATGTTGCTAAACCAATGTTAAGTGAATCTTTGGAAAC 4460


AAAATGTTTCCAAATTACTGGGATGTGCATGTTGAAACGTGGGTTAAAATGACTGGGCAGTGAAAGTTGACTATTTGCCATGACATAAGAAATAAGTGTAGTGGCTAGTGTACACCCTA 4580


TGAGTGGAAGGGTCCATTTTGAAGTCAGTGGAGTAAGCTTTATGCCATTTTGATGGTTTCACAAGTTCTATTGAGTGCTATTCAGAATAGGAACAAGGTTCTAATAGAAAAAGATGGCA 4660


ATTTGAAGTAGCTATAAAATTAGACTAATTACATTGCTTTTCTCCGAC

FIG. 4-continued

was 1386 bp long and was identical to the coding sequence of human EF-1α cDNA published by Brands et al. (19). When purified DNA from human leukocytes was cleaved with EcoRI, BglII, or HindIII and analyzed by Southern hybridization using ³²P-labeled EF-1α cDNA, more than 10 distinct bands were observed in each of the restriction enzyme-digested DNAs (Fig. 2). Since subsequent analyses showed that some of the human EF-1α genes contain HindIII, BglII, or EcoRI sites within the gene, the number of different DNA segments hybridizing to EF-1α cDNA could not be determined.

Cloning of Human EF-1α Chromosomal Gene-In order to characterize the chromosomal gene for EF-1α, human gene libraries derived from human fetal liver DNA and human placenta DNA were screened using the human EF-1α cDNA as a probe. From about 1.5 million plaques, 218 positive clones were obtained. Five of them were plaque-purified, and recombinant λ DNAs were characterized by restriction enzyme mapping, Southern hybridization, and partial nucleotide sequence analysis. Although DNA fragments from these five clones hybridized very strongly with human EF-1α cDNA, they contained no introns and carried several point mutations, deletions, and insertions, indicating that they are pseudogenes of human EF-1α. The complete nucleotide sequence of one of the clones (λEF8) is shown in Fig. 1, together with that of the human EF-1α cDNA. The overall homology between the nucleotide sequences of the λEF8 pseudogene and human EF-1α cDNA was 97%. The pseudogene is terminated by a poly(A) segment of 12 nucleotides and flanked by perfect 15-nucleotide direct repeats at both the 5' and 3' ends.

To isolate the active chromosomal gene for human EF-1α, the 70 clones identified by hybridization with the cDNA were rescreened using the oligonucleotide probe specific for the cDNA sequence. The 18-mer oligonucleotide probe contained the sequence of the nucleotide positions of 1562-1579 of the


FIG. 5. Determination of the transcription start point of EF-1α mRNA by primer extension. A ³²P-labeled primer was annealed with EF-1α mRNA and extended by using reverse transcriptase as described under “Experimental Procedures.” The synthesized cDNA was electrophoresed on an 8% polyacrylamide gel together with sequencing ladders. The sizes of sequencing ladders are indicated on the right.
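The arithmetic behind the primer extension result can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis: the 33-nucleotide exon 1 string is transcribed here from Fig. 4, its first base is taken to be position 576 as assigned in the text, and the helper name `reverse_complement` is introduced only for this example. The primer sequence is the 24-mer given in the text.

```python
# Exon 1 (33 nt) as transcribed from Fig. 4; assumed to start at position 576.
EXON1 = "CTTTTTCGCAACGGGTTTGCCGCCAGAACACAG"
EXON1_START = 576

# 24-mer primer from the text, written 5' to 3'.
PRIMER = "TGTGTTCTGGCGGCAAACCCGTTG"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# The primer anneals to the mRNA, so its reverse complement
# should appear in the sense strand of exon 1.
binding_site = reverse_complement(PRIMER)
offset = EXON1.find(binding_site)

first_position = EXON1_START + offset             # 5' end of the site on the gene
last_position = first_position + len(PRIMER) - 1  # base paired with the primer's 5' end

# Extension runs from the primer back to the first transcribed base,
# so the product spans the start site through the end of the primer site.
product_length = last_position - EXON1_START + 1

print(first_position, last_position, product_length)
```

Under these assumptions the primer site maps to positions 584-607 and the extension product is 32 nucleotides long, in line with the shorter of the two bands reported in the text.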

3'-non-coding region of the cDNA. This sequence was chosen because the 3'-non-coding region is known to mutate more rapidly than the coding sequence (30), and frequent mutations were found in this region of the pseudogenes which had been sequenced (Fig. 1). By screening with the oligonucleotide probe, five clones gave positive results, and one of them (λEFg 58) was plaque-purified. Southern hybridization analysis of λEFg 58 DNA indicated that the 7-kb EcoRI fragment contained the DNA fragment hybridizing with the EF-1α cDNA. The 7-kb EcoRI fragment was subcloned at the EcoRI site of pUC119 to yield plasmid pEFg1.

Structure of the Human EF-1α Chromosomal Gene-A fine restriction enzyme map of the human EF-1α chromosomal gene was constructed using pEFg1, and the nucleotide sequence was determined according to the strategy shown in Fig. 3. Fig. 4 presents about 5 kb of the nucleotide sequence containing the human EF-1α chromosomal gene. When the human EF-1α cDNA sequence was aligned with the exons of the chromosomal gene sequence, both sequences matched completely and revealed the structural organization of the human EF-1α gene (Fig. 3). The gene is split by 7 introns, and all of the splice donor and acceptor sites conform to the GT....AG rule (31) for the nucleotides immediately flanking exon borders. The first exon, consisting of 33 nucleotides, is located 943 bp upstream from the second exon which contains the ATG initiation codon. Other introns are relatively short and consist of 83-376 bp.

The Initiation Site for Transcription-The transcription start point of human EF-1α gene was determined by the primer extension method. Poly(A) mRNA was prepared from human HL-60 cells and hybridized with an excess of the 5'-labeled 24-mer primer, 5'-TGTGTTCTGGCGGCAAACCCGTTG-3', which is complementary to nucleotide positions 584-607 (Fig. 4). After incubation with avian myeloblastosis virus reverse transcriptase, the synthesized cDNA was separated by polyacrylamide gel electrophoresis. As shown in Fig. 5, two bands of 32 and 33 nucleotides were obtained. Since


FIG. 6. In vitro transcription of EF-1α gene. In vitro transcription was carried out as described under “Experimental Procedures,” and the products were separated on a 4% polyacrylamide gel containing 8 M urea. The DNA templates used for in vitro transcription are SmaI-digested pAdSmaF DNA (lane 1), ApaI-digested pEFg1 (lane 2), and PstI-digested pEFg1 (lane 3). As size markers, 32P-labeled HaeIII fragments were run in parallel, and sizes of DNA fragments are given in bases. Specific RNA products from the EF-1α and adenovirus major late promoters are indicated by arrows on the left margin. ori, origin of electrophoresis.

avian myeloblastosis virus reverse transcriptase can erroneously transcribe the “cap” structure, giving a cDNA one nucleotide longer (32), we assigned the major initiation site of transcription of the human EF-1α gene to the C residue at position 576 (Fig. 4).

Transcription of EF-1α Gene in a Cell-free System - We transcribed the human EF-1α gene in a cell-free system derived from HeLa cells. pEFg1 DNA was truncated at the PstI and ApaI sites, which are located at 128 and 428 nucleotides, respectively, downstream of the start site determined by the primer extension method. The DNA templates were incubated with the HeLa cell nuclear extract (29) in the presence of [α-32P]UTP, and the run-off RNA products were analyzed by polyacrylamide gel electrophoresis. As shown in Fig. 6, the ApaI-digested pEFg1 DNA gave a transcript 428 nucleotides long, while an RNA 124 nucleotides long was produced by using the PstI-digested pEFg1 as a template. These results suggest that the transcription start site determined by primer extension was also used as the initiation site in the cell-free transcription system. To examine the relative strength of the human EF-1α gene promoter, pAdSmaF containing the adenovirus major late promoter was digested with SmaI and transcribed in parallel under the same conditions. The analysis of the RNA products (Fig. 6) indicates that the human EF-1α promoter is stronger than the adenovirus major late promoter. Both templates, ApaI-digested pEFg1 and PstI-digested pEFg1, directed the labeling of additional products

[Fig. 7: aligned amino acid sequences, with the human sequence divided into Exons 2-8 and residue positions marked from 20 to 380.]

FIG. 7. Comparison of the amino acid sequences of EF-1α and EF-Tu predicted from the nucleotide sequences. The amino acid sequences of EF-1α and EF-Tu from human (hEF-1α) (Fig. 1), A. salina (aEF1α) (20), D. melanogaster (dEF1αI and dEF1αII) (21), S. cerevisiae (yEF1α) (13-16), chloroplast of Euglena gracilis (tufC) (37), mitochondria of S. cerevisiae (tufM) (17), and E. coli (tufA) (42) are aligned to give maximal homology by introducing several gaps (-). The boxed amino acid sequences indicate the consensus sequences of the phosphoryl and guanine-specific binding sites. The junctions of exons and introns are indicated by V.

(Fig. 6), for example, those of 460 nucleotides (lane 3) or longer (lanes 2 and 3), whose origin has not been established.

DISCUSSION

In this paper, we have described the isolation and structural analysis of the human EF-1α gene. By screening 1.5 × 10^6 plaques of a human gene library with human EF-1α cDNA, we obtained more than 200 positive clones. Assuming an average size of 15 kb for the inserts in the human gene library and a genome size of 3 × 10^9 bp, we can expect one in 2 × 10^5 plaques to carry a particular single-copy sequence (3 × 10^9 bp divided by 1.5 × 10^4 bp per insert). This suggests that there may be more than 20 DNA segments homologous to the EF-1α gene per human haploid genome, which agrees with the result obtained by the Southern hybridization analysis of human genomic DNA with the EF-1α cDNA (Fig. 2). When 5 clones were picked at random from these positive plaques, none of them contained introns, and their sequences did not match that of the cDNA completely. It seems, therefore, that most of the DNA segments related

to the EF-1α cDNA in the human genome are pseudogenes for EF-1α. The complete nucleotide sequence analysis of one of them indicated that it is a full-length copy of the spliced mRNA containing the poly(A) segment. Pseudogenes have so far been reported in several gene families, including mouse α-globin (33), mouse immunoglobulin (34), and mouse ribosomal proteins (35, 36). In particular, the gene family of mouse ribosomal protein L32 contains 16 genes, 15 of which are pseudogenes (35). The processed pseudogenes may have arisen by reverse transcription of mRNA and integration of the resultant transcript into the genome of germline cells (34). The presence of the 15-nucleotide perfect repeats framing the EF-1α pseudogene (Fig. 1) agrees with this proposal. EF-1α pseudogenes may arise rather frequently, since the EF-1α mRNA is one of the most abundant mRNAs. By using a specific oligonucleotide probe, we could find one gene which has introns and a sequence completely identical to that of the cDNA (Fig. 4). Since E. coli (12), yeast (13-15), and D. melanogaster (21) have two genes for EF-1α per genome, it is


possible that there may exist more than one actively expressed gene in the human genome. Furthermore, the human genome would contain the genes for mitochondrial EF-Tu and for thesaurin a, which was found in Xenopus previtellogenic oocytes as a protein homologous to EF-1α (10, 11).

So far, eukaryotic EF-1α genes have been isolated from S. cerevisiae (13-16), A. salina (20), chloroplast of Euglena gracilis (37), D. melanogaster (21), and human (this report). All EF-1α genes, except the yeast EF-1α gene, contain introns: 4 in the A. salina EF-1α gene; 1 in the D. melanogaster EF-1αI gene; 3 in the chloroplast EF-Tu gene and the D. melanogaster EF-1αII gene; and 7 in the human EF-1α gene. Some of these introns are within the 5'- or 3'-non-coding regions. When the amino acid sequences predicted by the EF-1α genes were aligned, it was found that four splice junctions are conserved between the A. salina and human EF-1α genes (Fig. 7). Furthermore, the position of the 2nd intron of the chloroplast EF-Tu gene is close to the last intron of the human and A. salina EF-1α genes.

Examination of the correlation of the exons with the structural domains of the E. coli EF-Tu molecule (4, 5) revealed the following properties. The 2nd exon of the human EF-1α gene corresponds to the N-terminal domain of E. coli EF-Tu, which can be cleaved off by a limited trypsin digestion. In this domain, the conserved phosphoryl binding sequence (Gly-X-X-X-X-Gly-Lys) (38) is located. The amino acid sequence of the 3rd exon comprises two β-sheets and one α-helix, and contains the second phosphoryl binding site of Asp-X-X-Gly, in which the Asp residue can be linked with GDP via Mg2+. The 4th exon contains the typical mononucleotide fold of three β-strands, in which the guanine-specific binding site of Asn-Lys-X-Asp is present. Furthermore, the x-ray analysis of E. coli EF-Tu indicates that there are additional domains consisting of amino acids 200-296 and 297-393 at the C terminus of the molecule (4).
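The three consensus sequences named above (Gly-X-X-X-X-Gly-Lys, Asp-X-X-Gly, and Asn-Lys-X-Asp) can be located in a protein sequence with simple pattern matching. The sketch below is only an illustration: the function name is hypothetical, and the peptide used is a made-up toy sequence, not the EF-1α or EF-Tu sequence from Fig. 7.

```python
import re

# One-letter-code patterns for the consensus sequences named in the text;
# "." stands for any residue (the "X" of the consensus).
MOTIFS = {
    "Gly-X-X-X-X-Gly-Lys": r"G.{4}GK",
    "Asp-X-X-Gly": r"D.{2}G",
    "Asn-Lys-X-Asp": r"NK.D",
}


def find_motifs(protein: str) -> dict:
    """Return {motif name: [(1-based start, matched residues), ...]}.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    hits = {}
    for name, pattern in MOTIFS.items():
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in lookahead.finditer(protein)]
    return hits


if __name__ == "__main__":
    # Hypothetical toy peptide, used only to exercise the patterns.
    toy_peptide = "MAAGHVDSGKSTAAADAPGHAAANKMDAA"
    for motif, found in find_motifs(toy_peptide).items():
        print(motif, found)
```

Short patterns such as Asp-X-X-Gly match by chance quite often in real proteins, so hits of this kind are only candidates; the assignments in the text rest on alignment with the E. coli EF-Tu structure rather than on pattern matching alone.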
These domains may correspond to the 6th and 7th exons of the human EF-1α gene, respectively.

EF-Tu in E. coli and EF-1α in yeast are among the most abundant proteins in these organisms: 4-6% of total soluble proteins in E. coli and yeast are EF-Tu or EF-1α, respectively (2, 8). In mammalian cells, EF-1α is also abundant in almost all kinds of tissues, and the EF-1α gene can be regarded as one of the so-called “housekeeping” genes. Using the nuclear extracts from HeLa cells, the promoter of the human EF-1α gene could direct in vitro transcription at least %fold more effectively than the adenovirus major late promoter (Fig. 6), which indicates that the promoter of the EF-1α gene is a strong promoter. Like ribosomal protein genes (39), the major site of transcription initiation of the EF-1α gene is located at a cytosine residue which is embedded in a tract of consecutive pyrimidines (Fig. 4). At 24 nucleotides upstream of the transcription initiation site, there is a typical “TATA” box, which is usually not present in the promoters of housekeeping genes, including ribosomal protein genes (39). The Sp1 binding site (G/TGGGCGGG/AG/AC/T) (40) is repeated three times in the 5'-flanking region and five times in the first intron of the EF-1α gene. Furthermore, a sequence similar to the AP-1 binding site (C/GTGACTC/AA) (41) is located within the first intron. It will be interesting to study whether these sequences are actually involved in the transcriptional regulation of the EF-1α gene.

REFERENCES

1. Kaziro, Y. (1978) Biochim. Biophys. Acta 505, 95-127
2. Arai, K. I., Kawakita, M., and Kaziro, Y. (1972) J. Biol. Chem. 247, 7029-7037
3. Arai, K., Clark, B. F. C., Duffy, L., Jones, M. D., Kaziro, Y., Laursen, R. A., L'Italien, J., Miller, D. L., Nagarkatti, S., Nakamura, S., Nielsen, K. M., Petersen, T. E., Takahashi, K., and Wade, M. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 1326-1330
4. Jurnak, F. (1985) Science 230, 32-36
5. la Cour, T. F. M., Nyborg, J., Thirup, S., and Clark, B. F. C. (1985) EMBO J. 4, 2385-2388
6. Nagata, S., Iwasaki, K., and Kaziro, Y. (1977) J. Biochem. (Tokyo) 82, 1633-1646
7. Slobin, L. I., and Moller, W. (1976) Eur. J. Biochem. 69, 351-375
8. Thiele, D., Cottrelle, P., Iborra, F., Buhler, J.-M., Sentenac, A., and Fromageot, P. (1985) J. Biol. Chem. 260, 3084-3088
9. Piechulla, B., and Kuntzel, H. (1983) Eur. J. Biochem. 132, 235-240
10. Viel, A., Dje, M. K., Mazabraud, A., Denis, H., and le Maire, M. (1987) FEBS Lett. 223, 232-236
11. Mattaj, I. W., Coppard, N. J., Brown, R. S., Clark, B. F. C., and De Robertis, E. M. (1987) EMBO J. 6, 2409-2413
12. Jaskunas, S. R., Lindahl, L., Nomura, M., and Burgess, R. R. (1975) Nature 257, 458-462
13. Nagata, S., Nagashima, K., Tsunetsugu-Yokota, Y., Fujimura, K., Miyazaki, M., and Kaziro, Y. (1984) EMBO J. 3, 1825-1830
14. Nagashima, K., Kasai, M., Nagata, S., and Kaziro, Y. (1986) Gene (Amst.) 45, 265-273
15. Cottrelle, P., Thiele, D., Price, V. L., Memet, S., Micouin, J.-Y., Marck, C., Buhler, J.-M., Sentenac, A., and Fromageot, P. (1985) J. Biol. Chem. 260, 3090-3096
16. Schirmaier, F., and Philippsen, P. (1984) EMBO J. 3, 3311-3315
17. Nagata, S., Tsunetsugu-Yokota, Y., Naito, A., and Kaziro, Y. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 6192-6196
18. Van Hemert, F. J., Amons, R. A., Pluymus, W. J. M., Van Ormondt, H., and Moller, W. (1984) EMBO J. 3, 1109-1113
19. Brands, J. H. G. M., Maassen, J. A., Van Hemert, F. J., Amons, R., and Moller, W. (1986) Eur. J. Biochem. 155, 167-171
20. Lenstra, J. A., Van Vliet, A., Arnberg, A. C., Van Hemert, F. J., and Moller, W. (1986) Eur. J. Biochem. 155, 475-483
21. Hovemann, B., Richter, S., Walldorf, U., and Cziepluch, C. (1988) Nucleic Acids Res. 16, 3175-3194
22. Walseth, T. F., and Johnson, R. A. (1979) Biochim. Biophys. Acta 562, 11-31
23. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
24. Okayama, H., and Berg, P. (1983) Mol. Cell. Biol. 3, 280-289
25. Messing, J. (1983) Methods Enzymol. 101, 20-78
26. Gross-Bellard, M., Oudet, P., and Chambon, P. (1973) Eur. J. Biochem. 36, 32-38
27. Lawn, R. M., Fritsch, E. F., Parker, R. C., Blake, G., and Maniatis, T. (1978) Cell 15, 1157-1174
28. Matsushime, H., Wang, L.-H., and Shibuya, M. (1986) Mol. Cell. Biol. 6, 3000-3004
29. Dignam, J. D., Lebovitz, R. M., and Roeder, R. G. (1983) Nucleic Acids Res. 11, 1475-1489
30. Jones, C. W., and Kafatos, F. C. (1980) Nature 284, 635-638
31. Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
32. Gupta, K. C., and Kingsbury, D. W. (1984) Nucleic Acids Res. 12, 3829-3841
33. Nishioka, Y., Leder, A., and Leder, P. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 2806-2809
34. Hollis, G. F., Hieter, P. A., McBride, O. W., Swan, D., and Leder, P. (1982) Nature 296, 321-325
35. Dudov, K. P., and Perry, R. P. (1984) Cell 37, 457-468
36. Wiedemann, L. M., and Perry, R. P. (1984) Mol. Cell. Biol. 4, 2518-2528
37. Montandon, P.-E., and Stutz, E. (1983) Nucleic Acids Res. 11, 5877-5892
38. Dever, T. E., Glynias, M. J., and Merrick, W. C. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 1814-1818
39. Dudov, K. P., and Perry, R. P. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 8545-8549
40. Kadonaga, J. T., Jones, K. A., and Tjian, R. (1986) Trends Biochem. Sci. 11, 20-23
41. Lee, W., Mitchell, P., and Tjian, R. (1987) Cell 49, 741-752
42. Yokota, T., Sugisaki, H., Takanami, M., and Kaziro, Y. (1980) Gene (Amst.) 12, 25-31