Isolation, Characterization, and Mapping to Chromosome 19 of the ...

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1985 by The American Society of Biological Chemists, Inc.

Vol. 260, No. 10, Issue of May 25, pp. 6240-6247, 1985. Printed in U.S.A.

Isolation, Characterization, and Mapping to Chromosome 19 of the Human Apolipoprotein E Gene*

(Received for publication, March 5, 1984)

Hriday K. Das‡, Joseph McPherson§, Gail A. P. Bruns¶||, Sotirios K. Karathanasis§||, and Jan L. Breslow‡

From the §Metabolism Division and ¶Genetics Division, Children's Hospital, the ||Department of Pediatrics, Harvard Medical School, Boston, Massachusetts 02115, and the ‡Laboratory of Biochemical Genetics and Metabolism, Rockefeller University, New York, New York 10021

The human apo-E gene has been isolated from a λ phage library using as a probe the previously reported apo-E cDNA clone pE-301. λ apo-E was mapped and subcloned, and the apo-E gene was completely sequenced. The DNA sequence was compared with that of a near full length cDNA clone pE-368 and revealed three introns. The first intron was in the region that corresponds to the 5' untranslated region of apo-E mRNA. The second intron interrupted the codon specifying amino acid -4 of the apo-E signal peptide. The third intron interrupted the codon specifying amino acid 61 of the mature protein. Analysis of the DNA sequence revealed four Alu sequences. Two were in opposite orientations in the second intron, and one each occurred in the regions 5' and 3' to the apo-E gene. There were two base differences between the apo-E gene sequence and the sequence derived from the cDNA clones. At the codon for amino acid residue 112, the apo-E gene contained CGC, specifying Arg, whereas the cDNA contained TGC, specifying Cys. The other base difference was in the area corresponding to the 5' untranslated region of apo-E mRNA. Apo-E is commonly polymorphic in the population and the data suggest that the genomic clone was derived from the ε4 apo-E allele, whereas the cDNA clones were derived from the ε3 apo-E allele. S1 nuclease protection and primer extension experiments allowed the tentative assignment of the cap site of apo-E mRNA to the A approximately 44 base pairs upstream of the GT that begins the first intron. The sequence TATAATT was identified beginning 33 base pairs upstream of the proposed cap site and is presumably one element of the apo-E promoter. Finally, the apo-E gene was mapped in the human genome to chromosome 19 through the use of DNA probes and human-rodent somatic cell hybrids. This information will provide the foundation for studies of normal apo-E gene expression and mutations affecting this process. In addition, the mapping of apo-E to human chromosome 19 should allow assessment of the relationship of the apo-E gene to other genes, including apolipoprotein genes on this chromosome.

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

¹ The abbreviations used are: bp, base pairs; kb, kilobases; Pipes, 1,4-piperazinediethanesulfonic acid.

Apo-E is one of the protein constituents of plasma lipoprotein particles (1). It mediates lipoprotein catabolism by binding to high affinity cell surface receptors (2, 3). Mature apo-E is a 299-amino acid polypeptide of known primary amino acid sequence (4). High resolution 2-dimensional polyacrylamide gel electrophoresis of human plasma apo-E reveals several isoproteins which differ in both size and/or charge (5, 6). These isoproteins are the result of both common genetic variation in the apo-E polypeptide in the human population and post-translational modification of apo-E with carbohydrate chains containing sialic acid (5-11). Apo-E from different individuals occurs as six different phenotypes, which from family studies were shown to be the result of a single genetic locus for apo-E with three common alleles, designated ε4, ε3, and ε2 (5, 6, 12). The most common allele, ε3, occurs with a frequency of approximately 0.75, whereas the other two alleles occur with frequencies of from 0.08 to 0.15 (5, 13-16). Familial type III hyperlipoproteinemia, a disease characterized by hyperlipidemia, xanthomatosis, and premature atherosclerosis, is associated in over 90% of cases with homozygosity for the ε2 apo-E allele, which only occurs in 1-2% of the general population (5-7, 12, 17). The polymorphic amino acid residues responsible for the different electrophoretic forms of apo-E have recently been determined. The ε3 allele specifies Cys-112, Arg-145, Lys-146, and Arg-158 (4). The ε4 allele differs from ε3 by having an Arg-112 (4, 10). The ε2 allele has been shown to be heterogeneous. The most common ε2 allele has a Cys-158, but ε2 alleles resulting from Cys-145 or Gln-146 have been observed (4, 10, 11, 18). All the ε2 allele gene products have been shown to have altered binding to lipoprotein receptors, which is consistent with this mutation causing the lipoprotein abnormalities in patients with type III hyperlipoproteinemia (11, 18, 19). However, most individuals homozygous for the ε2 allele do not express hyperlipidemia, and it is clear that other environmental, endocrine, or genetic factors are involved in disease expression (5, 13-16, 20-23). Recently, a kindred was described containing several siblings who had no detectable plasma apo-E. These individuals expressed the type III hyperlipoproteinemia phenotype (24).

In previous studies, we have reported the isolation and characterization of human apo-E cDNA clones (25-27). In the current study, these clones were used to isolate and characterize the apo-E gene.

EXPERIMENTAL PROCEDURES

cDNA Clones Used as Probes and for DNA Sequence Comparisons-Two apo-E cDNA clones were used in these studies. The first of these, pE-301, contains sequences that correspond to the mRNA that codes for apo-E amino acids 81 through 299, the 145-bp¹ 3' noncoding region, and a portion of the poly(A) tail (25, 26). The second clone, pE-368, corresponds to the mRNA that codes for 60 bp of the apo-E 5' noncoding region, the apo-E 18-amino acid signal peptide, the entire 299 amino acids of the mature protein, the 3' noncoding region, and a portion of the poly(A) tail (27). The sequences for both of these cDNA clones have been previously reported (25-27).

Library Screening and Preparation of Cloned DNA-The human genomic library in λ vector Charon 4A obtained from T. Maniatis (28) was screened with the nick translated (29) insert of apo-E cDNA clone pE-301 according to previously described methodology (30). Large scale growths of clone λ apo-E were carried out in 250 ml of 1% NZ-amine (an enzymatic digest of casein; Humko, Norwich, NY), 0.5% NaCl, 0.5% yeast extract, 0.01% casamino acid, 10 mM MgSO4 broth with the bacterial strain CSH18 used as host. The recombinant phage was precipitated with polyethylene glycol and purified on CsCl gradients. Phage DNA was prepared by phenol extraction and ethanol precipitation.

Restriction Endonuclease Mapping of Genomic and Cloned DNA-Chromosomal DNA was prepared from peripheral blood lymphocytes (31). Restriction endonuclease cleavage and blotting of genomic or cloned DNA was done as described (31) using the nick translated insert of apo-E cDNA clone pE-301 as the hybridization probe.

Plasmid or M13 Replicative Form Preparations and DNA Sequencing-Double digests of the λ apo-E were performed and the 5.6-kb BamHI-HindIII and the 4.0-kb PstI-EcoRI fragments were each cloned into the plasmid vector PUC 9 (32). Recombinant plasmids were transfected into Escherichia coli strain HB101, and plated on 1 × YT in the presence of ampicillin (75 μg/ml). Single colonies were analyzed for foreign DNA inserts and those containing inserts were grown in 500 ml of 2 × YT broth supplemented with ampicillin (75 μg/ml). Plasmid DNA was prepared by the alkaline lysis method (33), sonified, and fractionated in a Sepharose 4B column. Fragments of 400-800 bp in length were treated with S1 nuclease and 3' filled with 4 deoxynucleoside triphosphates in the presence of DNA polymerase I (large fragment). These fragments were cloned in HincII cut replicative form of M13mp7. Clones containing gene fragments were isolated by hybridization to nick translated apo-E gene fragments. Template DNA was made from these M13 clones and sequenced using the 17-nucleotide long universal primer by the dideoxy method of Sanger (34, 35). Various DNA fragments from these plasmids were isolated on low melt agarose and either 3' or 5' end labeled for DNA sequence analysis by the Maxam and Gilbert (36) method. The entire nucleotide sequence was determined on both strands.

Analysis for Internal Nucleotide Sequence Homology-Analysis for internal apo-E gene nucleotide sequence homologies was performed with the aid of the SEQ computer program (37). Repeated sequences were identified and aligned, and a consensus sequence was generated that contained nucleotides occurring with a frequency of at least 50% in the corresponding nucleotide positions of these homologous segments. A comparison was made between the consensus sequence and each of the 66-bp repeats in the apo-E gene, as well as with the consensus sequence for the apo-A-I 66-bp repeats (38).

S1 Nuclease Protection Experiments-The 4.0-kb PstI-EcoRI fragment was digested with HpaII and 5' end labeled with polynucleotide kinase and [γ-³²P]ATP (34) and the fragments separated on a 7.5% polyacrylamide gel. The 700-bp HpaII fragment located immediately upstream of the first intron was electroeluted, digested with HaeIII, and these fragments separated on a 7.5% polyacrylamide gel. The downstream 125-bp HaeIII-HpaII fragment labeled at the 5' end of the noncoding strand was electroeluted, ethanol precipitated, washed with 70% ethanol, and resuspended in 50 μl of double-distilled H2O. A portion of this material was used for DNA sequencing by the Maxam and Gilbert method (36). Another portion was used for S1 nuclease protection (39). Total RNA was isolated from a human hepatoma cell line known to synthesize apo-E, HepG2 cells (27, 40). Twenty μg of total HepG2 RNA was desiccated and then dissolved in 22 μl of hybridization buffer (80% deionized formamide, 0.05% sodium dodecyl sulfate, 1 mM EDTA, and 0.01 M Pipes (pH 6.4)). One μl containing 10,000 cpm of the radiolabeled 125-bp HaeIII-HpaII fragment was then added followed by denaturation at 90 °C for 2 min. NaCl was added to a final concentration of 0.4 M. Final volume was 25 μl and hybridization was carried out for 1 h at 65 °C followed by slow cooling to 42 °C and incubation continued for 24 h. At the end of hybridization, replicate incubation mixtures received 300 μl of S1 buffer (0.3 M NaCl, 0.03 M NaOAc (pH 4.5), 0.003 M ZnSO4) containing 100, 150, 200, and 250 units of S1 nuclease (New England Nuclear), respectively, and incubated for 1 h at 25 °C. The reaction was stopped with EDTA (final concentration 10 mM), and, after ethanol precipitation and wash with 70% ethanol, the DNA was resuspended in 85% formamide, 0.01% xylene cyanol, 0.01% bromphenol blue, heated at 65 °C for 10 min, and electrophoresed on a DNA sequencing gel.

Primer Extension Experiments-A 24-nucleotide long primer complementary to the mRNA sequence coding for the first 8 amino acids of apo-E starting from the methionine was synthesized. The primer was labeled at the 5' end using γ-³²P-labeled ATP and was extended in the presence of reverse transcriptase and poly(A+) RNA isolated from HepG2 cells following the procedure of Das (41).

Chromosomal Mapping Utilizing Somatic Cell Hybrids-The primary hybrid clones used for the DNA mapping panels were derived from fusion of the hypoxanthine phosphoribosyltransferase-deficient hamster cell E36 or mouse cell RAG with white blood cells or fibroblasts from 4 unrelated individuals. Two of the white blood cell donors were female carriers of different, reciprocal X/19 translocation chromosomes. These included: the X/19W translocation t(X;19)(q23-23::q13) (42); and the X/19B translocation t(X;19)(q1::?p/q13). In the latter translocation, the breakpoint on the 19 has not been precisely defined due to the absence of easily distinguishable landmarks on this chromosome. The other 2 human donors were karyotypically normal males. The hybrid clones have been characterized by isozyme and cytogenetic techniques for their human chromosome complements (43, 44). In addition, the segregation of 19 of the autosomes and the X chromosome in the DNAs used in this study has been analyzed with cloned DNA probes (45-47). The 19q+ and Xq- translocation chromosomes were monitored in the hybrid clones by isozyme and cytogenetic techniques and with cloned X-chromosome DNAs. The chromosome 19 isozyme markers were phosphohexose isomerase, lysosomal α-mannosidase, lysosomal deoxyribonuclease, and peptidase D; those for the X were hypoxanthine phosphoribosyltransferase, glucose-6-phosphate dehydrogenase, phosphoglycerate kinase, and α-galactosidase (48). DNA from parental and hybrid cells was digested to completion with EcoRI, separated by electrophoresis on 0.8% agarose gels, and transferred to nitrocellulose. Prehybridization and hybridization were carried out as previously described using nick translated plasmid DNA from pE-368 as the probe (49).

RESULTS

Isolation and Characterization of the Cloned Human Apo-E Chromosomal Gene-Screening of 10⁶ recombinants of the human genomic library produced 2 putative apo-E clones. One of these clones, designated λ apo-E, was selected for further analysis. Southern blotting analysis of λ apo-E was compatible with similar analysis of normal chromosomal DNA (data not shown). These data indicate that the genomic clone λ apo-E contains a 14-kb insert extending from 7 kb upstream to 5 kb downstream of the 2-kb EcoRI fragment containing sequences homologous to the pE-301 cDNA clone.

Analysis of the Apo-E Gene Sequence-Appropriate subfragments of the region of λ apo-E containing the apo-E gene were subcloned into PUC 9, and fragments of these subclones were used for further mapping and for DNA sequence analysis (see Fig. 1). Comparison of the DNA sequence shown in Fig. 2 with the sequence previously obtained from the cDNA clone pE-368 revealed the presence of three intervening sequences (IVS) in the apo-E gene. IVS-1 (757 bp) occurs in the 5' noncoding region, IVS-2 (1093 bp) interrupts the codon specifying amino acid -4 of the signal peptide of apo-E, and IVS-3 (581 bp) interrupts the codon specifying the +61 amino acid of the mature apo-E. All three intervening sequences begin with GT and end with AG, and conform with the consensus found around exon-intron splice junctions (50). Intron 2 contains 2 Alu sequences in opposite orientations, each flanked by short direct terminal repeats (51). Further comparison between the apo-E gene sequence and that derived from previously characterized cDNA clones (25-27) revealed two base differences (see Fig. 2). The codon for amino acid residue 112 in the apo-E gene specifies CGC, which codes for Arg, whereas at this location the cDNA clones specify TGC, which codes for Cys. The second difference was in the region that corresponded to the 5' untranslated region of apo-E mRNA.


FIG. 1. Restriction map of clone λ apo-E and sequence analysis strategy. Panel A, λ apo-E genomic clone. The long and short arms of the λ vector are shown by thick lines. Panels B and C, blow-up of the region containing the apo-E gene. The thick lines correspond to the apo-E exons. As shown in panels C and D, these were derived by a comparison of the chromosomal gene and the sequence present in mature apo-E mRNA as derived from the cDNA clones. Initiation and termination codons are indicated. Panel E, DNA sequence analysis strategy. The λ apo-E derived DNA fragment bearing the apo-E gene was used for DNA sequence analysis. The thick lines correspond to the apo-E exons. The exons and junctions of the introns were sequenced. Solid arrows indicate the direction and extent of nucleotide sequence determinations using the Maxam and Gilbert method (36). The rest of the sequence was determined by the dideoxy method of Sanger (34, 35). Relevant restriction sites are as follows: R, EcoRI; B, BamHI; H3, HindIII; H, HpaII; P, PstI; FI, HinfI; T, TaqI; A, HaeIII; Rs, RsaI; and S, Sau3AI.

The cDNA clone contained a G 59 bp upstream of the ATG that initiates translation, whereas the genomic clone contains a C in this position. The region 3' to the coding sequence shows the presence of one characteristic polyadenylation signal AATAAA (50). In this region, comparison of the apo-E cDNA clones with the λ apo-E clone indicates that the A located 20 nucleotides downstream from the last A in the polyadenylation signal is the site at which polyadenylation of the apo-E mRNA occurs. Approximately 150 bp 3' to the polyadenylation site, an Alu sequence is found which is flanked by a short direct terminal repeat.

In the region 5' to the coding sequence, S1 nuclease protection experiments were carried out to identify the cap site of apo-E mRNA. A 125-bp DNA fragment just upstream of the first intron was radiolabeled at the 5' end of the noncoding strand and annealed to total RNA from a human hepatoma cell line known to synthesize apo-E (27, 40). After S1 nuclease digestion, protection of approximately 44 bp of DNA was achieved (Fig. 3). A comparison of this DNA region with the consensus sequence found around cap sites of other genes suggests that RNA capping occurs at the A 44 bp upstream from the GT that begins the first intron. This was confirmed by primer extension experiments in which a 24-nucleotide long primer complementary to the mRNA sequence coding for the first 8 amino acids of apo-E was extended in the presence of poly(A+) RNA isolated from HepG2 cells. The pattern of extension is shown in Fig. 4 and indicates a major extension product 91 bp long (including the primer). This implies that transcription initiation occurs at the A 67 bp upstream from the mRNA initiation codon and is compatible with the results of the S1 nuclease protection experiments. Both the S1 protection and primer extension experiments indicate some degree of heterogeneity in the transcription initiation site of the apo-E gene. The sequence TATAATT, a TATA box, occurs beginning 33 bp upstream of the proposed cap site and conforms with the consensus sequence for one of the elements of eukaryotic gene promoters (50). Unlike other genes, a CAT box sequence was not observed upstream of the TATA box (50). Approximately 370 bp 5' to the cap site, another Alu sequence was identified. In this case, the Alu sequence was flanked by relatively long direct terminal repeats.

Demonstration of Apo-E Gene Internal Nucleotide Sequence Homologies and Their Relationship to Similar Structures in the Apo-A-I Gene-Computer analysis of the apo-E DNA sequence revealed that the fourth exon was composed largely of eight 66-bp tandemly repeated DNA sequences. The repeated DNA segment encodes amino acids 62-237 of apo-E. A homology comparison of the consensus sequence for the repeats (generated as described under "Experimental Procedures") with each of the repeated segments generated matches at from 51 to 75% of the 66 bases (Fig. 5). This comparison was also done by the SEQ program, and 5 of the 8 repeats were highly homologous with the consensus, with E values below 0.001. The E values for the other three repeats, 1, 7, and 8, were >1, >1, and 0.014, respectively. The consensus sequence for the apo-E gene repeats was compared to the consensus for the apo-A-I gene repeats (38). This showed a 72% homology between the repeats which was highly significant according to the SEQ program,