
Mol Gen Genet (1991) 229:467-478 002689259100332X © Springer-Verlag 1991

The barley genes Acl1 and Acl3 encoding acyl carrier proteins I and III are located on different chromosomes

Lars Hansen and Penny von Wettstein-Knowles

Department of Physiology, Carlsberg Laboratory, Gamle Carlsberg Vej 10, DK-2500 Copenhagen Valby, Denmark and Institute of Genetics, University of Copenhagen, Øster Farimagsgade 2A, DK-1353 Copenhagen K, Denmark

Received March 18, 1991

Summary. Acyl carrier protein (ACP) is an essential cofactor for plant fatty acid synthesis. Three isoforms occur in barley seedling leaves. The genes Acl1 and Acl3 coding for the predominant ACP I and the minor ACP III, respectively, have been cloned and characterized, as has a full-length cDNA for ACP III. Both genes, extending over more than 2.5 kb, have a conserved mosaic structure of four exons and three introns which result in mRNAs of ca. 900 bases. Alignment of the DNA sequences demonstrates that homology is restricted to the two exons coding for the mature protein, whereas the remaining segments of the genes, including the transit peptide-coding domains, lack homology. Southern blot analyses demonstrate that Acl1 and Acl3 represent single copy genes located on chromosomes 7 and 1, respectively. Primer extension analyses identified multiple transcription start sites in both genes. The promoter regions are remarkably different; that of Acl3 resembles those for mammalian housekeeping genes in having a high G + C content plus three copies of an RNA polymerase II recognition GC element and in lacking correctly positioned TATA boxes. These features are in accordance with the hypothesis that Acl1 is specifically expressed in leaf tissue whereas Acl3 is a constitutively expressed gene.

Key words: Fatty acid synthesis - Nuclear-encoded chloroplast protein - Housekeeping gene - Tissue-specific gene - Multiple transcription sites

Offprint requests to: L. Hansen

Introduction

Acyl carrier protein (ACP) is one of the components of plant fatty acid synthetase (FAS), an enzyme complex involved in de novo synthesis of fatty acyl chains in chloroplasts and in proplastids of non-photosynthetic tissues (Stumpf 1987). ACP plays a central role in the cyclic series of reactions by which seven C2-units from malonyl-CoA and one C2-unit from acetyl-CoA are converted into the 16-carbon palmitic acid, as well as in subsequent reactions. Substrates, intermediates and products are all covalently bound by a thioester linkage to the 4'-phosphopantetheine prosthetic group of ACP. An ACP has also been implicated in de novo fatty acid synthesis in plant mitochondria (Chuman and Brody 1989).

The plant FAS complex resembles that in Escherichia coli as each of its separable components studied thus far is coded for by a different gene (Hansen 1987; Rose et al. 1987; Scherer and Knauf 1987; Safford et al. 1988; Boom and Cronan 1989; Schmid and Ohlrogge 1990; Siggaard-Andersen et al. 1991). These are localized in the nucleus, and their primary transcripts contain a sequence coding for an amino-terminal, presumed transit peptide. Recent studies in spinach (Spinacia oleracea) have confirmed that cleavage occurs during import into chloroplasts (Fernandez and Lamppa 1990). Evidence has also been presented that the prosthetic group can be added to the pertinent serine residue either in the cytoplasm or in the chloroplast (Elhussein et al. 1988; Post-Beittenmiller et al. 1989b; Fernandez and Lamppa 1990).

The role, if any, that is played in fatty acid synthesis by isoforms or isozymes of the various FAS components remains an intriguing question. While those of ACP have received most attention (for example see Battey and Ohlrogge 1990), isozymes of other components such as malonyl-CoA:ACP transacylase (Guerra and Ohlrogge 1986) and β-ketoacyl ACP synthase (Siggaard-Andersen et al. 1991) have been identified. In barley (Hordeum vulgare) one major and two minor isoforms of ACP are found in seedling leaves. Primary sequence data for two of them isolated from chloroplasts, the major ACP I and the minor ACP II, revealed disparities that could only be accounted for if they were products of different genes (Høj and Svendsen 1984). Isolation and characterization of cDNA clones confirmed the expression in seedling leaves of ACP I and identified another minor isoform, ACP III (Hansen 1987), which may correspond to the third peak of activity seen in whole leaf extracts (Høj and Svendsen 1984).

In the present study we have cloned, characterized and mapped the genes coding for ACP I and ACP III, and also describe a full-length cDNA coding for ACP III. These acyl carrier genes have been designated Acl1 and Acl3 since the symbol Acp is used in many organisms for acid phosphatases (von Wettstein-Knowles 1989; O'Brien 1990). While their conserved mosaic structure is similar to that of two seed-expressed rape (Brassica napus) ACP genes (de Silva et al. 1990) and an Arabidopsis thaliana ACP gene (Post-Beittenmiller et al. 1989a), the ACP gene family of barley differs markedly from those in the dicots. Sequence analyses of the Acl1 and Acl3 promoters reveal some interesting features in connection with the recent proposal of Schmid and Ohlrogge (1990) that in spinach the more abundant leaf isoform ACP I is a tissue-specific gene product whereas ACP II is determined by a housekeeping gene.
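The promoter features at issue here (overall G + C content and the number of copies of a short GC element) are straightforward to tally from any sequence. The following is a minimal sketch and not the procedure used in this study: the function names are illustrative, the GC-element core GGGCGG is an assumption about which motif is meant, and the example string is made up rather than taken from the barley sequences.

```python
# Sketch: G + C content and motif counting on both strands.
# Assumptions (not from the paper): function names, the motif GGGCGG,
# and the toy sequence used below.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def count_motif_both_strands(seq: str, motif: str) -> int:
    """Count a motif on the given strand and on the opposite strand.

    A palindromic motif reads the same on both strands, so it is
    counted only once per occurrence.
    """
    seq, motif = seq.upper(), motif.upper()
    total = count_overlapping(seq, motif)
    rc = reverse_complement(motif)
    if rc != motif:
        total += count_overlapping(seq, rc)
    return total


toy_promoter = "AAGGGCGGTTCCGCCCAA"  # made-up example, not barley data
toy_gc = gc_content(toy_promoter)
toy_gc_boxes = count_motif_both_strands(toy_promoter, "GGGCGG")
```

Counting on both strands matters because such elements can occur in either orientation; a window-by-window version of `gc_content` would be the natural extension for comparing two promoter regions.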

Materials and methods

Biological material. H. vulgare cv. Svalöfs Bonus barley for isolation of DNA from seedling leaves was grown as previously described (Hansen 1987). H. vulgare cv. Betzes, Triticum aestivum cv. Chinese Spring and the disomic addition lines of Betzes barley to Chinese Spring wheat (Islam et al. 1981) were grown under constant light at 22°C for 6 days before harvesting. The 25 seeds of each addition line were kindly provided by J.B. Johansen and I. Linde-Laursen. From the resulting seedlings repeated collections of leaf tissue were made every 7-10 days until 2 × 5 g were obtained. Storage was at -20°C. Tissue from the progeny of a karyotyped plant was pooled, as the addition lines are not completely stable (Islam et al. 1981). The barley and wheat seeds were planted in vermiculite, the addition line seeds in soil (Simpson and von Wettstein-Knowles 1980). RNA was isolated from similar-sized Bonus seedling leaves grown (i) for 6 days in the dark plus 18 h light at 22°C or (ii) for 16 days (18 h light at 15°C/6 h dark at 10°C). Three E. coli strains were used: DH5α (Bethesda Research Laboratories), XL1-Blue (Stratagene) and ER1647 (New England Biolabs). Unless specified otherwise all DNA and RNA modifying enzymes were purchased from Boehringer Mannheim.

Preparation of genomic DNA. Barley and wheat genomic DNA were isolated by grinding leaf tissue under liquid N2 and extracting the DNA while shaking gently for 1 h at room temperature in 100 mM Tris-HCl pH 8.5, 100 mM NaCl, 50 mM EDTA, 2% w/v SDS and 10 mg/ml proteinase K. Deproteinization using phenol and chloroform/isoamyl alcohol preceded isopropanol precipitation. After treatment with RNase A, phenol extraction and ethanol precipitation were repeated to obtain DNA suitable for digestion.

Southern blot analysis and chromosome mapping. Bonus or Betzes genomic DNA (10 µg) was digested and fractionated in 0.7% agarose before transfer to a nitrocellulose filter (Southern 1975). Radioactive probes were made either by nick translation (Sambrook et al. 1989) or by random priming (Feinberg and Vogelstein 1983). The ACP I probes used were either a 769 bp EcoRI fragment or a 325 bp EcoRI-PvuII 5' end fragment of pACP11, and the ACP III probe a 493 bp EcoRI-HincII fragment of pACP1 (Fig. 1 C). The filters were prehybridized in 5 × SSC (Sambrook et al. 1989), 5 × Denhardt's solution (Sambrook et al. 1989), 1% w/v SDS and 100 µg/ml sonicated denatured salmon sperm DNA for ~6 h at 65°C and hybridized in 10-15 ml of the prehybridization solution with 1-5 ng/ml (approximately 10^6 dpm/ml) of the radioactive probe for 12-18 h. The filters were washed with 2 × SSC at room temperature followed by 0.2 × SSC, 1% w/v SDS at 65°C and autoradiographed for 3-6 days. Chromosomal localization of the genes was achieved as described above using > 20 µg DNA from Chinese Spring and the addition lines to attain approximately the same hybridization intensity for a single copy sequence as that attained with 10 µg Betzes DNA.

Construction of genomic libraries. Size-selected genomic libraries were constructed in the cloning vector λZAP II (Stratagene). Two libraries were obtained by complete digestion of Bonus DNA with the restriction enzyme BglII, followed by separation of the DNA fragments in a 0.7% preparative agarose gel. Aliquots of electroeluted fractions containing fragments of 2.5-3.0 kb and 5.0-5.5 kb were hybridized to the ACP I and III probes specified above. The genomic DNA fragments were partially filled-in before ligating into XhoI predigested and partially filled-in vector arms according to the recommendations of the manufacturer. The ligated DNA was packaged using Gigapack Gold extracts (Stratagene) and propagated in ER1647. A third library was constructed by digestion of Bonus DNA to completion with the restriction enzyme XbaI. The digested DNA was ligated into XbaI predigested and dephosphorylated vector arms according to the manufacturer's protocol. Following packaging using Gigapack Gold extracts, lambda vectors with inserts up to 10 kb were selected and propagated in ER1647.

Screening of the genomic libraries, and subcloning and sequencing of positive clones. Nitrocellulose filter replicas were made of 6 × 10^4 to 10^5 plaque forming units according to Benton and Davis (1977). The initial screening and purification of positive plaques were performed in ER1647 while the final excision of the plasmid pBluescript SK(+) was carried out in the strain XL1-Blue according to the recommendations of the manufacturer. Duplicate filters were prehybridized and hybridized, as described for Southern blot analysis, using the EcoRI-PvuII fragment of pACP11 and the EcoRI-HincII fragment of pACP1 as probes (Fig. 1 C). The filters were rinsed in 2 × SSC and washed with 0.5 × SSC, 1% w/v SDS at 65°C before autoradiography. Positive clones isolated from the two BglII libraries were sequenced in the plasmid pBluescript SK(+). Positive clones from the

XbaI library were mapped by restriction enzyme analysis before a 2 kb XbaI-SacI 5' end fragment of the ACP I clone and a 3 kb XbaI-EcoRI 5' end fragment of the ACP III clone (Fig. 1 A, B) were subcloned into the vector pGEM-7Zf(+) (Promega). Both strands of the isolated genomic clones were sequenced using either synthetic primer walking or deletions of the clones made with ExoIII nuclease (Henikoff 1984). Templates for sequencing were prepared either as CsCl-purified plasmid DNA (Sambrook et al. 1989) or by making mini preparations of the plasmid (Hattori and Sakaki 1986). The clones were sequenced by the dideoxy chain-termination method (Sanger et al. 1977) as double-stranded plasmid DNA using the Sequenase kit (United States Biochemical Co.).

Primer extension. Isolation of RNA from seedling leaves of Bonus was performed using guanidinium thiocyanate, and poly(A)+ RNA was isolated by oligo(dT) cellulose chromatography essentially as described in Sambrook et al. (1989). Three 40-mer oligonucleotides served as primers: extI.1, comprising nucleotides 63-102 in Acl1; extIII.1, comprising nucleotides 85-124; and extIII.2, comprising nucleotides 74-113, both in Acl3. The primers were desalted and purified on a 15% acrylamide-7 M urea gel before end-labelling (Sambrook et al. 1989). The transcription start site was determined by annealing 5 × 10^5 dpm labelled primer to 5-10 µg of poly(A)+ RNA for 10 min at 80°C followed by overnight incubation at 37°C (Sambrook et al. 1989). The annealed primers were extended using 200 units Moloney murine leukaemia virus reverse transcriptase (Bethesda Research Laboratories) in 20 µl of the following buffer: 50 mM Tris-HCl (pH 8.3), 75 mM KCl, 3 mM MgCl2, 1 mM dithiothreitol, 1 mM each of dGTP, dATP, dTTP, dCTP and 25 units placental RNase inhibitor (Promega). The reaction mixture was treated with RNase A and RNase T1, extracted with phenol plus chloroform/isoamyl alcohol and ethanol precipitated. One-quarter or one-half of the extension reaction product mixture was analysed on a 6% acrylamide-7 M urea sequencing gel together with sequencing reaction products obtained with the same oligonucleotides. For determination of each transcription start site, several independent experiments were performed.

Isolation and characterization of a full-length cDNA clone for ACP III. A full-length sequence for an ACP III cDNA clone was obtained by screening of a λZAP II cDNA library constructed from 6-day-old greening leaves of Bonus barley (Siggaard-Andersen et al. 1991). The 298 bp EcoRI-KpnI fragment from pACP1 (Fig. 1 C) was labelled by random priming (see above). Prehybridization and hybridization were carried out as for Southern blot analysis. One positive ACP III clone (pACP31) was obtained and its 653 bp insert was subcloned into pUC18 for sequencing.

Fig. 1 A-C. Maps of the barley genes Acl1 and Acl3. A and B show the restriction enzyme map (above) and the gene structure (below) for Acl1 and Acl3, respectively. Thin open boxes represent non-transcribed regions, medium size open boxes represent introns and thick boxes represent processed transcripts of which open areas are non-translated regions, dashed areas are transit peptides and black areas are mature peptides with stars representing the position of the prosthetic group. ATTGA and CAACT are putative CCAAT elements, while TATA and CATA are potential TATA elements. Flags represent oriented repeated regions. C Full-length processed transcripts of Acl1 coding for ACP I (left) and of Acl3 coding for ACP III (right). An A marks the position of a poly(A) tail. Other symbols are as in A and B. At the bottom are restriction maps of two cDNA clones (Hansen 1987) which served as sources of the probes used in the present study
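The size-selection step used for the genomic libraries (complete digestion, then recovery of fragments within a size window) can be mimicked in silico once a sequence is available. The sketch below is illustrative only and is not part of the published protocol: the function names are hypothetical, the molecule is assumed to be linear and fully digested, the cut offsets are the top-strand positions for the standard recognition sites (BglII A^GATCT, XbaI T^CTAGA, EcoRI G^AATTC), and the example sequence is a toy string.

```python
# Sketch: in-silico complete digest of a linear DNA molecule followed by
# size selection. Names and the toy sequence are assumptions for illustration.
import re

# enzyme -> (recognition site, cut offset on the top strand)
ENZYMES = {
    "BglII": ("AGATCT", 1),
    "XbaI": ("TCTAGA", 1),
    "EcoRI": ("GAATTC", 1),
}


def digest(seq: str, enzyme: str) -> list[str]:
    """Return the fragments produced by a complete digest of a linear molecule."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    # A lookahead finds every site, including overlapping ones. The sites
    # are palindromic, so scanning one strand is sufficient.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments: list[str], min_bp: int, max_bp: int) -> list[str]:
    """Keep fragments whose length lies within [min_bp, max_bp]."""
    return [f for f in fragments if min_bp <= len(f) <= max_bp]


# Toy molecule with two BglII sites (not barley sequence).
toy = "AAAA" + "AGATCT" + "C" * 10 + "AGATCT" + "GG"
toy_fragments = digest(toy, "BglII")
toy_selected = size_select(toy_fragments, 10, 20)
```

With real data the window would be given in base pairs matching the gel fractions, for example `size_select(fragments, 2500, 3000)` for the 2.5-3.0 kb fraction.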

Results

Acl1 and Acl3 are located on different chromosomes

The chromosomal location of the two genes was determined using the Chinese Spring wheat-Betzes barley addition lines (Islam et al. 1981). These disomic addition lines contain all 42 wheat chromosomes plus one pair of barley chromosomes 1, 2, 3, 4, 6, or 7. An addition line of chromosome 5 is not available as such plants are self-sterile. Southern blot autoradiograms clearly show that the 2.5 kb BglII band characteristic of Betzes DNA is also present in addition line 7 DNA (Fig. 2A), and that the 12 kb EcoRI band of Betzes DNA occurs in addition line 1 DNA (Fig. 2B). This indicates that the Acl1 and Acl3 genes are on barley chromosomes 7 and 1, respectively. Three bands are seen in Chinese Spring wheat genomic DNA after probing for Acl1 homologous sequences (Fig. 2A) and four bands after probing for Acl3 homologous sequences (Fig. 2B). These bands are due to Acl1 and Acl3 homologous genes in the A, B and D genomes of wheat (Devos et al. 1991). Digestions of the DNA of the wheat, barley and addition lines with XbaI and BamHI confirmed the chromosomal location of the genes.

Single bands hybridized to the Acl1 and Acl3 probes when genomic DNA from barley cultivars Betzes (Figs. 2 and 3) and Bonus (Hansen and von Wettstein-Knowles 1989) was cut with the restriction enzymes BamHI, EcoRI, XbaI and BglII. Cross-hybridization between the probes and the two genes did not occur under the hybridization conditions given. Furthermore, the weak bands in Figure 3 B cannot be attributed to cross-hybridization with the gene for ACP II (Hansen and Kauppinen 1991). The possibility cannot be excluded that these weak signals represent unknown ACP genes. Support for the proposal that Acl1 and Acl3 are single copy genes was obtained from an experiment to estimate the gene copy number of Acl1. Band intensity after probing Southern blots of Bonus DNA was compared to signals obtained from hybridization to the equivalent of 1, 2, 5 and 10 copies of the gene. Results of this titration imply the presence of 1-3 copies of Acl1 per haploid genome (data not shown).

Fig. 2A and B. Southern blot analyses of genomic DNA from the hexaploid wheat variety Chinese Spring (CS), the wheat-barley disomic addition lines (1), (2), (3), (4), (6), (7) and the diploid barley variety Betzes (B). Size markers used are EcoRI-HindIII fragments of λ DNA. Arrows indicate the positions of the diagnostic Betzes bands. In A BglII digests were probed with the EcoRI-PvuII fragment of pACP11 (Fig. 1C). In B EcoRI digests were probed with the EcoRI-HincII fragment of pACP1 (Fig. 1C)

Fig. 3A and B. Southern blot analyses of Betzes genomic DNA probed with A the EcoRI fragment of pACP11 (Fig. 1C) to detect Acl1 and B the EcoRI-HincII fragment of pACP1 (Fig. 1C) to detect Acl3. Size markers used are HindIII fragments of λ DNA
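The copy-number titration rests on a simple proportionality: the mass of a cloned fragment that corresponds to n copies per haploid genome equals the mass of genomic DNA loaded, scaled by the ratio of construct length to genome length, times n. The sketch below illustrates that arithmetic under stated assumptions; it does not reproduce the authors' calculation. The haploid genome size (5 × 10^9 bp) and the construct size (5000 bp) are placeholder values chosen for the example, not figures from this paper, and the function name is hypothetical.

```python
# Sketch: mass of cloned DNA equivalent to a given copy number in a
# genomic DNA sample. Genome and construct sizes below are assumed
# illustration values, not data from the study.


def copy_equivalent_pg(genomic_ug: float,
                       haploid_genome_bp: float,
                       construct_bp: float,
                       copies: float = 1.0) -> float:
    """Picograms of construct DNA matching `copies` per haploid genome.

    Mass per base pair is the same for genomic and cloned DNA, so the
    mass ratio equals the length ratio.
    """
    genomic_pg = genomic_ug * 1e6  # 1 microgram = 10^6 picograms
    return genomic_pg * construct_bp / haploid_genome_bp * copies


ASSUMED_GENOME_BP = 5e9     # assumption: order of magnitude for barley
ASSUMED_CONSTRUCT_BP = 5e3  # assumption: hypothetical plasmid plus insert

# Reconstruction series for 1, 2, 5 and 10 copies in 10 micrograms of DNA.
series = [
    copy_equivalent_pg(10, ASSUMED_GENOME_BP, ASSUMED_CONSTRUCT_BP, n)
    for n in (1, 2, 5, 10)
]
```

Under these assumed sizes, one copy equivalent in 10 µg of genomic DNA is on the order of 10 pg of construct, which is why such standards must be diluted carefully before loading.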

A full-length cDNA clone for ACP III reveals a 49-residue transit peptide and multiple polyadenylation sites

Earlier experiments led to the isolation of two partial cDNA clones for ACP III (Hansen 1987; Hansen and von Wettstein-Knowles 1989) which included only 27 amino acids of a potential transit peptide. As a prerequisite for analysis of the barley ACP gene family at the genomic level a full-length cDNA was required. Thus, another cDNA library constructed from greening seedling leaves was screened with the 5' end EcoRI-KpnI fragment of clone pACP1 (Fig. 1 C). One positive clone (pACP31) of 653 bp was isolated and sequenced in from the ends. An open reading frame was identified which started at nucleotide 73 and continued into the sequence coding for the incomplete transit peptide of pACP1. Sequencing in from the 3' end gave a sequence identical to the genomic sequence (see below). These studies revealed that the transit peptide for ACP III starts with Met-Ala, has a high serine content of 14.3% and is either 48 or 49 residues long (hereafter we will for simplicity assume 49). Since the mature ACP III protein has neither been sequenced nor its subcellular location established, the length of the transit peptide is speculation based upon the known cleavage sites for ACPs I and II in barley and spinach (Høj and Svendsen 1983, 1984; Kuo and Ohlrogge 1984; Ohlrogge and Kuo 1985). PreACP III thus has an amino-terminal region with all the attributes of a chloroplast transit peptide (von Heijne et al. 1989). Alignment of the 3' end non-coding regions of the two previously reported ACP III clones pACP1 (Hansen 1987) and pACP17 (Hansen and von Wettstein-Knowles 1989) and the clone pACP31 reported in this paper reveals that the poly(A) tract is located at three different positions (Figs. 1 C and 5). A putative polyadenylation signal AATAAG precedes the two 5' sites by 61 and 57 bases, respectively, while the sequence AATATG occurs 30 bases upstream from the third site.

Genomic clones for Acl1 and Acl3

Barley ACP genes were predicted to be relatively small based on Northern blot analyses (Hansen 1987), Southern blot analyses (Hansen and von Wettstein-Knowles 1989) and the finding that an Arabidopsis ACP gene is smaller than 1.5 kb (Post-Beittenmiller et al. 1989a). Therefore, size-selected libraries were made from BglII-digested DNA in which 2.5 kb and 5.5 kb fragments hybridized to Acl1 and Acl3 probes, respectively. Initial screening resulted in three positive clones for Acl1 and two for Acl3. The five inserts were characterized by restriction enzyme digests and two, pgLH1:Acl1 and pgLH2:Acl3, were sequenced. These studies demonstrated that both clones were incomplete at the 5' end (see BglII sites in the maps of Fig. 1 A and B). As Southern analysis of XbaI-digested Bonus DNA had demonstrated that Acl1 and Acl3 are located on fragments smaller than 10 kb (Fig. 3), a third size-selected library was constructed from completely XbaI-digested Bonus DNA ligated into λZAP II. Two positive clones encoding ACP I and two clones encoding ACP III were obtained. One of each, named pgLH3:Acl1 and pgLH4:Acl3, was further characterized and the missing 5' end fragments of the two genes were subcloned and sequenced.

Figure 4 presents the sequenced parts of clones pgLH1:Acl1 and pgLH3:Acl1. Segments therein are completely homologous to the cDNA clone pACP11 (Hansen 1987), thus establishing that these fragments of Bonus barley DNA contain the gene Acl1 coding for ACP I. The genomic sequence encompassing the 3' end of pACP11 (Fig. 4) includes an EcoRI site (position 2522). This resolves the problem of why pACP11 and 18 other ACP I cDNA clones in the same λgt11 library end at this site (Hansen and von Wettstein-Knowles 1989). During construction of this library EcoRI sites were not protected before cleavage of the linkers. Recently an ACP I cDNA clone carrying a poly(A) tail was isolated from another barley leaf cDNA library. The latter clone (pSKc1) extends 77 bp downstream from the EcoRI site before the poly(A) tract starts at position 2598. No polyadenylation signal that resembles the consensus sequence can be found, but degenerate sequences such as AATGGA at position 2560 or AAGAGT at position 2579 may serve this function.

Figure 5 presents the sequenced parts of clones pgLH2:Acl3 and pgLH4:Acl3. A perfect homology to the previously published cDNA clone pACP1 coding for ACP III (Hansen 1987) is found in the regions 1334-1415 and 2054-2176 in Acl3. However, in the region located beyond 2351 the following difference is found. The genomic sequence starting at 2390 reads AGCGTGGAG whereas the corresponding cDNA sequence reads ACGTGGCAG. The latter is missing a G as indicated (2391), which is compensated for by a C five bases further on. Resequencing of the cDNA region confirmed the difference. Regardless of whether the amino acids are deduced from the cDNA sequence (Thr-Trp-Gln) or from the genomic sequence (Ser-Val-Glu), two of them are different from the corresponding ACP I sequence of Thr-Val-Asp. Despite this small discrepancy we conclude that the clones pgLH2:Acl3 and pgLH4:Acl3 contain the gene coding for ACP III. Both pACP1 and the second ACP III clone (pACP17) isolated from the same λgt11 library, described above, were incomplete at the 5' end. Not surprisingly the genomic sequence at this position (1328) has an EcoRI site.

Acl1 and Acl3 have a conserved mosaic structure

A comparison of cDNA and genomic sequences (Figs. 4 and 5) establishes that both genes comprise four exons and three introns as illustrated in Fig. 1 A and B. Their positions and relative lengths are conserved (Fig. 1). Intron 1 occurs approximately one-third into the transit peptide coding region, in ACP I between amino acids 23 and 24 and in ACP III between residues 16 and 17. Intron 2 in Acl3 is either at the cleavage site of the transit peptide or displaced one codon into the mature protein as in Acl1. Of the amino acids spanning intron 3, 18 out of 19 are conserved in all plant ACPs. Intron 3 in both genes starts 8 codons into this block, which lies three codons to the carboxy-terminal side of the prosthetic group binding serine (see Figs. 1, 4 and 5). None of the introns in either gene splits an amino acid codon. Border sequences at the intron/exon junction sites resemble the consensus sequence suggested for monocots (Hanley and Schuler 1988).
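The genomic/cDNA discrepancy in Acl3 described above can be checked directly by translating the two nine-base stretches with the standard genetic code. The following sketch does this and also confirms that deleting the G at the second position and inserting a C after the next five bases converts the genomic string into the cDNA string. It assumes, as the text implies, that the nine bases form three in-frame codons; the function and variable names are illustrative.

```python
# Sketch: translate the genomic and cDNA versions of the discrepant
# Acl3 stretch and compare them with the corresponding ACP I tripeptide.
# Assumption: the nine bases are read as three in-frame codons.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # TTT..TGG
    "LLLLPPPPHHQQRRRR"  # CTT..CGG
    "IIIMTTTTNNKKSSRR"  # ATT..AGG
    "VVVVAAAADDEEGGGG"  # GTT..GGG
)
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1 into one-letter amino acid codes."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


genomic = "AGCGTGGAG"   # genomic sequence starting at position 2390
cdna = "ACGTGGCAG"      # corresponding cDNA sequence
acp1_peptide = "TVD"    # Thr-Val-Asp in ACP I

genomic_peptide = translate(genomic)  # Ser-Val-Glu
cdna_peptide = translate(cdna)        # Thr-Trp-Gln

# Drop the G at the second position, then insert a C five bases further on.
edited = genomic[0] + genomic[2:7] + "C" + genomic[7:]
```

Both deduced tripeptides differ from the ACP I sequence at two of three positions, in line with the statement in the text, and the single-base deletion plus compensating insertion leaves the downstream reading frame unchanged.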
The only segments in the two genes showing homology are found in exon III (positions 1947-2069 in Acl1 and 2054-2176 in Acl3) and exon IV (positions 2156-2281 in Acl1 and 2351-2476 in Acl3). At present one can only speculate about the origin and/or function of the repeated regions in the 3' ends of introns 1 and 2 in both genes. This applies also to the long direct repeat having 101 of 105 identical bases downstream from Acl1 at positions 2909-3013 and 3014-3120 (Fig. 4).

Two putative TATA elements are active in Acl1 and Acl3 resulting in a similar transcription initiation pattern

Acl1. Three primer extension reactions using oligonucleotide extI.1 (Fig. 6D) were carried out with RNA from barley seedlings greened for 12 or 18 h to identify the transcription initiation sites. A major extension product (Fig. 6A and D), designated +1, starts 17 bp upstream from the 5' end of the longest cDNA clone pACP11, and a minor one at +7. A TATA box is located 32 and 39 bp, respectively, upstream from these two sites. A third transcription start site is located at position -47

and its presumed TATA element can be found 51 bp further upstream. The sequences surrounding the three transcription start sites and the two TATA boxes are aligned with the plant consensus sequence in Table 1. Minor extension products around positions +1 and -22 are not always seen (Fig. 6A).

Fig. 4. Acl1 is included in a 3822 bp fragment of Bonus barley DNA. Nucleotide +1 designates the 5' base in the primary unprocessed transcript indicated by the enclosed part of the sequence. Small boxes upstream enclose putative transcription regulatory elements, vertical small arrows indicate the three transcription start positions, horizontal arrows identify repeated regions, the black triangles depict the ends of the cDNA clone pACP11, the small black box indicates the stop codon, an asterisk denotes the transit peptide cleavage site and putative polyadenylation signals are underlined. In the deduced amino acid sequence for preACP I, the serine residue carrying the prosthetic group is enclosed by a circle. The deduced amino acid sequence is given in the one-letter code and the gaps correspond to the introns

Acl3. Two different primers (extIII.1 and extIII.2, Fig. 6D) have been used to locate the transcription start of Acl3. The start positions +1 and +7 are located 31 and 38 bp downstream from a potential TATA box (CATA), and the third start position is at -56, 43 bases downstream from another putative TATA box (GACA) (Fig. 6B and C). Whether or not the designated sequences have a TATA function is highly questionable. The -56 site has been detected in two independent experiments and is the only signal observed using mRNA from 18 h greened seedlings (Fig. 6B). All three transcription starts are detectable in RNA isolated from seedlings grown under alternating temperatures and

Fig. 5. [Nucleotide sequence of Acl3 with the deduced amino acid sequence of preACP III; the sequence listing did not survive extraction.] Acl3 is included in a 4830 bp fragment of Bonus barley DNA. In addition to the symbols explained in the legend to Fig. 4, the underlined sequence in the promoter region represents homology to a -500 region in the barley aleurain gene (Whittier et al. 1987) and contains an inverted repeat underlined by horizontal arrows. Open triangles represent two additional polyadenylation sites

photoperiods. The sequences encompassing these regions are compiled together with consensus sequences in Table 1. In several of the primer extension autoradiograms one or two additional bands are visible in the higher molecular weight region of the sequencing gels. These observations imply that transcription initiation sites resulting in longer leader sequences (≥200 bp) may be active in the Acl3 gene. This suggestion is buoyed by the presence of several potential TATA boxes, for example at positions -239, -313, -489 and -497.

Further study of the Acl1 and Acl3 promoter regions of 650 and 1500 bp, respectively, combined with a search of the GenBank DNA database revealed several noteworthy items:

Acl1. At position -158 a putative inverted CCAAT element (ATTGA) is found (Figs. 4 and 6D). This motif is located 111 bp upstream from the transcription start at -47 and 158 bp upstream from that at +1.

Acl3. A presumed CCAAT box (CCAACT) is found (Figs. 5 and 6D) 139 bp upstream from the transcription start at +1 and 83 bp upstream from that at -56. Three motifs, two of which overlap, with high homology to a GC element are found in Acl3 at positions -96, -111 and -116 (Fig. 5, Table 2). This GC element is a recognition motif for the RNA polymerase II transcription factor Sp1, which is active in mammalian systems (Mitchell and Tjian 1989). A stretch of 65 bp in the -500 region has 74% homology to an element in the corresponding -500 region of the promoter of a barley aleurain gene (Ale) expressed in seeds (Whittier et al. 1987). The possibility remains that this common element may eventually be correlated with a seed-specific expression system. Both -500 elements have a 9 bp inverted repeat separated by 20 bp (Fig. 5, Table 2).
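The distances quoted above for promoter elements follow from simple coordinate differences. As a minimal sketch, the Python snippet below rechecks three of them. The helper name `bp_upstream` is hypothetical, the position -99 for the GACA element is taken from Table 1, and the CCAAT position of -139 in Acl3 is inferred from the statement that the box lies 139 bp upstream of +1. Only pairs in which both positions lie upstream of +1 are checked, because across the +1 boundary the count depends on whether a position 0 is assumed, which the text does not state.

```python
def bp_upstream(element_pos: int, start_pos: int) -> int:
    """Number of bp by which element_pos lies upstream of start_pos.

    Both coordinates are assumed to be negative (upstream of +1), so no
    decision about a position 0 is needed.
    """
    if element_pos >= 0 or start_pos >= 0:
        raise ValueError("only coordinates upstream of +1 are handled here")
    return start_pos - element_pos


# (label, element position, start position, distance quoted in the text)
QUOTED = [
    ("Acl1 inverted CCAAT to start at -47", -158, -47, 111),
    ("Acl3 CCAAT (assumed at -139) to start at -56", -139, -56, 83),
    ("Acl3 putative TATA (GACA) to start at -56", -99, -56, 43),
]

for label, element, start, quoted in QUOTED:
    computed = bp_upstream(element, start)
    print(f"{label}: computed {computed} bp, quoted {quoted} bp")
```

All three computed values agree with the distances given in the text.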

Fig. 6A-D. Autoradiograms of the primer extension products of Acl1 and Acl3 transcripts. A Acl1 extension products obtained from the primer extI.1 (D) annealed to mRNA from 18 h greening barley seedlings. The sequencing ladder represents the genomic sequence of Acl1 obtained by priming with extI.1. B Acl3 extension products obtained with the primer extIII.1 (D) annealed to mRNA from 18 h greening barley seedlings. The sequencing ladder represents the genomic sequence of Acl3 obtained with extIII.1. C Acl3 extension products obtained with the primer extIII.2 (D) annealed to mRNA from barley seedlings grown under alternating photoperiod and temperature cycles. The sequencing ladder represents the genomic sequence of Acl3 obtained with extIII.2. D Upper lines, a segment of the Acl1 promoter and gene with the extension primer sequence underlined and potential TATA and CAAT boxes as well as transcription start sites indicated in large letters; lower lines, analogous information for Acl3 plus boxes to identify GC elements

Table 1. Positions and sequences of TATA elements plus transcription and translation start motifs

           TATA box                                  Transcription start               Translation start
Gene       Position  Distance   Motif                Position  Leader   Motif          Motif
                     to start                                  length
Acl1       -32       32         CCAC ???? ?TTGC      +1        98       TCC A GAG      ACCC ATG GC
                     39                              +7        91       GCC A AGG
           -98       51         GGAG TATA GTGCA      -47       145      TTT ? TCT
Acl3       -31       31         CGCG CATA AATGT      +1        85       CTC A GCT      CCTC ATG GC
                     38                              +7        78       TCC ? CCC
           -99       43         GACG GACA GGGCG      -56       141      ACT C CCC
Consensus            32 ± 7     TCAC TATA TATAG                40-80    CTC A TCA      ?CA ATG GC

Consensus sequences for TATA elements, transcription start sites and the translation starts are from Joshi (1987). Numbering and distances are based upon the first T in the TATA motif, the middle A in the transcription start and the A in the ATG codon. A question mark stands for a character that is illegible in this copy.

Table 2. Promoter elements present in Acl3 but not in Acl1

GC elements
Acl3        -96    AGGGCGG TGT    -87
Acl3        -111   GGGGCGA G?A    -102
Acl3        -116   TGGCCGG GGC    -107
Consensus          (G/T)GGGCGG(G/A)(G/A)(C/T)

-500 region
Acl3   -588   CCGCTTCAAACGAACAAAAAACGGACA
              AAACACGCGTCCGTTTGGATTGCGCGGGTGGAATTGCT   -524
Ale    -494   CCGACCCAAACAGACAAAAAACGAACA
              AATTCGCC?TCCGTTTGGATCGTCCGTTGGACTTGCT    -431

The consensus sequence for the GC elements is from Kadonaga et al. (1986); the Ale sequence is from Whittier et al. (1987). In the printed table, asterisks denote non-identical bases in the Acl3 and Ale sequences and arrows represent the inverted repeat. A question mark stands for a character that is illegible in this copy.

Discussion

The original inference that two ACP isoforms coexist in spinach leaves (Matsumura and Stumpf 1968) was validated in 1985 by Ohlrogge and Kuo. By that time two isoforms had also been identified in barley leaves (Høj and Svendsen 1984), in which a third ACP was confirmed shortly thereafter (Hansen 1987). A survey of selected bacteria and members of the plant kingdom disclosed the presence of two or more isoforms in all leaves regardless of the ploidy level (Battey and Ohlrogge 1990). This is a minimal estimate corresponding to the number of 3H-palmitate-labelled ACPs that can be resolved by native gel electrophoresis. Recently Schmid and Ohlrogge (1990), on the basis of their cDNA and mRNA studies, proposed that the spinach leaf isoform (ACP II), which also occurs in roots and developing seeds, may be a constitutive gene product, whereas the somewhat more abundant leaf ACP I is expressed organ-specifically, that is, in leaves.

Genes for two barley leaf isoforms, Acl1 coding for the predominating ACP I and Acl3 coding for a minor isoform ACP III, have now been characterized. PreACP I comprises 59 amino acids serving as a transit peptide for the 90-amino acid mature protein, while preACP III has 49 and 83 residues, respectively, in the two domains. Despite the small size of these proteins, expression of each requires more than 3 kb of DNA. Thus, the primary unprocessed transcripts are comprised of 2598 and 2740 bases which are preceded by the requisite promoter regions. Each gene is a mosaic of four exons and three introns. Lengths of the corresponding units are similar although the introns of Acl3 are somewhat longer than those of Acl1. Calculated from the genomic sequences, processed transcripts minus the poly(A) tails should range in size from 744 to 898 bases. This is in accordance with the 1050 bases estimated from Northern analyses using mRNA from seedling leaves (Hansen 1987). When the genomic sequences are compared, exons III plus IV, coding for most or all of the mature ACPs, exhibit 71.9% homology, which results in 73.5% identical amino acids out of 83. Interestingly an extra seven residues occur at the amino-terminal end of ACP I. The extra residues may have arisen from intron sliding after creation of a new splice site toward the 3' end of the second intron. Alternatively, a small 18 bp in-frame duplication event including the last three bases of the original intron 2 plus the first five codons of exon III, positions 1944-1961 in Fig. 5, may have taken place to give the nucleotides located at positions 1926-1943. In these two regions 14/18 bp are identical. The duplication created three new bases at the end of the intron and six codons of which the bases in the last codon were part of the original intron. Further analyses of genomic sequences failed to reveal homology between exons I plus II coding for the transit peptide, the three introns, the 5' and 3' non-coding ends of the processed transcripts and the flanking non-transcribed regions.

This brief summary of structural facets of Acl1 and Acl3 highlights their striking similarity, and tends to obscure a more interesting feature, namely the lack of homology between the two promoter regions of 650 and 1500 bases. Characterization of these promoters can be carried out in barley transient transformation systems and/or in a tobacco or Arabidopsis stable transformation system. Not only can the hypothesis (Schmid and Ohlrogge 1990) be tested that leaves contain two or more active ACP genes, one being constitutive (in this case Acl3) and the other tissue-specific (Acl1), but the pertinent sequences can be delimited.

In the following the architecture of these two monocot genes and their products is compared to that in several other plants, all dicots, whereafter the nature of plant ACP gene families is considered. The three other ACP genes for which sequences have been published, one in Arabidopsis (Post-Beittenmiller et al. 1989a) and two in rape (de Silva et al. 1990), have the same composite structure as those in barley. The most noticeable difference is that two of the introns in the dicot genes are considerably shorter, namely intron 1, 270-445 versus 1083-1184 bp, and intron 2, 71-80 versus 568-638 bp. This conservation of gene structure among the few distantly related plants studied is in line with the proposal that ACP gene duplication took place before evolution of multicellular plants, another hypothesis derived from results of analyses of the minimal number of ACP isoforms (Battey and Ohlrogge 1990).
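Homology figures such as those above are percentages of identical positions in an alignment. The following is a minimal sketch in Python, assuming a gap-free alignment of equal-length strings; the function name `percent_identity` is illustrative, and this is not necessarily how the authors computed their values. The two 27-base strings are the opening segments of the Acl3 and Ale -500 elements as printed in Table 2. The count of 61 identical residues is not stated in the text; it is inferred here as the integer that reproduces 73.5% of 83.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two gap-free, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Opening 27 bases of the -500 elements as printed in Table 2.
ACL3_SEGMENT = "CCGCTTCAAACGAACAAAAAACGGACA"
ALE_SEGMENT = "CCGACCCAAACAGACAAAAAACGAACA"

print(round(percent_identity(ACL3_SEGMENT, ALE_SEGMENT), 1))  # 77.8 (21 of 27)
print(round(100 * 14 / 18, 1))  # 77.8, the duplicated 18 bp regions
print(round(100 * 61 / 83, 1))  # 73.5, assuming 61 identical residues of 83
```

The 27-base segment is only the first part of the 65 bp element, so its value is not expected to equal the 74% reported for the whole element.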

A markedly different situation has been described for the rbcS type genes, for which four different mosaic variations of the presumed ancestral gene are known (Wolter et al. 1988). Predictably the nucleotide sequence of exons III plus IV in the dicots is quite similar to that in barley, ranging from 63% to 66% homology, which translates into 52%-66% identical amino acids in the mature proteins. The site of the third intron and the 13 residues encompassing it are invariant in all five genes, presumably reflecting a requirement for functional integrity of this portion of the mature ACP domain. The amino-terminal ends of rape, spinach and Arabidopsis ACPs are similar to that of barley ACP III in lacking the seven extra residues of barley ACP I. This implies that the duplication event described above giving rise to the extra residues of ACP I must have taken place after divergence of monocots and dicots.

As noted above, amino acid sequencing of barley and spinach ACPs I and II revealed two transit peptide cleavage sites. Regardless of which site is used in barley ACP III, the position is not correlated with the prevalence of the leaf ACP isoforms. Previous analyses of the primary structures of ACP transit peptides from three distantly related plant families, barley, spinach and the crucifers rape and Arabidopsis (Hansen 1987; de Silva et al. 1990; Schmid and Ohlrogge 1990), demonstrated a total lack of homology among them. The same is true if the two spinach transit peptides are compared, despite the fact that both are considered to be located in chloroplasts (Ohlrogge and Kuo 1985; Schmid and Ohlrogge 1990). Now that the complete barley ACP III mRNA sequence is known, the analogous lack of a relationship between barley ACP I and III is evident. Barley ACP I occurs in chloroplasts (Høj and Mikkelsen 1982; Høj and Svendsen 1983), but the intracellular location of ACP III has not been established (Høj and Svendsen 1984; Hansen 1987). Surprisingly, alignment of 49 residues of the transit peptides of barley ACP III and spinach ACP II discloses 32% identity, and if conservative substitutions are allowed the 11 amino-terminal residues are preserved. The recently obtained transit peptide sequence (Hansen and Kauppinen 1991) of the chloroplast-localized barley ACP II (Høj and Mikkelsen 1982; Høj and Svendsen 1984) has 33 out of 49 residues identical to that of barley ACP III, but is not homologous to ACP I. Although the first spinach ACP I import studies have been carried out (Fernandez and Lamppa 1990), additional ones are needed in spinach as well as in barley to untangle the skein of apparently conflicting observations just cited. Despite the disparity between the primary structures of ACP transit peptides, the relative position of the intron has been well maintained, being between amino acids 16 and 17 except in barley ACP I where it is between 23 and 24.

Alignment of the two promoter regions for Acl1 and Acl3 demonstrates several noteworthy differences. Firstly, the G + C content is remarkably higher in the proximal region of Acl3 than of Acl1. The 300 bp region covering the transcription initiation sites plus the leader sequence and the first 20 bases of the open reading frame of Acl3 (positions -195 to +105) has a 66.1% G + C content compared to 48.0% for the first 1605 bases (positions -1500 to +105). The corresponding regions in Acl1 contain 50.8% (positions -180 to +120) versus 41.6% (positions -650 to +120) G + C, respectively. Secondly, Acl3 lacks convincing TATA boxes, whereas these can be found in Acl1 (Fig. 6 and Table 1). Thirdly, in Acl3 three GC elements (Fig. 6 and Table 2), which serve as recognition sequences for the Sp1 factor of RNA polymerase II (Mitchell and Tjian 1989), are located approximately 40-120 bases upstream from the transcription start sites. Similar elements are lacking in the promoter region of Acl1.

These three features, which distinguish the promoter of Acl3 from that of Acl1, are characteristic for promoters of mammalian housekeeping genes (Boyer et al. 1989; Linton et al. 1989; Zot and Fambrough 1990). Such genes are defined as those that are expressed in all types of tissues with the corresponding proteins representing only a small amount of the total cell protein. The multiple transcription initiation sites associated with this group of mammalian genes (Lin et al. 1990) occur also in the Acl3 promoter and that of the carrot V-type H+-ATPase catalytic subunit (Struve et al. 1990), one of the few plant housekeeping genes thus far described. The fact that the Acl3 promoter region resembles that for housekeeping genes is in full agreement with the above speculation that barley ACP III, like spinach ACP II, may be such a gene product, whereas both barley and spinach ACP I are determined by leaf-expressed genes. Results of preliminary Northern analyses, in which Acl1 transcripts can be detected in photosynthetic tissue whereas Acl3 transcripts can also be detected in developing and germinating seeds, are in accordance with the proposal.
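The G + C percentages quoted above for the promoter regions are simple base-composition counts. The following is a minimal sketch in Python, assuming plain strings of A, C, G and T; the function names `gc_percent` and `gc_windows` are illustrative and are not taken from the paper. The demonstration string is the 70-base Acl3 stretch around the translation start as printed in Fig. 6D, so its value is not one of the percentages quoted in the text.

```python
def gc_percent(seq: str) -> float:
    """G + C content in percent, counting only A, C, G and T characters."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)


def gc_windows(seq: str, window: int, step: int = 1) -> list[tuple[int, float]]:
    """(0-based offset, G + C percent) for each full window along seq."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# 70 bases of Acl3 around the ATG, as printed in Fig. 6D (extracted text).
ACL3_FRAGMENT = (
    "TGCTCGCCGCCTCCCTTCGCGTGCTGCCTCATGGCTTCCATCGCCGGATC"
    "TGCCGTCTCCTTCGCCAAGC"
)

print(f"{gc_percent(ACL3_FRAGMENT):.1f}% G + C over {len(ACL3_FRAGMENT)} bases")
for offset, value in gc_windows(ACL3_FRAGMENT, window=35, step=35):
    print(offset, f"{value:.1f}")
```

Applied to the full deposited sequences, the same count over positions -195 to +105 and -1500 to +105 would correspond to the proximal and whole-promoter comparison described in the text.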
If this scenario is true, other cis-acting DNA sequences unique to the Acl1 promoter must direct the tissue-specific expression, as well as preventing the three identified transcription start sites in the Acl1 promoter from functioning in the seed tissues. Obviously, before the housekeeping designation can be accepted for specified plant ACP genes considerable additional knowledge must be accrued. As only short segments of the promoter regions of the seed-expressed Arabidopsis and rape genes are published (Post-Beittenmiller et al. 1989a; de Silva et al. 1990), a comparison with those of the leaf-expressed barley genes must be circumspect. RNase protection analysis of transcription start sites for one of the rape seed-expressed genes showed only one initiation site (de Silva et al. 1990). The multiple polyadenylation sites for Acl3 transcripts are also found for the Acl2 gene determining barley ACP II (Hansen and Kauppinen 1991) and for spinach ACP II (Schmid and Ohlrogge 1990).

The three barley ACP genes have been localized to chromosomes using the Chinese Spring wheat - Betzes barley addition lines. No cross-hybridization occurred in the Southern analyses. In this study Acl1 was shown to be on chromosome 7 and Acl3 on chromosome 1. Similar analyses assign Acl2 to chromosome 7 (Hansen and Kauppinen 1991) at least 10 kb distant from Acl1. Single bands too small to accommodate more than one copy of a gene were identified. Combining these observations with the estimation of 1-3 copies of Acl1 obtained by titration, we conclude that each gene occurs once in the barley genome. Which of these corresponds to the two ancestral genes is an intriguing puzzle for the future. The homology between the transit peptides determined for the Acl2 and Acl3 genes suggests that one is a duplication of the other.

The characterization of the barley ACP genes as single copy leads to the prediction that three copies of each will occur in Chinese Spring wheat, one in each of the homoeologous A, B and D genomes. This hypothesis explains the three hybridizing bands detected using the Acl1 probe and three of the four bands with the Acl3 probe. The extra band, whose presence has been confirmed in a more extensive study (Devos et al. 1991), presumably represents a duplication of one of the three wheat genes which has taken place after divergence of the wheat and barley progenitors. The latter work also demonstrates that the wheat genes hybridizing to Acl1 are on the short arms of chromosomes 5A, 5B and 5D, those hybridizing to Acl3 are on the short arms of 7A, 7B and 7D and the new copy of Acl3 is on the long arm of 5B. The absence of the latter in some of the wheat lines studied testifies to the recent nature of the duplication event. These results also imply that Acl1 and Acl3 are very likely to be on the short arms of the barley chromosomes.

The ACP gene families in spinach and rape appear to be more complex since multiple bands are seen in Southern analyses. This was unexpected in spinach on the basis of the available elegant protein analyses, and led to the proposal of two subfamilies each probably having a number of pseudogenes (Schmid and Ohlrogge 1990). Unfortunately, only one partial and one complete cDNA clone for ACP II from a root library (Schmid and Ohlrogge 1990) and one complete cDNA clone for ACP I from a leaf library (Scherer and Knauf 1987) were analysed, so no information is available concerning how many different mRNAs are expressed and in what proportions. By comparison, from barley leaf cDNA libraries more than 19 clones have been identified coding for ACP I, three for ACP III and two for ACP II (Hansen and Kauppinen 1991). In rape 34-36 ACP gene copies per haploid genome have been estimated. A minimum of four is expected since rape is an amphidiploid that arose by crossing of Brassica campestris and B. oleracea followed by chromosome doubling. The detection of more than two bands in Southern blots of Arabidopsis (unpublished observation) and three 3H-palmitate-labelled proteins in leaves (Battey and Ohlrogge 1990) implies that some of the additional duplication events may well have taken place before rape was created; thus at that time more than four ACP gene copies were already present. Five of the ten seed cDNAs studied and both genes were most probably derived from a B. campestris gene, whereas the other five seed cDNA clones presumably have descended from a B. oleracea gene (Hansen 1987; de Silva et al. 1990). This summary implies that all described rape genes belong to a subfamily descended from one of the two ancestral genes. The three single barley ACP genes and the two subfamilies of spinach and rape ACP genes offer challenges for exploiting this component of the lipid biosynthetic pathway. Interestingly, the first observations on the condensing enzymes in this pathway imply that barley may have a more complex gene family for this FAS component than spinach (Siggaard-Andersen et al. 1991).

Acknowledgements. We thank Sakari Kauppinen for the barley leaf cDNA libraries and the ACP I clone containing the poly(A) tail, and Nina Rasmussen and Ann-Sofi Steinholtz for preparing the figures. The 25 seeds of each of the addition lines from a karyotyped plant were a gift from Hanne Bay Johansen and Ib Linde-Laursen, Risø National Laboratory, Roskilde, Denmark. The Acl1 and Acl3 nucleotide sequences are deposited in the GenBank/EMBL Data Bank under the accession numbers M58753 and M58754.

References

Battey JF, Ohlrogge JB (1990) Evolutionary and tissue-specific control of expression of multiple acyl-carrier protein isoforms in plants and bacteria. Planta 180:352-360
Benton WD, Davis RW (1977) Screening λgt recombinant clones by hybridization to single plaques in situ. Science 196:180-182
Boom TV, Cronan JE Jr (1989) Genetics and regulation of bacterial lipid metabolism. Annu Rev Microbiol 43:317-343
Boyer TG, Krug JR, Maquat LE (1989) Transcriptional regulatory sequences of the housekeeping gene for human triosephosphate isomerase. J Biol Chem 264:5177-5187
Chuman L, Brody S (1989) Acyl carrier protein is present in the mitochondria of plants and eucaryotic micro-organisms. Eur J Biochem 184:643-649
Devos KM, Chinoy CN, Atkinson MD, Hansen L, Wettstein-Knowles P von, Gale MD (1991) Chromosomal location in wheat of the genes for the acyl carrier proteins I and III. Theor Appl Genet 82:3-5
Elhussein SA, Miernyk JA, Ohlrogge JB (1988) Plant holo-(acyl carrier protein) synthase. Biochem J 252:39-45
Feinberg AP, Vogelstein B (1983) A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal Biochem 132:6-13
Fernandez MD, Lamppa GK (1990) Acyl carrier protein (ACP) import into chloroplasts does not require the phosphopantetheine: evidence for a chloroplast holo-ACP synthase. Plant Cell 2:195-206
Guerra DJ, Ohlrogge JB (1986) Partial purification and characterization of two forms of malonyl-coenzyme A:acyl carrier protein transacylase from soybean leaf tissue. Arch Biochem Biophys 246:274-285
Hanley BA, Schuler MA (1988) Plant intron sequences: evidence for distinct groups of introns. Nucleic Acids Res 16:7159-7176
Hansen L (1987) Three cDNA clones for barley leaf acyl carrier proteins I and III. Carlsberg Res Commun 52:381-392
Hansen L, Kauppinen S (1991) Barley acyl carrier protein II: Nucleotide sequence of cDNA clones and chromosomal location of the Acl2 gene. Plant Physiol (in press)
Hansen L, Wettstein-Knowles P von (1989) Acyl carrier proteins of barley seedling leaves and caryopses. In: Biacs PA, Gruiz K, Kremmer T (eds) Biological role of plant lipids. Akadémiai Kiadó, Budapest and Plenum, New York, pp 367-370
Hattori M, Sakaki Y (1986) Dideoxy sequencing method using denatured plasmid templates. Anal Biochem 152:232-238
Heijne G von, Steppuhn J, Herrmann RG (1989) Domain structure of mitochondrial and chloroplast targeting peptides. Eur J Biochem 180:535-545
Henikoff S (1984) Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28:351-359
Høj PB, Mikkelsen JD (1982) Partial separation of individual enzyme activities of an ACP-dependent fatty acid synthetase from barley chloroplasts. Carlsberg Res Commun 47:119-141
Høj PB, Svendsen I (1983) Barley acyl carrier protein: its amino acid sequence and assay using purified malonyl-CoA:ACP transacylase. Carlsberg Res Commun 48:285-305
Høj PB, Svendsen I (1984) Barley chloroplasts contain two acyl carrier proteins coded for by different genes. Carlsberg Res Commun 49:483-492
Islam AK, Shepherd KW, Sparrow DH (1981) Isolation and characterization of euplasmic wheat-barley chromosome addition lines. Heredity 46:161-174
Joshi CP (1987) An inspection of the domain between putative TATA box and translation start site in 79 plant genes. Nucleic Acids Res 15:6643-6653
Kadonaga JT, Jones KA, Tjian R (1986) Promoter-specific activation of RNA polymerase II transcription by Sp1. Trends Biochem Sci 11:20-23
Kuo TM, Ohlrogge JB (1984) The primary structure of spinach acyl carrier protein. Arch Biochem Biophys 234:290-296
Lin D, Shi Y, Miller WL (1990) Cloning and sequencing of human adrenodoxin reductase gene. Proc Natl Acad Sci USA 87:8516-8520
Linton JP, Yen J-Y J, Selby E, Chen Z, Chinsky JM, Liu K, Kellems RE, Crouse GF (1989) Dual bidirectional promoters at the mouse dhfr locus: cloning and characterization of two mRNA classes of the divergently transcribed Rep-1 gene. Mol Cell Biol 9:3058-3072
Matsumura S, Stumpf PK (1968) Fat metabolism in higher plants XXXV. Partial primary structure of spinach acyl carrier protein. Arch Biochem Biophys 125:932-941
Mitchell PJ, Tjian R (1989) Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science 245:371-378
O'Brien SJ (1990) Genetic maps. Locus maps of complex genomes, 5th edn. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
Ohlrogge JB, Kuo TM (1985) Plants have isoforms for acyl carrier protein that are expressed differently in different tissues. J Biol Chem 260:8032-8037
Post-Beittenmiller MA, Hloušek-Radojčić A, Ohlrogge JB (1989a) DNA sequence of a genomic clone encoding an Arabidopsis acyl carrier protein (ACP). Nucleic Acids Res 17:1777
Post-Beittenmiller MA, Schmid KM, Ohlrogge JB (1989b) Expression of holo and apo forms of spinach acyl carrier protein-I in leaves of transgenic tobacco plants. Plant Cell 1:889-899
Rose RE, De Jesus CE, Moylan SL, Ridge NP, Scherer DE, Knauf VC (1987) The nucleotide sequence of a cDNA clone encoding acyl carrier protein (ACP) from Brassica campestris seeds. Nucleic Acids Res 15:7197
Safford R, Windust JH, Lucas C, Silva J de, James CM, Hellyer A, Smith CG, Slabas AR, Hughes SG (1988) Plastid-localised seed acyl-carrier protein of Brassica napus is encoded by a distinct, nuclear multigene family. Eur J Biochem 174:287-295
Sambrook J, Fritsch EF, Maniatis T (1989) Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain-termination inhibitors. Proc Natl Acad Sci USA 74:5463-5467
Scherer DE, Knauf VC (1987) Isolation of a cDNA clone for the acyl carrier protein-I of spinach. Plant Mol Biol 9:127-134
Schmid KM, Ohlrogge JB (1990) A root acyl carrier protein-II from spinach is also expressed in leaves and seeds. Plant Mol Biol 15:765-778
Siggaard-Andersen M, Kauppinen S, Wettstein-Knowles P von (1991) Primary structure of a cerulenin binding β-ketoacyl-[acyl carrier protein] synthase from barley chloroplasts. Proc Natl Acad Sci USA 88:4114-4118
Silva J de, Loader NM, Jarman C, Windust JH, Hughes SG, Safford R (1990) The isolation and sequence analysis of two seed-expressed acyl carrier protein genes from Brassica napus. Plant Mol Biol 14:537-548
Simpson D, Wettstein-Knowles P von (1980) Structure of epicuticular waxes on spikes and leaf sheaths of barley as revealed by a direct platinum replica technique. Carlsberg Res Commun 45:465-481
Southern EM (1975) Detection of specific sequences among DNA fragments separated by gel electrophoresis. J Mol Biol 98:503-517
Struve I, Rausch T, Bernasconi P, Taiz L (1990) Structure and function of the promoter of the carrot V-type H+-ATPase catalytic subunit gene. J Biol Chem 265:7927-7932
Stumpf PK (1987) The biosynthesis of saturated fatty acids. In: Stumpf PK (ed) The biochemistry of plants, vol 9. Lipids: structure and function. Academic Press, New York, pp 121-136
Wettstein-Knowles P von (1989) Facets of the barley genome. Vortr Pflanzenzüchtg 16:107-124
Whittier RF, Dean DA, Rogers JC (1987) Nucleotide sequence analysis of alpha-amylase and thiol protease genes that are hormonally regulated in barley aleurone cells. Nucleic Acids Res 15:2515-2535
Wolter FP, Fritz CC, Willmitzer L, Schell J, Schreier PH (1988) rbcS genes in Solanum tuberosum: Conservation of the transit peptide and exon shuffling during evolution. Proc Natl Acad Sci USA 85:846-850
Zot AS, Fambrough DM (1990) Structure of a gene for a lysosomal membrane glycoprotein (LEP100). J Biol Chem 265:20988-20995

Communicated by J. Schell