Chloroplast Ribosomes and Protein Synthesis - Europe PMC

Vol. 58, No. 4

MICROBIOLOGICAL REVIEWS, Dec. 1994, p. 700-754

0146-0749/94/$04.00+0 Copyright © 1994, American Society for Microbiology

Chloroplast Ribosomes and Protein Synthesis

ELIZABETH H. HARRIS,1* JOHN E. BOYNTON,1 AND NICHOLAS W. GILLHAM2

DCMB Group, Departments of Botany1 and Zoology,2 Duke University, Durham, North Carolina 27708-1000

ORIGIN OF CHLOROPLASTS
CHLOROPLAST GENOME STRUCTURE AND GENE CONTENT
THE PROCESS OF CHLOROPLAST PROTEIN SYNTHESIS
    Initiation
    Elongation
    Termination
    Chloroplast tRNAs and Aminoacyl-tRNA Synthetases
PLASTID GENES FOR rRNAs
    Phylogenetic Conservation
    General Characteristics of Chloroplast rRNA Gene Organization
    16S rRNA
    23S rRNA
    5S rRNA
    Introns in rRNA Genes
    The 16S-23S Spacer
    tRNAs Flanking the rRNA Operons
    Antibiotic Resistance Mutations in the Chloroplast rRNA Genes
RIBOSOMAL PROTEINS
    Number and Nomenclature
    Organization of Chloroplast Ribosomal Protein Genes
    Correspondence of Chloroplast Ribosomal Proteins to Bacterial Ribosomal Proteins
    Proteins of the Small Subunit
    Proteins of the Large Subunit
    Chloroplast Ribosomal Proteins with No Obvious Homology to Those of E. coli
    Comparative Analysis of Ribosomal Proteins
ASSEMBLY OF CHLOROPLAST RIBOSOMES
SYNTHESIS OF THE COMPONENTS OF CHLOROPLAST RIBOSOMES
    Transcription of rRNA Genes
    Transcription of Chloroplast Genes Encoding Ribosomal Proteins
    Posttranscriptional Regulatory Mechanisms Affecting Chloroplast mRNAs
    Membrane Binding of Chloroplast Ribosomes
HOW ESSENTIAL IS CHLOROPLAST PROTEIN SYNTHESIS?
CONCLUSIONS
ACKNOWLEDGMENTS
REFERENCES

ORIGIN OF CHLOROPLASTS

Chloroplasts and mitochondria contain protein-synthesizing systems more similar to those of bacteria than to those of the eukaryotic cytoplasm, consistent with the hypothesis that these organelles had xenogenous (endosymbiotic) rather than autogenous (intracellular differentiation) origins (see references 5, 205, 220-223, 274, 633, and 694 for discussions). Phylogenies based mostly on rRNA sequences indicate that the cyanobacteria are ancestral to chloroplasts while the members of the alpha subdivision of the purple sulfur bacteria are the likely progenitors of mitochondria (221, 222). Whether the chlorophyte algae and land plants on the one hand, and the rhodophyte, chromophyte, and euglenoid algae on the other, represent more than one endosymbiotic event remains unresolved (130, 403, 434). Comparisons of gene order and arrangement among these various taxa have produced intriguing directions for future evolutionary studies, while analysis of ribosomal protein sequences, particularly among the diverse algal groups, promises to be a valuable tool for determining conserved regions likely to have essential functions in ribosome assembly or protein synthesis.

CHLOROPLAST GENOME STRUCTURE AND GENE CONTENT

Unlike their prokaryotic ancestors, neither chloroplasts nor mitochondria are genetically autonomous, and information specifying components of the organelle protein-synthesizing systems is divided between organelle and nucleus. Separation of the genes encoding these RNAs and proteins between two discrete cellular compartments suggests that mechanisms must have evolved to coordinate expression of these genes so that protein synthesis in the organelle can proceed efficiently. Whereas chloroplast genomes of land plants usually have a common organization and gene content, a great deal more variability is encountered among the algae, particularly with

* Corresponding author. Mailing address: DCMB, Duke University Box 91000, Durham, NC 27708-1000. Phone: (919) 613-8164. Fax: (919) 613-8177. Electronic mail address: [email protected].


FIG. 1. Schematic diagram of a typical land plant chloroplast genome (tobacco), showing the positions of the inverted repeat, rRNA genes, and genes encoding ribosomal proteins. Gene locations are from reference 560.

regard to the ribosomal protein genes that have been retained in the organelle. In this section we review the chloroplast genome structure of land plants and the algal genera that have been investigated to date with respect to composition and organization of genes encoding rRNAs and ribosomal proteins.

Chloroplasts are highly polyploid organelles containing circular DNA molecules of 85 to 200 kb organized into discrete membrane-associated nucleoids (see references 50, 206, 260, 337, 482, 483, 558, 613, and 614 for reviews). Three land plant chloroplast genomes have been completely sequenced: the dicotyledon tobacco (Nicotiana tabacum, 156 kb [560, 561]), the monocotyledon rice (Oryza sativa, 135 kb [265]), and a liverwort (Marchantia polymorpha, 121 kb [471-473]). Each contains 110 to 120 genes (482, 614). These sequences, as well as restriction maps and partial sequences from many other species, indicate that the basic chloroplast genome structure and overall gene order in land plants are highly conserved. Although green algae (Chlorophyta) are regarded as ancestral to land plants, modern green algae often show substantial rearrangements in chloroplast gene order (see below). Other groups of algae (Rhodophyta, Euglenophyta, Chromophyta) show even more diversity in gene content and organization.

In the typical land plant chloroplast genome, unique sequence regions of 15 to 25 kb and 80 to 100 kb are separated by the two copies of an inverted repeat, which is usually 20 to 30 kb in size and contains genes encoding the chloroplast rRNAs, certain tRNAs, and often one or more genes specifying proteins (Fig. 1) (see references 482 and 614 for reviews). Within the inverted repeat, the rRNA operon is usually oriented with the 23S rRNA gene closer to the small single-copy region and the 16S rRNA gene closer to the large single-copy region. The two repeats are identical in sequence as a consequence of an active copy correction system (50). Nearly two-thirds of the variation in size among land plant chloroplast genomes (120 to 216 kb) is accounted for by expansion or contraction of the inverted repeat (482). The smallest chloroplast genomes among land plants are seen in conifers (355, 600, 695) and in six tribes of the legume family


Fabaceae (314, 482, 487), which have lost the inverted repeat and thus contain only a single copy of each of the rRNA genes. Black pine (Pinus thunbergii) chloroplast DNA does possess a short inverted repeat sequence, which contains a tRNA gene and part of the 3' portion of the psbA gene, but not the rRNA genes (654). In contrast, species with the largest chloroplast genomes often have expanded inverted repeats (e.g., Pelargonium hortorum has a 76-kb inverted repeat encompassing nearly half of the 216-kb chloroplast genome, in which many genes normally in the single-copy region have been duplicated [482]).

Chloroplast genomes from land plants specify a relatively constant set of components for the protein-synthesizing machinery of the organelle (4 rRNAs, 30 to 31 tRNAs, 21 ribosomal proteins, and 4 RNA polymerase subunits) and for photosynthesis (28 thylakoid proteins plus 1 soluble protein, the ribulose-1,5-bisphosphate carboxylase/oxygenase [Rubisco] large subunit). In addition, homologs of 11 subunits of mammalian mitochondrial complex I (the ndh genes) have now been found to be encoded by chloroplast DNA in flowering plants and Marchantia species (9, 713). Chloroplast genomes of gymnosperms, liverworts, and algae (e.g., Chlamydomonas reinhardtii) which synthesize chlorophyll in darkness possess genes encoding three subunits of a light-independent protochlorophyllide reductase that is also found in photosynthetic prokaryotes (see reference 367 for a summary). These genes are absent from the tobacco and rice chloroplast genomes.

Mapping and sequencing studies of chloroplast genomes from widely different algal taxa reveal that these are much more variable in organization and gene content than those of land plants. The well-characterized chloroplast genomes of three species of unicellular green algae in the genus Chlamydomonas are substantially larger (C. reinhardtii, 196 kb; C. eugametos, 243 kb; C. moewusii, 292 kb) than the chloroplast genomes of land plants (42, 43, 50, 247). In these species the two copies of the large inverted repeat encoding the rRNAs are separated by unique sequence regions of roughly equal size. Chloroplast genes in Chlamydomonas species are also extensively rearranged between distantly related species and with respect to land plants (43). The green alga Spirogyra maxima, in the charophyte lineage presumed to be ancestral to land plants, lacks an inverted repeat and shows altered gene order relative to land plants (352, 393).

The organization, structure, and gene content of the completely sequenced 145-kb chloroplast genome of Euglena gracilis Z (243) depart markedly from the chloroplast genomes of chlorophyte algae or land plants. In this Euglena strain and in its colorless relative Astasia longa, the plastid genome contains three tandemly repeated rRNA operons plus an additional 16S gene or fragment thereof (288, 289, 315-317, 478, 569). Euglena gracilis var. bacillaris has only a single complete rRNA operon (720). Most of the Euglena chloroplast tRNA genes are grouped in tight clusters of two to five genes, whereas they tend to be scattered in plastid genomes of land plants. While most protein-coding chloroplast genes in land plants or Chlamydomonas species are uninterrupted or contain at most one or two introns, comparable genes in Euglena gracilis each contain multiple introns (243, 482). However, several chloroplast tRNA genes that have introns in land plants lack introns in Euglena or Chlamydomonas species (331). A number of other genes found in land plant chloroplast genomes, including three genes encoding ribosomal proteins, are missing from the Euglena chloroplast genome (see below), but this algal genome also contains some genes not found in plastid genomes of land plants.

The plastid of Cyanophora paradoxa is often referred to as


the cyanelle because of its secondary peptidoglycan wall and photosynthetic apparatus with phycobiliproteins typical of cyanobacteria and red algae. Most of the 133-kb cyanelle genome has now been sequenced (32, 598). This genome contains an inverted repeat which encodes the cyanelle rRNAs and several other genes. Although the gene content of the cyanelle generally resembles that of land plant chloroplasts, there are about 30% more genes, including 11 additional genes encoding components of the translational apparatus. So far, only a single type I intron has been found in Cyanophora paradoxa, in a tRNALeu gene (162). The same intron is found in cyanobacteria.

Cryptomonad algae contain a plastid and residual nucleus or nucleomorph enclosed within the endoplasmic reticulum of the cytoplasm and thus effectively separated from the normal cell nucleus (see references 123 and 386). Distinctly different 18S rRNAs are encoded by the nucleus and nucleomorph of Cryptomonas Φ and are spatially separated within the cell (129, 414). The nucleomorph rRNA genes are related to those of red algae, while the nuclear rRNA genes are clustered separately in the phylogenetic branch containing land plants and green algae. This suggests that cryptomonad algae may have arisen through a second endosymbiotic event in which a eukaryotic symbiont from the red algal lineage was taken up by a unicellular host more closely related to the green algae (129, 130, 386). Partial sequencing of the plastid genome of Cryptomonas Φ has revealed the presence of several novel genes, including four genes for ribosomal proteins not found in chloroplast genomes of land plants (122, 124, 680; also see below).

In the red alga Porphyra purpurea, over 125 genes have been identified in the approximately 60% of the chloroplast genome sequenced to date (514, 515), suggesting that the entire genome may contain as many as 200 to 220 genes, about twice as many as found in the completely sequenced genomes of land plant chloroplasts. These include at least seven photosynthesis and nine ribosomal protein genes not present in land plants. Introns have not been found in any of the 80 genes sequenced to date. The chloroplast genome of P. yezoensis possesses an inverted repeat containing the rRNA genes (353, 562, 563), but the related red algae P. purpurea and Griffithsia pacifica lack this inverted-repeat structure. In P. purpurea the rRNA genes are encoded in direct repeats which are not identical in sequence (514, 516, 563). The unicellular red alga Cyanidium caldarium possesses an inverted repeat containing only the rRNA genes, but gene order appears to be more similar overall to that of Cryptomonas Φ than to that of P. yezoensis or Griffithsia pacifica (385).

Inverted repeats containing rRNA genes are also found in the plastid genomes of the brown alga Dictyota dichotoma (330) and the golden-brown algae Olisthodiscus luteus and Ochromonas danica (108, 563). The plastid genome of the brown alga Pylaiella littoralis contains two different circular DNA molecules (369, 370, 404, 405). The larger (133 kb) molecule resembles a typical land plant chloroplast genome, with two rRNA operons in an inverted repeat. The smaller (58 kb) molecule contains a 16S pseudogene sequence, which is 65% homologous to the functional 16S genes of the large molecule, and a complex region that hybridizes with a 23S rRNA probe (369, 370).

THE PROCESS OF CHLOROPLAST PROTEIN SYNTHESIS

We begin this brief review of chloroplast protein synthesis with a comparison with the process as it occurs in bacteria. This section will be followed by discussion of the tRNAs and aminoacyl-tRNA synthetases. A more detailed discussion of chloroplast protein synthesis has been presented by Steinmetz and Weil (593).

Initiation

In prokaryotes, protein synthesis begins with formation of a preinitiation complex from the 30S ribosomal subunit and tRNAfMetUAC, with the 30S subunit binding to the purine-rich Shine-Dalgarno sequence 7 ± 2 nucleotides (nt) upstream of the initiator AUG (230, 261, 323). The canonical Shine-Dalgarno sequence, GGAGG, or a variant, pairs with a pyrimidine-rich complementary sequence, the anti-Shine-Dalgarno sequence, near the 3' end of the 16S rRNA molecule. Addition of a 50S ribosomal subunit converts the preinitiation complex to an initiation complex that can enter the elongation phase of protein synthesis. These reactions are promoted by the three initiation factors, IF-1, IF-2, and IF-3. IF-1 enhances the rates of ribosome dissociation and association and the activities of the other initiation factors (261). IF-2 is involved in initiator tRNA binding and GTP hydrolysis, while IF-3 prevents ribosomal subunit association in the absence of mRNA and appears to stabilize mRNA binding by promoting the conversion of a preternary ribosome-mRNA-fMet tRNA complex into a ternary complex in which codon-anticodon interaction has occurred. IF-3 also is thought to proofread the AUG-anticodon interaction.

Chloroplast equivalents of IF-2 and IF-3, designated IF-2chl and IF-3chl, have been characterized from Euglena gracilis (212, 324, 325, 375, 527, 678). Roney et al. (527) confirmed that IF-2chl is required for binding of initiator tRNA to chloroplast 30S subunits, as is prokaryotic IF-2. IF-2chl occurs in several complex forms, varying in molecular mass from 200 to 800 kDa (375). Subunits of 97 to >200 kDa have been observed in these preparations. IF-3chl promotes initiation complex formation in the presence of IF-2chl. Although IF-3chl will replace Escherichia coli IF-3 in initiation complex formation, there is some evidence that its function may be modified (527).

A DNA sequence with homology to the E. coli infA gene encoding initiation factor IF-1 has been identified in the chloroplast genomes of land plants, including the colorless parasite Epifagus virginiana (558, 714), but is apparently absent from the completely sequenced chloroplast genome of Euglena gracilis (243). The tobacco infA gene, in contrast to the spinach gene (571), lacks the ATG translation initiation codon and thus may be a pseudogene. Reading frames with homology to the genes encoding IF-2 and IF-3 have not been detected in the sequenced plastid genomes of green plants, Epifagus virginiana, or Euglena gracilis, and inhibitor experiments suggest that the Euglena genes specifying these factors are nuclear in location. However, homologs of the infB gene encoding IF-2 have been found in the chloroplast genomes of the red algae P. purpurea (514) and Galdieria sulphuraria (322). Lin et al. (359) have recently reported characterization of a cDNA clone encoding IF-3chl in Euglena gracilis. This nuclear gene appears to be present in about four copies, one of which is probably a pseudogene. The putative protein contains two acidic regions with no homology to other known sequences, in addition to a 175-amino-acid region with 31 to 37% homology to other IF-3 proteins.

Shine-Dalgarno-like sequences are present in the untranslated leader regions of many but not all chloroplast mRNAs (35, 44, 318, 532, 593, 746). Ruf and Kossel (532) reported that 37 of 41 chloroplast genes examined in tobacco have such sequences if one extends the anti-Shine-Dalgarno sequence in the 16S rRNA beyond the canonical CCUCC to include the adjacent unpaired ACUAG sequence. Bonham-Smith and

Bourque (35) observed that 181 of 196 chloroplast-encoded transcripts examined possessed a Shine-Dalgarno sequence within 100 bp 5' to the initiation codon. However, spacing of Shine-Dalgarno sequences in chloroplast mRNAs is less uniform than in bacteria. Frequency distributions of the most common individual positions potentially involved in base pairing with 16S rRNA ranged from -2 to -29, with a major peak (ca. 40%) at -7 to -8, a smaller peak at -15 to -16, and a third small peak at -21 to -23 (35, 532). Thus, chloroplast ribosomes may be able to accommodate larger distances between the ribosome recognition site and translational start sites than bacterial ribosomes. For example, in the Chlamydomonas rps12 gene, a canonical Shine-Dalgarno sequence is found at position -55 upstream of the initiator codon (364). The variability of the Shine-Dalgarno sequence raises the question of whether initiation from this sequence proceeds as in eubacteria for Shine-Dalgarno sequences close to the AUG codon and occurs by transient binding and "scanning" for more-distant Shine-Dalgarno sequences (306). In tobacco the mRNAs for those chloroplast genes lacking Shine-Dalgarno sequences either show only a trinucleotide sequence for potential base pairing (atpB) or contain out-of-frame initiator codons between the potential recognition sites and the respective in-frame start codons (rps16, rpoB, and petD [532]).

In Euglena chloroplasts, mRNA-rRNA recognition seems to proceed by somewhat different rules, because the putative anti-Shine-Dalgarno sequence CUCCC differs from the canonical CCUCC sequence and actually forms the 3' terminus of the 16S rRNA rather than being located several bases from the end (592). Since only about half of the Euglena chloroplast mRNAs contain Shine-Dalgarno sequences, two modes of initiation complex formation have been postulated (527, 677). In one class of mRNAs, complex formation is facilitated by a Shine-Dalgarno-like sequence. However, in the second class the A+U content of the region 5' to the initiator AUG is 90% or greater and this portion of the mRNA is relatively unstructured, making potential start sites in this region readily accessible to small subunits.

Koo and Spremulli (318, 319) have studied formation of initiation complexes in vitro with transcripts containing the 5' untranslated leader region of the Euglena rbcL mRNA, which is A+U rich and contains no Shine-Dalgarno sequence. Introducing a Shine-Dalgarno sequence into this region enhanced initiation only slightly. Deletion and/or modification of the leader region demonstrated that a minimum of about 20 nt is required to form the initiation complex in vitro and that the full 55-nt length is necessary for full activity in complex formation (318). The primary sequence of the region seems less important for initiation than does its length. The native 55-nt sequence has only weak secondary structure, and modification of the sequence to create increased secondary structure within about 10 nt of the AUG codon diminished formation of the initiation complex significantly (319). Koo and Spremulli concluded that the major determinant of initiation in those Euglena mRNAs with no Shine-Dalgarno sequence is presence of the AUG codon in an unstructured region of mRNA that is accessible to the 30S subunit.

Elongation

Elongation of the peptide chain requires three steps, i.e., aminoacyl-tRNA binding, peptide bond formation, and translocation, and involves three binding sites for tRNA (261, 460, 518, 690). The aminoacylated tRNA combines with elongation factor EF-Tu and GTP to form a ternary complex, which then associates with a ribosome complexed to mRNA and peptidyl-tRNA. The specific ternary complex is selected on the basis of codon-anticodon recognition at the A site and is followed by GTP hydrolysis and the release of an EF-Tu-GDP complex. Peptide bond formation takes place with transfer of the growing peptide chain to the aminoacyl-tRNA in the A site. Translocation is promoted by EF-G and GTP hydrolysis, and involves movement of the peptidyl-tRNA-mRNA complex from the A to the P site. The process is then repeated, and the deacylated tRNA moves from the P to the E site. The A and E sites themselves are allosterically linked in a negative sense so that occupation of the A site by aminoacylated tRNA reduces the affinity of the E site for deacylated tRNA and vice versa. Regeneration of the active EF-Tu-GTP complex from EF-Tu-GDP is mediated by elongation factor EF-Ts.

All three elongation factors have been characterized from Euglena chloroplasts by Spremulli and colleagues (53, 145, 173, 341, 585), and the structure of the guanine nucleotide-binding domain of EF-Tu has been modeled by Lapadat et al. (341). EF-Tu has also been purified from pea and tobacco chloroplasts (445, 589). Reading frames with homology to the bacterial genes encoding the three elongation factors EF-Tu, EF-G, and EF-Ts are absent from the three completely sequenced land plant chloroplast genomes (482, 558), but some of these genes have been retained in the plastid genomes of certain algae (see below). Two distinct nuclear genes encoding chloroplast EF-Tu have been identified in tobacco (445, 611, 661). A nuclear EF-G gene has been cloned and sequenced from soybean (650), and a partial clone obtained from pea (2). Early inhibitor experiments with Euglena gracilis indicated that EF-Ts and EF-G were nuclear gene products but that EF-Tu might be encoded in the chloroplast (52, 173).
These predictions were confirmed by identification of a chloroplast tufA gene encoding EF-Tu (429) and by failure to find genes encoding EF-Ts or EF-G in the recently completed Euglena chloroplast genome sequence (243). The Euglena tufA gene is split into three exons separated by two introns (429). An uninterrupted sequence with homology to the E. coli tufA gene has been reported from the chloroplast genome of C. reinhardtii (15, 684). The tufA gene sequence is also found in the Cyanophora cyanelle genome (32, 598) and in the chloroplast genomes of representative green algae in the families Ulvophyceae, Chlorophyceae, and Charophyceae, the latter group being the presumed ancestors of land plants (14, 15). However, tufA is absent from the chloroplast genome of the liverwort Marchantia polymorpha, representative of the earlier land plant lineages (472, 473). Baldauf and Palmer (15) concluded that transfer of this gene to the nucleus probably occurred in the charophycean lineage prior to the emergence of land plants. Reith and Munholland (514) have reported that the chloroplast genome of the red alga P. purpurea not only possesses a reading frame corresponding to tufA but also possesses one corresponding to tsf which encodes EF-Ts in prokaryotes. This gene has also been found in the chloroplast genome of the thermophilic red alga Galdieria sulphuraria (322). In prokaryotes, mutations to fusidic acid resistance can occur in the structural gene for EF-G (fus) (357). A nuclear mutation in C. reinhardtii has been reported to confer fusidic acid resistance on chloroplast EF-G, but the gene encoding this factor has not yet been identified (74; also see 247). Production of chloroplast protein synthesis factors appears to be light regulated. Spremulli and coworkers have shown that activities of Euglena IF-2, IF-3, EF-Tu, EF-G, and EF-Ts all increase on transfer of cells from dark growth to light (52, 173, 324, 585). 
In Chlamydomonas synchronous cultures, transcription rates for four chloroplast-encoded photosynthetic genes and for the tufA gene were all found to be maximal at the beginning of the light period (51, 350). However, EF-Tu mRNA decreased to almost undetectable levels in the second half of the light period. Activity of the pea chloroplast EF-G, encoded by a nuclear gene, is also light regulated but at the level of translation (1).

Termination

Termination of translation in bacteria involves the hydrolysis of peptidyl-tRNA and release of the completed protein from the ribosome when the ribosome reaches one of the three termination codons (261). Termination requires the action of two release factors, RF-1, which is specific for UAA and UAG, and RF-2, which is specific for UAA and UGA. A third release factor, RF-3, stimulates the activities of RF-1 and RF-2. The same three codons are used for translation termination in chloroplasts (35, 36), with UAA being by far the most frequent (70% in land plant sequences surveyed by Bonham-Smith and Bourque [36]) and UGA being rare (9%). UAA is also overwhelmingly preferred as the stop codon in Chlamydomonas chloroplast genes (247). Bonham-Smith and Bourque (35) noted that UGA was never used as a stop codon in Marchantia chloroplast genes and proposed that a modification of the 16S rRNA in this species prevents recognition of UGA as a termination signal. No reading frame with homology to any of the genes encoding bacterial termination factors has been identified in a chloroplast genome, nor has isolation of these factors been reported.

Chloroplast tRNAs and Aminoacyl-tRNA Synthetases

The properties of chloroplast aminoacyl-tRNA synthetases have been summarized by Steinmetz and Weil (593). These enzymes are encoded in the nucleus. Most are distinguishable from their cytoplasmic counterparts and will charge only chloroplast or prokaryotic tRNAs efficiently. These enzymes have unusually high molecular masses (75 kDa or greater) and can be found as monomers, homodimers, heterodimers, or heterotetramers depending on the enzyme.

The structure and codon recognition patterns of chloroplast tRNAs and the organization of their cognate genes have been extensively reviewed elsewhere (397-399, 593, 616). Genes encoding individual chloroplast tRNAs are highly conserved in different species of land plants and are similar in structure and sequence (ca. 70% sequence identity) to prokaryotic tRNA genes but have low homology to those of eukaryotic cells. However, the 3'-terminal CCA triplets of chloroplast tRNAs are added posttranscriptionally, as occurs for all eukaryotic cytoplasmic tRNAs but for only about one-third of bacterial tRNAs. Isoaccepting tRNAs for a given amino acid are encoded by different chloroplast genes, but these tRNAs are charged by the same chloroplast tRNA synthetases. Some chloroplast tRNA genes are preceded by prokaryotic-like promoter sequences, but such sequences are absent upstream of other chloroplast tRNA genes, which may thus possess alternative promoters, possibly internal to the coding region (227, 229, 616).

The tobacco chloroplast genome contains 30 tRNA genes, 23 of which are single and 7 of which are duplicated in the inverted repeat. Rice has the same set of tRNA genes as tobacco, but the inverted repeat extends through the tRNAHis gene, found in the single-copy region adjacent to the inverted repeat in tobacco. In liverwort there are 31 chloroplast-encoded tRNA genes, with the extra gene being tRNAArgCCG, but in Euglena gracilis there are only 27 (243, 558, 616).
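The gene-copy arithmetic implied by the preceding paragraph can be sketched as follows. This is only an illustration: the function name is hypothetical, and the rice figures (22 single-copy genes, 8 duplicated) are an inference from the statement that the rice inverted repeat extends through the tRNAHis gene; they are not given in the text.

```python
def total_gene_copies(single_copy: int, duplicated: int) -> int:
    """Count tRNA gene copies per genome; genes in the inverted repeat occur twice."""
    return single_copy + 2 * duplicated


# Tobacco: 30 distinct tRNA genes, 23 single-copy and 7 within the inverted repeat.
tobacco_copies = total_gene_copies(23, 7)

# Rice (inferred, not stated in the text): tRNAHis moves into the inverted repeat,
# leaving 22 single-copy genes and 8 duplicated ones.
rice_copies = total_gene_copies(22, 8)

print(tobacco_copies, rice_copies)  # 37 38
```

Under these assumptions the number of distinct tRNA genes is the same in both species (30), while the number of gene copies differs by one.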

Several chloroplast tRNAs have unusual features. Two different tRNAIle species are found in plant chloroplasts. The major species (tRNAIle1, encoded in the spacer between the 16S and 23S genes) recognizes the codons AUU and AUC, while a minor species (tRNAIle2) recognizes AUA. However, the gene encoding the latter tRNA contains a CAU anticodon, which normally would recognize AUG for methionine. One possible explanation is that the C residue is modified in some way posttranscriptionally. In E. coli the C of the homologous tRNA is modified to lysidine, a novel type of cytidine with a lysine residue, which allows it to recognize the AUA codon (444).
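The codon-anticodon logic behind this observation can be illustrated with a minimal sketch. It assumes strict antiparallel pairing and uses the hypothetical symbol "L" for lysidine, treated here simply as a base that pairs with A; real decoding involves more than this lookup.

```python
# Watson-Crick partners for unmodified RNA bases.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Illustrative symbol (an assumption of this sketch): "L" stands for lysidine,
# a modified cytidine that pairs with A instead of G.
MODIFIED_PAIRS = {"L": "A"}


def codon_read_by(anticodon: str) -> str:
    """Return the codon (5'->3') that pairs with an anticodon written 5'->3'."""
    pairs = {**WATSON_CRICK, **MODIFIED_PAIRS}
    # Codon and anticodon pair antiparallel, so reverse before complementing.
    return "".join(pairs[base] for base in reversed(anticodon))


print(codon_read_by("CAU"))  # AUG (methionine codon)
print(codon_read_by("LAU"))  # AUA (isoleucine codon)
```

With the unmodified C at the first anticodon position the tRNA would pair with AUG, whereas substituting the lysidine placeholder switches the pairing to AUA, which is the behavior described for the E. coli tRNA.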

One tRNAGluUUC has a special function in chlorophyll biosynthesis as well as participating in protein synthesis, while the other two species have a U*UG anticodon specific for glutamine and are converted from Glu-tRNAGln to Gln-tRNAGln by a specific amidotransferase activity present in chloroplast extracts (398, 616). This mischarging mechanism has also been described in several gram-positive bacteria (398).

The chloroplast genomes sequenced to date encode a typical initiator tRNAfMetCAU, and all employ the three classical termination codons (UAA, UAG, and UGA). However, genes for tRNAs recognizing the codons CUU/C (Leu), CCU/C (Pro), GCU/C (Ala), and CGC/A/G (Arg) are absent from the chloroplast genomes of tobacco and rice. Since all 61 sense codons are used in the three sequenced land plant chloroplast genomes, this deficit in specific tRNAs requires that the tRNAs either be imported or be read by the "two-of-three" mechanism used in animal mitochondria (174, 716) or by four-way wobble (480). In the absence of import in Euglena chloroplasts, one of the last two mechanisms would have to pertain to seven of the eight codon families (243). In land plant chloroplasts, two-of-three or four-way wobble seems to be used for tRNAAlaU*GC, tRNAProU*GG, and tRNAArgICG, which can read respectively all four alanine (GCN), proline (CCN), and arginine (CGN) codons (488). The first two tRNAs contain a modified U (U*) in the anticodon. The problem of decoding the six leucine codons is solved somewhat differently. Two of the leucyl-tRNAs translate the UUA and UUG codons (488). The remaining tRNALeuUAm7G translates all four CUN codons for leucine apparently by a U·N wobble mechanism (489).

In tobacco, rice, and liverwort, six of the chloroplast-encoded tRNA genes possess introns which must be removed from the primary transcript during processing (398). In tobacco these introns range in size from 503 bp (tRNALeuUAA) to 2,526 bp (tRNALysUUU) (616). Many land plant chloroplast tRNAs are singly transcribed, although a cotranscribed, tricistronic tRNA gene cluster has been identified in tobacco (398) and the two tRNAs found in the spacer between the 16S and 23S rRNA genes are transcribed as part of the rRNA operon precursor (see below). Cotranscription of tRNA gene operons is the usual case in Euglena gracilis. RNase activities thought to be involved specifically in tRNA processing have been identified in chloroplast extracts (225, 226, 727).

PLASTID GENES FOR rRNAs

Phylogenetic Conservation

All chloroplast genomes examined contain genes for the 16S, 23S, and 5S RNAs of the chloroplast ribosome. Table 1 lists species for which sequences have been published. Chloroplast rRNAs are highly conserved at the sequence level and are most closely related to eubacterial sequences, which include those of cyanobacteria (210, 219, 236, 512, 709). For


TABLE 1. Plastid and cyanobacterial rRNA sequences published or submitted to GenBank Taxon

16S rRNA Anabaena sp. Anacystis nidulans Antithamnion sp. Astasia longa Chlamydomonas moewusii

Chlamydomonas reinhardtii Chlorella ellipsoidea Chlorella kessleri Chlorella mirabilis Chlorella protothecoides Chlorella sorokiniana Chlorella vulgaris Conopholis americana Cryptomonas 1 Cyanidium caldarium Cyanobacteria (miscellaneous spp.) Cyanophora paradoxa Daucus carota Epifagus virginiana Euglena gracilis Euglena gracilis bacillaris

Glycine max Helianthus annuus Marchantia polymorpha Nanochlorum eucaryotum Nicotiana plumbaginifolia Nicotiana tabacum Ochromonas danica Ochrosphaera sp. Olisthodiscus luteus Oryza sativa Oscillatoria sp. Palmaria palmata Pisum sativum Porphyra purpurea Porphyridium sp. Prochloron sp. Pylaiella littoralis Pyrenomonas salina Sinapis alba Spinacia oleracea Spirodela oligorhiza Synechococcus lividus Zea mays 23S rRNA Alnus incana Anacystis nidulans Antihamnion sp. Astasia longa Chlamydomonas eugametos

Chlamydomonas frankii Chlamydomonas gelatinosa Chlamydomonas geitleri Chlamydomonas humicola Chlamydomonas indica Chlamydomonas iyengarii Chlamydomonas komma Chlamydomonas mexicana Chlamydomonas moewusii Chlamydomonas pallidostigmatica Chlamydomonas peterfii Chlamydomonas pitschmanii

GenBank accession no(s).

X59559 X03538; X00346, K01983 (partial) X54299 X14386 X15850 J01395, X03269 X12742, X05694, X03848 X65099 X65100 X65688 X65689 X16579 X58864 X56806 X52985 M63813, M63814 M62775, M62776 M64522, M64526, M64531, M64536 M19493 (partial) X73670 M81884, X62099 V00159, X12890, X05005, X70810 X00536 (partial) X07675, X06428, M37149 (partial) X73893 X04465 X76084 M82900, X70938 J01452, J01453, V00165, V00166, Z00044 X53183 X65101 M82860, X15768 X15901 X58359, X58360, X58361 (partial) Z18289 M37430 X51598 M16874, M16862, M30826 (partial) L07257, L07258 X63141 M21373, X14873, X14874 X55015 M15915, X04182 J01440, M21453 (partial) X00014, X00015 (partial) X67091, X67092, X67093 (partial) M10720, Z00028 M75722

X00512, X00343 (partial) X54299 (partial) X14386 Z17234 X68905-X68909 Z15151 X68891, X68892 X68921, X68922 X68893-X68898 X68886, X68886 X68927-X68929 X68910-X68912 X68913-X68918 X68899-X68904 X68887, X68888 Z15152

Reference(s)

356 333, 647, 697 384 569 140 137 722, 724, 725 281 281 280 280 279 702, 703 130 383 63 681-683, 692 548 287 395 435, 712, 714 217, 243, 529, 530, 549 152 110, 673 77 312, 472, 473 553 476, 729 560, 561, 645, 646 705 281 108, 109 265 698, 699 573 602 79 557, 617, 618 516 34 555, 660, 665 404, 405 382 502 56, 409 302 38 554 351 126, 334 384 569 200, 657, 658 658 658 658 658 658 658 658 658 658 658 658 658 Continued on following page


TABLE 1-Continued

Chlamydomonas reinhardtii Chlamydomonas starrii Chlamydomonas zebra Chlamydomonas sp. Chlorella ellipsoidea Coleochaete orbicularis Conopholis americana Cryptomonas (F Cyanidium caldarium Cyanophora paradoxa

Epifagus virginiana Euglena gracilis Marchantia polymorpha Nicotiana tabacum Olisthodiscus luteus Oryza sativa Palmaria palmata Pisum sativum Pylaiella littoralis Spinacia oleracea Spirodela oligorhiza Zea mays 4.5S rRNA Acorus calamus Allium tuberosum Alnus incana Apium graveolus Codium fragile Commelia communis Conopholis americana Dryopteris acuminata Gossypium hirsutum Hordeum vulgare Jungermannia subulata Ligularia calthifolia Lycopersicon esculentum Marchantia polymorpha Marsilia quadrifolia Mnium rugicum Nicotiana tabacum Oryza sativa Osmunda regalis Pisum sativum Spinacia oleracea

Spirodela oligorhiza Triticum aestivum Zea mays

Reference(s)

GenBank accession no(s).

Taxon

J01398, X01977, X16687, X16686 X68889, X68890 X68919, X68920 X68923-X68926 M36158; X05693, X03848 (partial) X52737 (partial) X59768 X14504 (partial) X54300 (partial) M19493 (partial) M81884, X62099 X13310, X12890 M13809, X04465, X01647 J01446, Z00044 X15768 (partial) X15901 Z18289 M37430 X61179, M21373 (partial) M21453, X04977 (partial) X00012, X00013 (partial) Z00028, X01365 M36166 M35406 M75719 M35404 M35276 M35407 X58863 X01523 X63124 M35405, M57605 M13808 M36165 M33098 X04465, M13809 X51641 M35056 J01446, V00161, J01891, J01451, X01277, Z00044 X15901 X51978 M37430 M10757, X04977 J01439 M10541 M19943, Z00028, X01365

346, 521 658 658 658 726 394 703 128 384

287 435, 712, 714 549, 730 312, 473 560, 561, 627 108 265 573 602 405, 581 11, 409 304 148 31 738 284 738 176 738 703 623 440 80, 738 691 31 739 472, 473, 691 421 652 560, 561, 624, 625

265 421 602

11, 332 303 696 147, 148, 601

5S rRNA Alnus incana Anacystis nidulans Astasia longa Chlamydomonas reinhardtii Chlorella ellipsoidea Conopholis americana Cyanophora paradoxa Cycas revoluta Dryopteris acuminata Euglena gracilis bacillaris Euglena gracilis Ginkgo biloba Glycine max Gossypium hirsutum Jungermannia subulata Juniperus media Lemna minor Lupinus albus Marchantia polymorpha

M75719

X00343, X00757, M23834 X14386 X03271 X04978 X58863 M32451, M33030 X12787 X00758 X00536 K02483, X12890 X51979 X16736 X63124 X00667 X02714 X65030

X00666, X04465

284 94, 125

569 550 724 703 411, 412 743 629 152 243, 296 421 23 440 728 663 144 327 472, 473, 728 Continued on following page


TABLE 1-Continued Taxon

Nicotiana tabacum Oryza sativa Picea excelsa Pelargonium zonale Pisum sativum Porphyra purpurea Porphyra umbilicalis Prochloron sp. Pylaiella littoralis Spinacia oleracea

Spirodela oligorhiza Synechococcus lividus Vicia faba Zea mays

Reference(s)

GenBank accession no(s).

J01451, M10360, M15995, X01277, Z00044 X15901 X63200 X05551 M37430 L07259, L07260

144, 265 421 146 602 516 664 380 580 112, 303 111, 663 147,

K03159, X02637 X61179 V00169, X05876 J01439 X02731 M19943, Z00028

example, primary sequence homology is generally over 70% for chloroplast or cyanobacterial 16S rRNAs compared with that of E. coli and greater than 80% for chloroplast 16S rRNA compared with cyanobacterial 16S rRNAs. Gray (219) recognized eight noncontiguous conserved primary sequences in 16S rRNA interspersed among nonconserved sequences. The predicted secondary structures of these molecules are even more conserved, and virtually all of the approximately 45 helices postulated for the E. coli 16S rRNA (62, 462) are present in chloroplast 16S rRNAs of Euglena gracilis, Chlamydomonas species, tobacco, and maize (232, 512; also see below). Compensating base substitutions are often seen on the complementary sides of predicted stem structures, strengthening the supposition that these structures are functional in vivo. Because of this high degree of structural conservation, rRNA genes have found extensive use in phylogenetic studies (78, 219, 232, 235, 236, 342, 710). Comparative analyses of 16S (54, 210) and 5S (144, 380, 663, 743) rRNA sequences support both the probable origin of chloroplasts from endosymbiotic cyanobacteria and the hypothesis that land plants derive from one branch of chlorophyte algae. Van de Peer et al. (665) have compared 16S and 18S sequences from eukaryotic, archaebacterial, eubacterial, plastid, and mitochondrial ribosomes. Although their analysis focused largely on mitochondrial origins, their data also support the common ancestry of cyanobacteria and plastids.

General Characteristics of Chloroplast rRNA Gene Organization

As in the eubacteria, chloroplast rRNA genes are normally arranged in an operon transcribed in the order 16S-23S-5S (Fig. 2) (114, 320). In land plants, including some but not all ferns, approximately 95 nt homologous to the 3' terminus of the E. coli 23S molecule constitutes a 4.5S rRNA molecule, separated from the remainder of the 23S gene by a transcribed spacer, whereas in prokaryotes, all algae so far examined, mosses, and the liverwort Marchantia polymorpha, the equivalent sequence is part of the 23S gene (47, 320). In C. reinhardtii, the sequences homologous to the 5' portion of the 23S gene of bacteria and plants are divided into 7S and 3S rRNAs, separated by short spacers that are removed from the precursor rRNA posttranscriptionally (137). The large subunit rRNA of C. eugametos comprises species (α and β) equivalent to the C. reinhardtii 7S and 3S rRNAs and two larger species (γ and δ) which together are equivalent to the remainder of the 23S molecule (656).
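The reasoning about compensating base substitutions in the discussion of phylogenetic conservation can be made concrete with a short sketch. It classifies one base-paired stem position compared between two aligned rRNAs. The function name, the category labels, and the example pairs are hypothetical, and accepting G-U alongside Watson-Crick pairs is an assumption of this sketch rather than a rule stated in the text.

```python
# Pairs treated as helix-compatible here: Watson-Crick plus G-U (assumption).
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def classify_stem_position(reference_pair, other_pair):
    """Compare one paired stem position between two aligned sequences."""
    if reference_pair == other_pair:
        return "identical"
    if other_pair not in ALLOWED_PAIRS:
        return "pairing lost"
    both_sides_changed = (
        reference_pair[0] != other_pair[0] and reference_pair[1] != other_pair[1]
    )
    # A change on both strands that keeps the pair intact is the pattern
    # taken as evidence that the helix itself is under selection.
    return "compensating" if both_sides_changed else "one-sided"


# Hypothetical example pairs, not taken from any real alignment.
print(classify_stem_position(("G", "C"), ("A", "U")))
print(classify_stem_position(("G", "C"), ("G", "U")))
print(classify_stem_position(("G", "C"), ("G", "A")))
```

A predicted stem in which many positions fall in the "compensating" class and few in "pairing lost" is the kind of pattern the comparative argument relies on.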

560, 561, 624-626

491, 492

113 601

16S rRNA

The secondary-structure model of 16S rRNA based on comparative sequence analysis (231, 232, 236, 449, 463, 468) suggests a functional division into distinct 5', central, and 3' domains, corresponding in E. coli to residues 26 to 557, 564 to 912, and 926 to 1391, respectively, followed by a "3' minor domain" from ca. 1401 to 1542 (Fig. 3; for a numbered E. coli sequence diagram in similar format to the tobacco sequence shown in Fig. 3, see references 231 and 235). Each of these domains comprises helices and loops whose secondary structure is phylogenetically conserved (219, 236). Models for the tertiary structure of the E. coli 30S subunit have been constructed based on studies of RNA-RNA and RNA-protein cross-linking, immunoelectron microscopy, and neutron diffraction (58-61, 463, 465, 596). Functional analyses involving mutants, binding of tRNA and antibiotics, and assembly of ribosomal proteins with RNA in vitro indicate that codon-anticodon recognition involves the 3' domain and terminal 3' minor domain. Three regions of the 16S molecule (E. coli nt 518 to 533, 1394 to 1408, and 1492 to 1505) that show a particularly high degree of primary sequence conservation appear to have tertiary interactions related to decoding (468). tRNA bound in the A site interacts specifically with the 3' domain and with residues in the "530 loop" (see reference 465 for review), whereas P-site-bound tRNA protects five sites in the central and 3' domains that are proposed to be clustered in

[Figure 2 diagram: three rRNA operon maps. Cyanobacteria and most algae: 16S, tRNAIle, tRNAAla, 23S, 5S. Land plants: 16S, tRNAIle, tRNAAla, 23S, 4.5S, 5S. C. reinhardtii: 16S, tRNAIle, tRNAAla, 7S, 3S, 23S 5', 23S 3', 5S.]

FIG. 2. Arrangement of the rRNA operons in land plants and algae, showing conservation of tRNAIle and tRNAAla within the spacer between the 16S and 23S genes and variation in the species that constitute the 23S molecule. In land plants, the tRNA genes are split by introns, whereas in all algae examined to date they are uninterrupted. The region corresponding to the 3' end of the eubacterial 23S rRNA is a separate 4.5S rRNA in angiosperms, gymnosperms, and some (but not all) ferns. Internal transcribed sequences and one or more introns interrupt the 23S genes of Chlamydomonas species (658).


[Figure 3 diagram: secondary-structure drawing of tobacco 16S rRNA. Recoverable labels: 5' domain, central domain, 3' domain, 3' minor domain; helix 6 and helix 17; resistance sites sr, spr, and nr.]

FIG. 3. Secondary structure of tobacco 16S rRNA, showing the major functional domains and sites of antibiotic resistance (Table 2; sr, streptomycin; spr, spectinomycin; nr, neamine/kanamycin). Sequence from the Ribosomal Database Project, courtesy of Robin Gutell.

the tertiary structure. Many of the same sites, which are all in highly conserved regions of the 16S molecule, also interact with antibiotics that block protein synthesis at the level of the 30S subunit (424, 467, 537; also see below). The principal regions in which the secondary structures of chloroplast 16S rRNAs deviate from the E. coli model are in the 5' domain between nt 198 and 220 (numbering according to the E. coli sequence [462]), where chloroplast rRNAs have a shorter helix 10 than E. coli does, and between 455 and 477, where E. coli has a well-defined helix (the upper part of helix 17 of Brimacombe et al. [59]) that is lacking in cyanobacteria and chloroplasts. Helix 6 is also shorter in chloroplast and cyanobacterial 16S rRNAs than in E. coli. Raué et al. (512) noted eight regions where secondary structure is conserved but primary sequence is highly variable. These eight regions are also sites of variation in secondary structure when the 16S rRNAs of chloroplasts and eubacteria on the one hand are compared with small-subunit rRNAs of mitochondria and eukaryotic cytoplasmic ribosomes on the other.

23S rRNA

The secondary-structure model for the E. coli 23S rRNA published by Gutell and Fox (233) consists of six domains comprising a total of 95 helices (Fig. 4; for a numbered E. coli sequence diagram in similar format to the tobacco sequence shown in Fig. 4, see references 234 and 235). Domain V is the principal site of tRNA binding to the 50S subunit (60, 426). The central loop of this domain is involved in the peptidyltransferase center and is the site of mutations conferring resistance to erythromycin, lincomycin, and chloramphenicol (see below). Some tRNA interactions are found in domains II and IV. When bound to the P site, tRNA also interacts with the 3' terminus of the 23S molecule (426). EF-G binds specifically to position 1067 in the 23S molecule, a region identified with GTP hydrolysis (465). EF-Tu protects residues in the 2660 loop. Raué et al. (512) identified 18 variable regions in 23S RNAs based on comparisons of eubacterial, organelle, archaebacterial, and eukaryotic large-subunit rRNAs, including the cyanobacterium Anacystis nidulans and chloroplast 23S rRNA from Chlorella ellipsoidea, Marchantia polymorpha, tobacco, and maize. Of these 18 variable regions, 5 are significantly different in chloroplasts compared with E. coli, while in the remaining 13 regions, chloroplast rRNAs resemble those of eubacteria but may differ from those of archaebacteria, mitochondria, and eukaryotic cytoplasmic ribosomes. Somerville et al. (581) have published a secondary-structure map of the 23S rRNA from the brown alga Pylaiella littoralis which resembles the cyanobacterial (Anacystis) molecule much more closely than it resembles those of land plants or green algae. Cladistic analysis of the 23S rRNA sequence produced a tree in which cyanobacterial and plastid sequences were clearly delineated from all other eubacterial sequences and in which the chromophyte algae (as represented by Pylaiella littoralis) and Euglena gracilis formed a common branch.
In domain I, cyanobacterial and chloroplast 23S rRNAs lack helix 8 of E. coli (nt 131 to 148, variable region V1) and have an insertion between helices 13 and 14 (E. coli nt 271 to 365, variable region V2) which can be folded into a helix (512). In domain II, variable regions V4 and V7 (nt 636 to 655 and 1020 to 1029, respectively) are highly conserved among eubacteria and chloroplasts, while 3 nt (nt 931 to 933) in E. coli V6 are replaced by a loop of 5 to 20 nt in chloroplasts. Region V8 (nt 1164 to 1185) is conserved in E. coli, Anacystis nidulans, and most chloroplast 23S rRNAs but is the site of a possible 243-nt intron in Chlorella ellipsoidea (726). Gutell and Fox (233) have suggested that this insertion may actually be a part of the rRNA rather than the only known instance of an intron inserted in a variable rRNA region. Domain III comprises variable regions V9 to V12, of which V11 (E. coli nt 1521 to 1542) is the most diverse in chloroplast 23S rRNAs. Some (but not all) chloroplast rRNAs have lost part of helix 54, and helix 55 in Chlorella ellipsoidea and Z. mays contains insertions compared with E. coli; however, in other chloroplast genomes, this helix is similar in size to that of E. coli. Domain IV shows strong conservation among eubacteria and plastid rRNAs, as does domain V. The break between the 23S rRNA and 4.5S rRNA of land plants occurs in domain VI.

5S rRNA

Their short length and relatively high degree of evolutionary conservation have made 5S rRNA molecules frequent subjects for phylogenetic studies (see, e.g., references 117, 270, 580, 663, 664). They have also proved useful for computer modeling of secondary and tertiary structure, including chemical reactivity and accessibility of bases, and possible protein binding (67, 524, 525, 693). A numbering scheme applicable to both prokaryotic and eukaryotic 5S rRNAs, proposed by Erdmann and Wolters (157), defines five loops (a to e) and five helices (A to E). In a compilation of sequences in the Berlin RNA Databank, Specht et al. (583) included representations of the common secondary structure of eukaryotic and prokaryotic 5S rRNAs, which are differentiated into five structural groups primarily on the basis of variability in one (D) of the five helices. Plastid 5S rRNAs are grouped in this classification with those of eubacteria and land plant mitochondria (mitochondria of other taxa lack 5S rRNA). Plastid and cyanobacterial 5S rRNAs are distinguished from those of most other eubacteria and mitochondria by a single-base insertion in helix C and a deleted base in loop c (157). Of the 121 nt of the typical 5S rRNA, 110 are identical in nearly all angiosperms and gymnosperms, 73 are conserved in ferns and liverworts as well, and 29 are identical in all plastids so far sequenced with a few singular exceptions. The colorless flagellate Astasia longa and the red alga P. umbilicalis are somewhat divergent compared with Euglena gracilis and P. purpurea, respectively; 4 nt are altered in one or both of the two parasitic plants Conopholis americana and Epifagus virginiana compared with all other angiosperms; and the sequence submitted to GenBank for cotton, Gossypium hirsutum (440), is missing 2 nt but is otherwise identical to that of tobacco in all but two residues. Vogel et al. (671) reported that 5S rRNA from spinach chloroplasts could be incorporated into biologically active 50S ribosomal subunits assembled in vitro from Bacillus stearothermophilus proteins and 23S rRNA.

Introns in rRNA Genes

A survey of 23S rRNA genes from 17 Chlamydomonas species representing most of the taxonomic groups defined on morphological and biochemical grounds (159, 538) revealed a total of 39 group I introns inserted at 12 different positions, some of which were unique to Chlamydomonas species (656-658). However, no correlation was found between intron distribution and a phylogeny for these 17 species based on primary sequence of their 23S genes. Most of the intron insertion sites identified in this study are in highly conserved regions of the genome, which tend to be exposed in the assembled ribosome. This is also true of the single intron in the 16S gene of C. moewusii, which lies within the 530 loop, a part of the translational fidelity domain. In contrast, internal transcribed spacers, which have also been identified in rRNA genes of bacteria and organelles, occur within regions of variable primary sequence and secondary structure (224). When these sequences are processed out of the pre-rRNA molecule, the mature sequence is not religated, resulting in a fragmented rRNA. Three internal transcribed spacers, found at equivalent positions in the Chlamydomonas taxa studied by Turmel et al. (656-658), result in fragmentation of the 23S rRNA into four mature rRNA species, α, β, γ, and δ. The single group I intron in the 23S gene of C. reinhardtii


[Figure 4 diagram, 5' half of tobacco 23S rRNA. Recoverable labels: domains I, II, and III; variable regions V2, V3, V6, V7, V9, and V12; GTPase center.]

FIG. 4. Secondary structure of tobacco 23S rRNA, showing major functional domains and sites of antibiotic resistance (Table 2). Variable regions are numbered according to the system of Raué et al. (512). Sequence from the Ribosomal Database Project, courtesy of Robin Gutell.


[Figure 4 diagram, 3' half of tobacco 23S rRNA. Recoverable labels: domains IV and VI; variable regions V5, V13, and V16; peptidyl transferase; 5' end of the 4.5S rRNA.]

(522) is mobile (142) and encodes a double-stranded nuclease (I-CreI) which has been purified and shown to have a 19- to 24-bp recognition sequence in this gene (143, 638). The enzyme makes a 4-bp staggered cut just downstream of the intron insertion site and will tolerate single- and even multiple-base-pair changes (143). This intron can undergo autocatalytic splicing in vitro (637). The ac-20 nuclear gene mutant of C. reinhardtii, which was initially characterized as deficient in chloroplast ribosomes (see reference 247 and references therein), has been found to accumulate unspliced precursor rRNA molecules, as well as unspliced precursors of the chloroplast-encoded psbA gene (258, 259).


The 23S rRNA gene of C. eugametos, a species now thought to be only distantly related to C. reinhardtii, contains six group I introns and three internal transcribed spacers (657). One optional 955-bp group I intron in the 23S gene of C. eugametos also appears to be mobile and is transmitted to all progeny of crosses with the interfertile species C. moewusii, which lacks this intron (347). The 402-bp group I intron in the 16S rRNA gene of C. moewusii likewise can be transmitted in crosses to isolates of the sibling species C. eugametos that lack this intron (140). Transmission of these introns is often accompanied by coconversion of flanking DNA polymorphisms. The mobile intron in the 23S gene encodes a double-stranded DNA endonuclease activity (I-CeuI) which has a 19-bp recognition site centered around the insertion site. I-CeuI produces a staggered cut 5 bp down from the insertion site (200, 406, 407). The 23S rRNA gene of C. humicola has a group I intron, ChLSU-1, inserted at a site in the peptidyltransferase loop and encoding a putative 218-amino-acid endonuclease (96). Introns have been found at this site only in a few Chlamydomonas species (658). Turmel et al. (658) discuss the alternative possibilities for transfer of group I introns from one site to another within a genome. Intron-encoded endonucleases could effect such a transfer at the DNA level (139); alternatively, a reversal of self-splicing followed by reverse transcription of the recombined RNA could occur, followed by integration into DNA by homologous recombination. The latter mechanism requires only a short target site that can pair with the 5' intron sequence called the internal guide sequence (718) and would be consistent with the position of intron insertion sites in exposed rRNA regions in the ribosome in the Chlamydomonas species examined by Turmel et al. (658).

The 16S-23S Spacer

The spacer regions between the 16S and 23S rRNA genes in chloroplasts and cyanobacteria contain tRNAIle(GAU) and tRNAAla(UGC), as do the E. coli rrnA, rrnD, and rrnH operons (301, 436, 697, 735). In E. coli and cyanobacteria, the 16S-23S spacer is short (