Borrowed proteins in bacterial bioluminescence

Proc. Natl. Acad. Sci. USA Vol. 88, pp. 1100-1104, February 1991 Biochemistry

(cloning/lumazine protein/riboflavin synthetase/yellow fluorescence protein)

DENNIS J. O'KANE*, BONNIE WOODWARD†, JOHN LEE*, AND DOUGLAS C. PRASHER†‡

*Department of Biochemistry, University of Georgia, Athens, GA 30602; and †Department of Biology, Woods Hole Oceanographic Institute, Woods Hole, MA 02543

Communicated by W. D. McElroy, October 22, 1990

ABSTRACT A library of Photobacterium phosphoreum DNA in λ2001 was screened for the lumazine protein gene, using two degenerate 17-mer oligonucleotide probes that were deduced from a partial protein primary sequence. The lumazine protein gene was localized to a 3.4-kilobase BamHI/EcoRI fragment in one clone. The fragment contained an open reading frame, encoding a 189-residue protein, that had a predicted amino acid sequence that concurred with the partial sequence determined for lumazine protein. Considerable sequence similarity was detected between lumazine protein, the yellow fluorescence protein from Vibrio fischeri, and the α subunit of riboflavin synthetase (EC 2.5.1.9). A highly conserved sequence in lumazine protein corresponds to the proposed lumazine binding sites in the α subunit of riboflavin synthetase. Several secondary structure programs predict the conformation of this site in lumazine protein to be a β-sheet. A minimal model with three interactions between the ligand and this β-sheet structure is proposed, which is consistent with the results of NMR and ligand binding studies.

Bioluminescence emission accompanies oxidation of aliphatic aldehydes by bacterial luciferase (Lase), a flavoprotein monooxygenase (EC 1.14.14.3). Decomposition of a Lase-bound aldehyde-peroxyflavin adduct is postulated to form 4a-hydroxy-FMN in the excited state (1), which results in light emission upon relaxation to the ground state (λmax = 490-495 nm). However, in vivo bioluminescence can be at either longer or shorter wavelengths than that observed from purified Lase (2-9). Yellow light emission from Vibrio fischeri strain Y1 (λmax = 542 nm in vivo; λmax = 495 nm with purified Lase) is attributed to the interaction of Lase with yellow fluorescence protein (YFP) (7, 8). YFP contains noncovalently bound, fluorescent FMN (8, 9). Bioluminescence from some strains of Photobacterium phosphoreum and Photobacterium leiognathi is shifted to higher energy (λmax = 486-475 nm) than that observed with the purified Lase (λmax = 495 nm) (4-6, 10). These bacteria utilize a different protein emitter, which has a noncovalently bound fluorescent pteridine: 6,7-dimethyl-8-(1'-D-ribityl)lumazine (Lum) (11). The corrected fluorescence emission spectrum (λmax = 475 nm) of the purified in vivo emitter protein (lumazine protein; LumP) and the corrected in vivo bioluminescence emission of the most hypsochromically shifted Photobacterium strains are identical (5, 6). Addition of physiologically relevant concentrations of LumP to Lase in vitro alters the kinetics, increases the quantum efficiency, and hypsochromically shifts the bioluminescence emission (5, 6).

How LumP and YFP originated for use in bioluminescence is not known. Seliger (12, 13) proposed that bioluminescence evolved from preexisting oxidative enzyme systems after evolution of photoreceptors. The quantum efficiency of such "proto-bioluminescent" systems would be increased by mutation and natural selection if a luminescence capacity provided a survival value (12). A corollary to this postulate is that wavelength-shifting in vivo emitter proteins would evolve following the existence of an efficient Lase. Perhaps the genes encoding LumP and YFP were borrowed, in the sense of being appropriated or derived, from an ancestral gene that encoded a protein having a different function, but with the capacity to bind fluorescent ligands. In this regard, Lum is bound by the two subunits of riboflavin synthetase (EC 2.5.1.9). The riboflavin synthetase β subunit (RSβ) synthesizes Lum, but its sequence (14) shares no similarity with the N-terminal regions of LumP or YFP. Two molecules of Lum are tightly bound by the riboflavin synthetase α subunit (RSα), which synthesizes riboflavin. Despite some differences, such as the number of ligand sites per monomer (15, 16), the similarity of monomer molecular weights of LumP, YFP, and RSα (8, 17, 18) and the binding patterns for modified ligands and inhibitors by LumP (19) and RSα suggest an evolutionary relationship. The recent availability of the amino acid sequences of RSα (18) and YFP (20) and the determination of the primary structure of LumP in this paper§ allow an evaluation of the relatedness of these proteins and permit predictions about apoprotein/ligand interactions, as well as speculation about the evolution of these emitters.

METHODS

Cleavage of LumP and Amino Acid Sequencing. LumP from P. phosphoreum A13 was purified to homogeneity by combinations of liquid chromatography and HPLC procedures (10). LumP was treated with 2-vinylpyridine (21). CNBr cleavage (30-fold molar excess in 70% formic acid for 18 hr at room temperature) produced one high molecular weight fragment (Mr 15,000, designated PE-CNBr). LumP (100 µg in 0.1 ml) was also digested with 2% (wt/wt) L-1-tosylamido-2-phenylethyl chloromethyl ketone-treated trypsin or Nα-(p-tosyl)lysine chloromethyl ketone-treated chymotrypsin at room temperature for 18-24 hr in 20 mM NH4HCO3 at pH 8.25. The resulting peptides were dissolved in 0.1% trifluoroacetic acid and were separated on a 0.21 x 22 cm column of Aquapore RP 300 (Brownlee Lab). Peptides were eluted with an increasing gradient of CH3CN (0-70%, vol/vol) containing 0.1% trifluoroacetic acid at 50 µl/min. Sequences were determined by using an Applied Biosystems model 470A gas-phase sequenator.

Cloning and DNA Sequencing the LumP Gene. Genomic P. phosphoreum A13 DNA fragments (20-30 kilobases) from a Sau3A partial digest were ligated into BamHI-digested λ2001

Abbreviations: Lase, bacterial luciferase; YFP, yellow fluorescence protein; Lum, 6,7-dimethyl-8-(1'-D-ribityl)lumazine; LumP, lumazine protein; RSβ, riboflavin synthetase β subunit; RSα, riboflavin synthetase α subunit.
‡To whom reprint requests should be addressed.
§The sequence reported in this paper has been deposited in the GenBank data base (accession no. M38364).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.


"arms" (22). The recombinant frequency was 100%, determined by using strains Q358 and Q359 (23). Two degenerate 17-mer oligonucleotide probes were designed by using nucleotide sequences predicted to encode amino acid residues 1-6 (Met-Phe-Lys-Gly-Ile-Val) and residues 60-65 (Tyr-Phe-Asp-Ile-Asp-Gln). The sequence of the LumP1-6 probe was 5'-AC(A/G/T) ATN CC(T/C) TT(A/G) AAC AT-3' and the sequence of LumP60-65 was 5'-TT(A/G) TC(A/G/T) AT(A/G) TC(A/G) AA(A/G) TA-3'. The genomic λ2001 library was screened with the pooled, electrophoretically purified, 32P-labeled probes at 51°C according to the method of Wood et al. (24). Eighty-five signals were identified upon screening 10⁴ phage from the library. The DNA from three isolates (λ1A, λ63A, and λ75B) continually yielded strong hybridization signals during their purification. Isolate λ75B contained a 3.4-kilobase BamHI/EcoRI fragment that hybridized to both probes. This fragment was subcloned into M13mp18 and M13mp19 for DNA sequence analysis. A series of unique oligonucleotide primers, based on DNA sequence within the fragment, was used to prime the reaction. DNA sequencing was performed by using an altered T7 DNA polymerase (Sequenase version 2.0; United States Biochemical) in the dideoxy chain termination method (25) with deoxyadenosine 5'-[α-[35S]thio]triphosphate (>1000 Ci/mmol; 1 Ci = 37 GBq; Amersham).

Sequence Alignment and Secondary Structure Predictions. Amino acid sequences were aligned (26) using a gap penalty of 5 and 200 randomized sequence pairs for comparison. The standardized score represents the number of standard deviations by which the alignment score exceeded the mean score for the randomized pairs and is significant if >3.0. Normalized alignment scores were also calculated (27). Genetically conservative amino acid replacements were defined as those requiring a minimal single base change per codon. Structurally conservative replacements were defined as those with a value of 5 in the structure-genetic matrix of Feng et al. (28). Secondary structure was predicted using three programs: two based upon the Chou-Fasman criteria (29, 30) and the third using the Garnier-Osguthorpe-Robson algorithm (31) (IntelliGenetics).

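As a rough check on the probe design described under Cloning in Methods, the LumP1-6 probe can be expanded into its pool of concrete oligonucleotides and each one reverse-complemented to see what it would encode. The sketch below is illustrative only and is not the authors' procedure: the helper names (`degeneracy`, `expand`, `reverse_complement`, `translate_complete_codons`) are hypothetical, the codon table holds only the codons this check needs, and it assumes the probe is written as the antisense strand, which is what its sequence suggests.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Only the codons that can occur in this check; not a full genetic code table.
CODON_TABLE = {
    "ATG": "M",
    "TTT": "F", "TTC": "F",
    "AAA": "K", "AAG": "K",
    "GGA": "G", "GGC": "G", "GGG": "G", "GGT": "G",
    "ATA": "I", "ATC": "I", "ATT": "I",
}

# LumP1-6 probe from Methods, 5' to 3': AC(A/G/T) ATN CC(T/C) TT(A/G) AAC AT.
# Each entry lists the bases allowed at that position (N = any base).
LUMP_1_6_PROBE = ["A", "C", "AGT", "A", "T", "ACGT", "C", "C", "TC",
                  "T", "T", "AG", "A", "A", "C", "A", "T"]


def degeneracy(probe):
    """Number of distinct oligonucleotides in the degenerate pool."""
    count = 1
    for allowed in probe:
        count *= len(allowed)
    return count


def expand(probe):
    """Every concrete sequence in the pool."""
    return ["".join(bases) for bases in product(*probe)]


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate_complete_codons(seq):
    """Translate in frame 1, ignoring a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Sense-strand targets that the probe pool could hybridize to.
targets = [reverse_complement(oligo) for oligo in expand(LUMP_1_6_PROBE)]
peptides = {translate_complete_codons(target) for target in targets}
print(len(LUMP_1_6_PROBE), degeneracy(LUMP_1_6_PROBE), peptides)
```

Under these assumptions the pool contains 48 oligonucleotides of 17 bases, and every target strand encodes Met-Phe-Lys-Gly-Ile followed by GT, the first two bases of a valine codon, which agrees with residues 1-6 given in Methods.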
RESULTS

Forty-nine residues were sequenced directly from the N terminus of LumP. The single internal methionine was located at residue 42. The PE-CNBr sequence overlapped the N-terminal sequence by 7 residues and extended the sequence to residue 75 (Fig. 1). A number of smaller tryptic peptides were obtained (numbered in order of elution from the HPLC column). Those that yielded unambiguous sequences are shown in Fig. 1. One tryptic peptide (T4) produced a major sequence (600 pmol first residue) and a minor sequence (T4m) that was discernible at the 100-pmol level. Peptide T7 was prepared twice from two independent digestions to verify the presence of the tryptophan residue. This peptide was produced at an unexpected cleavage site (C terminus of Leu). Three chymotryptic peptides overlapped with tryptic peptides, resulting in sequencing ≈75% of the amino acid residues in LumP (Fig. 1).

FIG. 1. Nucleotide sequence of the LumP gene and the predicted amino acid sequence of LumP. Sequenced tryptic (T) and chymotryptic (C) peptides are indicated beneath the predicted sequence. Numbers above the nucleotide sequence indicate distance from the BamHI restriction nuclease site. Numbers at the right indicate the amino acid residue position.

Nucleotide Sequence of the LumP Gene. The entire BamHI/EcoRI DNA fragment, which hybridized with both oligonucleotide probes, is 3403 nucleotides long and contains the open reading frame encoding LumP nearly centered. The methionine initiation codon is located at nucleotides 1728-1730 with respect to the BamHI restriction site (Fig. 1). The open reading frame encodes a protein of 189 amino acids before a TAA stop codon (nucleotides 2296-2299). A potential ribosome-binding site (32) is located 7 nucleotides upstream of the initiating codon. Potential -10 and -35 promoter sites were identified by visual inspection. Complete agreement was found between the partial amino acid sequence determined by gas-phase sequencing and the amino acid sequence predicted from the nucleotide sequence (Fig. 1). A second open reading frame begins 645 nucleotides upstream from the LumP gene and is transcribed in the opposite direction (Fig. 2). It continues to the BamHI site and can encode 360 amino acids of a protein. The predicted amino acid sequence is homologous with luxC of Vibrio harveyi (33, 34).

FIG. 2. Restriction map of the λ75B BamHI/EcoRI fragment containing the LumP gene. Directions of transcription of the open reading frames are indicated by the large arrows.
[Map labels: BamHI, BstEII, AvaII, BglII, NruI, EcoRI; scale, 1 to 3403 nucleotides; open reading frames, ORF2 and LumP.]

Table 1. Alignment parameters for protein sequences

                              Conservative replacements*   Gaps per   Alignment scores
Comparison       Identical    Genetic      Structural      100        SS†      NAS‡
                 residues                                  residues
LumP/RSα         51           75           32              1.6        9.2      230
LumP/RSβ         65           36           36              3.3        1.0      75
LumP/YFP         69           78           38              1.1        20.7     339
YFP/RSα          57           66           23              1.5        9.7      255
YFP/RSβ          ND           26           ND              4.1        1.9      36
N-LumP/C-LumP    21           48           22              2.1        7.6      ND
N-RSα/C-RSα      28           43           19              3.1        7.2      ND
N-YFP/C-YFP      21           39           14              2.1        4.6      ND
N-LumP/N-RSα     12           29           37              3.1        8.0      ND
N-LumP/N-YFP     40           36           18              2.1        15.9     ND
N-YFP/N-RSα      29           31           10              2.1        7.3      ND
C-LumP/C-RSα     23           38           19              3.1        4.4      ND
C-LumP/C-YFP     29           42           21              3.1        10.9     ND
C-YFP/C-RSα      28           35           13              3.1        5.9      ND
N-LumP/C-RSα     23           40           19              3.1        4.8      ND
N-LumP/C-YFP     21           44           16              3.1        5.7      ND
C-LumP/N-RSα     22           45           23              4.2        7.3      ND
C-LumP/N-YFP     20           42           18              2.2        6.6      ND
C-YFP/N-RSα      22           46           17              3.1        6.7      ND
C-RSα/N-YFP      45           19           23              2.0        4.8      ND

ND, not determined; N, amino portion; C, carboxyl portion.
*Conservative replacements are defined in Methods.
†SS, alignment score for the protein sequence comparison minus the mean alignment score for the randomized sequences, divided by the standard error for the randomized sequences.
‡NAS, normalized alignment score.

A
YFP     1 MFKGIVEGIGIIEKIDIYTDLDKYAIRFPENMLNGIKKESSIMFNGCFLTVTSVNSNIVW 60
LumP    1 MFKGIVQGVGIIKKISKNDDTQRHGITFPKDILDSVEKDTVMLVNGCSVTVVRITGDVVY 60
RSα     1 MFTGIIEETGTI ESMKKAGHAMALTIKCSK ILEDVHLGDSIAVNGICLTVTDFTKNQFT 59

YFP    61 FDIFEKEARKLDTFREYKVGD RVNLG TFPKFGAASGGHILSARISCVASIIEIIENEDYQ 120
LumP   61 FDI DQAINTTTFRKLEVGN KVNLE VRPGFGSLLGKGALTGNIKGVATVDNITEEEDLL 118
RSα    60 VDVMP ETVKATSLNDLTKGS KVNLE RAMAANGRFGGHFVSGHVDGTAEITRIEEKSNAV 118

YFP   121 QMWIQIPENFTEFLIDKDYIAVDGISLTIDTIKNNQFFISLPLKIAQNTNMKWRK 175
LumP  119 KVYIKIPKDLIENISSEDHIGINGVSNSIEEVSNDIICINYPKNLSITTNLGTLE 173
RSα   119 YYDLKMDPSLTKTLVLKGSITVDGVSLTIFGLTEDTVTISLIPHTISETIFSEKT 173

YFP   176 KGD KVNVE LSNKINANQCW 194
LumP  174 TGS EVNVE TLNVSNEW 189
RSα   174 IGS KVNIE CDMIGKYMYRFLHKANENKTQ 202

B
YFP1    1 MFKGIVEGIGIIEKIDIYTDLDKYAIRFPENMLNGIKKESSIMFNGCFLTVTSVNSNIVW 60
YFP2   99 ILSARISCVASIIEIIENEDYQQMWIQIPENFTEFLIDKDYIAVDGISLTIDTIKNNQFF 159
LumP1   1 MFKGIVQGVGIIKKISKNDDTQRHGITFPKDILDSVEKDTVMLVNGCSVTVVRITGDVVY 60
LumP2  97 ALTGNIKGVATVDNITEEEDLLKVYIKIPKDLIENISSEDHIGINGVSNSIEEVSNDIIC 157
RSα1    1 MFTGIIEETGTIESMKKAGHAMALTIKCSKIL EDVHLGDSIAVNGICLTVTDFTKNQFT 59
RSα2   97 FVSGHVDGTAEITRIEEKSNAVYYDLKMDPSLTKTLVLKGSITVDGVSLTIFGLTEDTVT 156

YFP1   61 FDIFEKEARKLDTFREYKVGD RVNLG TFPKFGAASGGH 98
YFP2  160 ISLPLKIAQN TNMKWRKKGD KVNVE LSNK ANQCW 194
LumP1  61 FDI DQAINTTTFRKLEVGN KVNLE VRPGFGSLLGKG 96
LumP2 158 INY PKNLSITTNLGTLETGS EVNVE TLNVSNEW 189
RSα1   60 VDVMP ETVKATSLNDLTKGS KVNLE RAMAANGRFGGH 96
RSα2  157 ISLIP HTISETIFSEKTIGS KVNIE CDMIGKYMYRFLHKANENKTQ 202

[Identity markers and consensus rows omitted.]

FIG. 3. (A) Alignment of the amino acid sequence of LumP with those of YFP and RSα. Asterisks between the YFP and LumP sequences and between the LumP and RSα sequences indicate identical residues in these respective sequences. Solid squares below the RSα sequence indicate identical residues in the YFP sequence. (B) Internal homology in YFP, LumP, and RSα. The homologous N and C termini were separately aligned, and subsequently heterologous alignments of the C and N termini were performed to produce the composite alignment shown. Identical residues in the aligned protein halves are indicated by asterisks above the residues. A consensus sequence is indicated below the residues, where h indicates a hydrophobic residue.

Sequence Similarities. A search of protein data banks (EMBL Release 19.0, GenBank Release 61, PIR Update 21.0, and Swiss-Prot Version 12) revealed no significant sequence similarity between LumP and other proteins; however, significant sequence similarity was found with two proteins not yet found in the data banks: RSα and YFP (18, 20). The appropriate statistical parameters are collected in Table 1. A 9- or 10-residue deletion near the C terminus of LumP is predicted, but the precise position of this gap is uncertain (compare the alignment in Fig. 3A with that in Fig. 3B). No significant sequence similarity was detected between LumP and RSβ (Table 1). Slightly greater statistical significance was

obtained for the alignment of YFP and RSα, but no statistically significant sequence similarity was found with RSβ (Table 1). Although many of the identical residues in LumP, YFP, and RSα are scattered throughout the protein sequences, two highly conserved sequences were found (boxed regions in Fig. 3A), which correspond to the proposed ligand binding sites in RSα (18). Paired homologous and heterologous N- and C-terminal halves of the proteins were aligned to reveal internal sequence similarities (Fig. 3B), as demonstrated previously for RSα (18). The resulting alignment parameters are collected in Table 1. These alignments were statistically significant in all combinations (4.4 ≤ standardized score ≤ 15.9) and are consistent with a gene duplication model (18). A consensus sequence was obtained for several residues: the most conserved sequence was Lys-Val-Asn-Leu-Glu, the proposed Lum binding site (18).

Modeling the Lum Binding Site in LumP. Secondary structure predictions for LumP differed markedly for the three different programs, including two versions of the Chou-Fasman program. Consensus was achieved for only about 50% of the residues (data not shown). The most significant finding was that the two conserved regions in LumP (Fig. 3A) were assigned consensus β-sheet structures and were preceded by a region of ambiguous structure. Constraining Val-81 to Glu-84 to a β-sheet structure, preceded by Lys-80, permitted modeling of the potential interaction between Lum and apoLumP using space-filling models (Fig. 4). A strong H bond between N(3)-H of Lum (35) and the side chain of Asn-82 was used to orient the ligand. Slight rotation of the Lum molecule permitted an H bond between the C(2')-OH in the ribityl tail (19) and the side chain of Glu-84. Simultaneously, this rotation permitted a potential ionic interaction between the side chain of Lys-80 and a partial negative charge on C(4)=O of the Lum pyrimidine ring (35).

DISCUSSION

It is not uncommon to find two different proteins that have evolutionarily conserved regions. The sequence similarities of cellular growth factors with oncogene products (36) and of transferrin with a melanoma tumor antigen (37) are two of the many examples where structural motifs have been borrowed (appropriated) from one source and incorporated elsewhere. The only function common to LumP and RSα is the binding of Lum. Two highly conserved sequences in RSα have been proposed to be the Lum binding sites (18). Of the two conserved regions found in LumP, one is identical with that found in residues 80-84 of RSα: Lys-Val-Asn-Leu-Glu. A minimal model (Fig. 4) using this sequence can accommodate three of the four bonds inferred by ligand binding studies (19, 35). The replacement of Lys-177 in RSα by Glu-177 in LumP would substantially weaken potential Lum binding with this second conserved segment in LumP. The proposed Lum binding site model cannot yet be directly extrapolated to YFP, since the appropriate ligand-protein interaction data are unavailable. However, the model can be readily tested by reconstitution of apoYFP with Lum derivatives (19). In support of the Lum binding site model, it should be mentioned that the putative Lum binding site near the C terminus of YFP is virtually identical to the proposed Lum binding site near the C terminus of RSα. If the Lum binding site model is substantiated by site-directed mutagenesis and ligand binding studies, then LumP and YFP would be orthologous proteins.

We thank Dr. J. Wunderlich (University of Georgia) for operating the gas-phase sequenator and Dr. R. Ridge (Woods Hole Oceanographic Institute) for preparing the DNA sequencing primers. D.J.O. was a Visiting Scientist (May 1989) at Woods Hole Oceanographic Institute. This work was supported by grants from the National Institutes of Health to J.L. (GM-28139) and from the American Cancer Society (NP-640) to D.C.P.

FIG. 4. Model for proposed ligand binding site. A space-filling model of the consensus sequence Lys-Val-Asn-Leu-Glu (residues 80-84) was assembled in a β-sheet conformation. The side chains of Lys, Asn, and Glu protrude from the surface of the protein, while the side chains of Val and Leu are oriented down into the protein (not observable from this orientation). The numbers are placed on the corresponding N and C atoms of the Lum ring, with the exception of 2', which is placed on the oxygen. The D-ribityl tail is shown in an arbitrary conformation. The dashed lines indicate proposed H bonds. A potential ionic interaction between the ε-NH3+ of Lys-80 and C(4)=Oδ- is indicated.
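The consensus Lys-Val-Asn-Leu-Glu can be recovered by a per-column majority vote over the six conserved segments shown in Fig. 3B. The following is a minimal sketch under stated assumptions: the segments are transcribed from the boxed region of Fig. 3B, the function name `column_consensus` is hypothetical, and a simple majority rule stands in for whatever procedure the authors used, which the paper does not specify.

```python
from collections import Counter

# Conserved five-residue segments as read from Fig. 3B (assumed transcription).
CONSERVED_SEGMENTS = {
    "YFP1": "RVNLG",
    "YFP2": "KVNVE",
    "LumP1": "KVNLE",
    "LumP2": "EVNVE",
    "RSa1": "KVNLE",
    "RSa2": "KVNIE",
}


def column_consensus(segments):
    """Return the majority residue per column and how many segments share it."""
    if len({len(segment) for segment in segments}) != 1:
        raise ValueError("segments must all have the same length")
    consensus, support = [], []
    for column in zip(*segments):
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue)
        support.append(count)
    return "".join(consensus), support


consensus, support = column_consensus(list(CONSERVED_SEGMENTS.values()))
print(consensus, support)
```

With these six segments the vote gives KVNLE, with 4, 6, 6, 3, and 5 of the six segments agreeing at the respective positions, so Val and Asn are invariant while the fourth position varies among Leu, Val, and Ile.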

1. Kurfuerst, M., Macheroux, P., Ghisla, S. & Hastings, J. W. (1987) Biochim. Biophys. Acta 904, 104-110.
2. Seliger, H. H. & Morton, R. A. (1968) in Photophysiology, ed. Giese, A. C. (Academic, New York), Vol. 4, pp. 253-314.
3. Ruby, E. G. & Nealson, K. H. (1977) Science 196, 432-434.
4. Gast, R. & Lee, J. (1978) Proc. Natl. Acad. Sci. USA 75, 833-837.
5. Lee, J. (1982) Photochem. Photobiol. 36, 689-697.
6. O'Kane, D. J., Karle, V. A. & Lee, J. (1985) Biochemistry 24, 1461-1467.
7. Leisman, G. & Nealson, K. H. (1982) in Flavins and Flavoproteins, eds. Massey, V. & Williams, C. H. (Elsevier, New York), pp. 383-386.
8. Daubner, S. C., Astorga, A. M., Leisman, G. B. & Baldwin, T. O. (1987) Proc. Natl. Acad. Sci. USA 84, 8912-8916.
9. Macheroux, P., Schmidt, K. U., Steinerstauch, P., Ghisla, S., Colepicolo, P., Buntic, R. & Hastings, J. W. (1987) Biochem. Biophys. Res. Commun. 146, 101-106.
10. O'Kane, D. J. & Lee, J. (1986) Methods Enzymol. 133, 149-172.
11. Koka, P. & Lee, J. (1979) Proc. Natl. Acad. Sci. USA 76, 3068-3072.
12. Seliger, H. H. (1975) Photochem. Photobiol. 21, 355-361.
13. Seliger, H. H. (1987) Photochem. Photobiol. 45, 291-297.
14. Ludwig, H. C., Lottspeich, F., Henschen, A., Ladenstein, R. & Bacher, A. (1987) J. Biol. Chem. 262, 1016-1021.
15. O'Kane, D. J. & Lee, J. (1985) Biochemistry 24, 1467-1475.
16. Otto, M. K. & Bacher, A. (1981) Eur. J. Biochem. 115, 511-517.
17. O'Kane, D. J. & Lee, J. (1985) Biochemistry 24, 1484-1488.
18. Schott, K., Kellermann, J., Lottspeich, F. & Bacher, A. (1990) J. Biol. Chem. 265, 4204-4209.
19. O'Kane, D. J., Lee, J., Kohnle, A. & Bacher, A. (1990) in Pteridines and Folic Acid Derivatives, eds. Curtius, H.-C., Ghisla, S. & Blau, N. (de Gruyter, Berlin), pp. 457-461.
20. Baldwin, T. O., Treat, M. L. & Daubner, S. C. (1990) Biochemistry 29, 5509-5515.
21. Inglis, A. S. (1983) Methods Enzymol. 91, 26-36.
22. Karn, J., Matthes, H. W. D., Gait, M. J. & Brenner, S. (1984) Gene 32, 217-224.
23. Karn, J., Brenner, S., Barnett, L. & Cesareni, G. (1981) Proc. Natl. Acad. Sci. USA 77, 5172-5176.
24. Wood, W. I., Gitschier, J., Lasky, L. A. & Lawn, R. M. (1985) Proc. Natl. Acad. Sci. USA 82, 1585-1588.
25. Sanger, F., Nicklen, S. & Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
26. Murata, M. (1988) Comput. Chem. 12, 21-25.
27. Doolittle, R. F. (1981) Science 214, 149-159.
28. Feng, D. F., Johnson, M. S. & Doolittle, R. F. (1985) J. Mol. Evol. 21, 112-125.
29. Chou, P. & Fasman, G. D. (1974) Biochemistry 13, 222-245.
30. Chou, P. & Fasman, G. D. (1978) Adv. Enzymol. 47, 45-147.
31. Garnier, J., Osguthorpe, D. J. & Robson, B. (1978) J. Mol. Biol. 120, 97-120.
32. Shine, J. & Dalgarno, L. (1974) Proc. Natl. Acad. Sci. USA 71, 1342-1346.
33. Prasher, D. C., O'Kane, D. J., Woodward, B. & Lee, J. (1990) Nucleic Acids Res. 18, 6450.
34. Miyamoto, C. M., Graham, A. F. & Meighen, E. A. (1988) Nucleic Acids Res. 16, 1551-1562.
35. Vervoort, J., O'Kane, D. J., Muller, F., Bacher, A., Strobl, G. & Lee, J. (1990) Biochemistry 29, 1823-1828.
36. Doolittle, R. F., Hunkapiller, M. W., Hood, L. E., Devare, S. G., Robbins, K. C., Aaronson, S. A. & Antoniades, H. M. (1983) Science 221, 275-276.
37. Rose, T. M., Plowman, G. D., Teplow, D. B., Dreyer, W. J., Hellstrom, K. E. & Brown, J. P. (1986) Proc. Natl. Acad. Sci. USA 83, 1261-1265.