Nucleic Acids Research, 2003, Vol. 31, No. 12, 3071–3077. DOI: 10.1093/nar/gkg433

The nicking homing endonuclease I-BasI is encoded by a group I intron in the DNA polymerase gene of the Bacillus thuringiensis phage Bastille

Markus Landthaler and David A. Shub*

University at Albany, SUNY, Department of Biological Sciences and Center for Molecular Genetics, Albany, NY, USA

Received March 10, 2003; Revised and Accepted May 2, 2003

ABSTRACT

Here we describe the discovery of a group I intron in the DNA polymerase gene of Bacillus thuringiensis phage Bastille. Although the intron insertion site is identical to that of the introns of Bacillus subtilis phages SPO1 and SP82, the Bastille intron differs from them substantially in primary and secondary structure. Like the SPO1 and SP82 introns, the Bastille intron encodes a nicking DNA endonuclease of the H-N-H family, I-BasI, with a cleavage site identical to that of the SPO1-encoded enzyme I-HmuI. Unlike I-HmuI, which nicks both intron-minus and intron-plus DNA, I-BasI cleaves only intron-minus alleles, which is a characteristic of typical homing endonucleases. Interestingly, the C-terminal portions of these H-N-H phage endonucleases contain a conserved sequence motif, the intron-encoded endonuclease repeat motif (IENR1), that has also been found in endonucleases of the GIY-YIG family and that likely comprises a small DNA-binding module with a globular ββααβ fold, suggestive of module shuffling between different homing endonuclease families.

INTRODUCTION

Group I introns are found in protein- and RNA-coding genes of a diverse set of organisms. In eukaryotes, group I introns exist in nuclear rRNA genes of fungi and ciliates and in organellar genes of plants, fungi and flagellates (1). They have also been found in cyanobacteria, proteobacteria, Gram-positive bacteria and bacteriophages (2). A large number of group I introns encode site-specific DNA endonucleases (1). These intron-encoded endonucleases (homing endonucleases) function primarily in a biological process whereby a group I intron is inserted into an intronless version of a gene, at the same position it already occupies in the intron-containing homolog, by a duplicative and unidirectional transfer.

Homing endonucleases belong to four families based on the presence of well-conserved sequence motifs that are denoted LAGLIDADG, His-Cys box, GIY-YIG and H-N-H, respectively (3). Most homing endonucleases bind DNA in a sequence-specific fashion, recognizing large stretches of the intron-minus version of the gene in which they are inserted. In general, the recognition site comprises sequences of both exons and the endonucleases generate a double-strand break close to the intron insertion site (4).

Two of the families, the GIY-YIG and H-N-H homing endonucleases, have been found in group I introns, and also as free-standing open reading frames, in bacteria and their phages (2,5,6). The coliphage T4 td intron-encoded endonuclease I-TevI is the best-studied member of the GIY-YIG family. It is a bipartite enzyme with distinct catalytic and DNA-binding domains (7–9) that recognizes a 37-bp region in a sequence-tolerant manner (10). Members of the H-N-H family are less well understood, as they possess biochemical activities usually not associated with other homing endonucleases. For instance, I-TevIII of phage RB3 makes a double-strand break similar to other endonucleases but is the only homing endonuclease that generates a 5′ overhang instead of the typical 3′ extension (11). The H-N-H endonucleases I-HmuI and I-HmuII, encoded by group I introns in the DNA polymerase genes of Bacillus phages SPO1 and SP82, respectively, are distinct from other intron-encoded endonucleases in that they cleave intron-plus as well as intron-minus alleles, and cut only one strand of their DNA substrate (12). Another example of a nicking H-N-H endonuclease, I-TwoI, is encoded by a group I intron in the ribonucleotide reductase gene of staphylococcal phage Twort (13).

In this work we describe a group I intron in the DNA polymerase gene of Bacillus thuringiensis phage Bastille.
Like the SPO1 and SP82 introns, the Bastille intron encodes a nicking DNA endonuclease and its cleavage site is identical to that of the SPO1-encoded enzyme I-HmuI. However, in contrast to I-HmuI, I-BasI has DNA substrate specificity for intron-minus alleles, which is characteristic of a typical homing endonuclease. Sequence comparisons of the C-terminal portion of phage H-N-H endonucleases against the protein database indicate the presence of a conserved sequence motif that has also been found in endonucleases of the GIY-YIG family, likely forming a small DNA-binding module with a globular ββααβ fold.

*To whom correspondence should be addressed. Tel: +1 518 442 4324; Fax: +1 518 442 4767
Present address: Markus Landthaler, Rockefeller University, New York, NY, USA

Nucleic Acids Research, Vol. 31 No. 12 © Oxford University Press 2003; all rights reserved

MATERIALS AND METHODS

Bacterial and bacteriophage strains

Phage Bastille (HER211) and its host B.thuringiensis (HER1211) were obtained from the Felix d'Herelle Reference Center. Escherichia coli XL-1 Blue (Stratagene) was used as the recipient strain for high-frequency plasmid electroporation. Escherichia coli BL21(DE3) pLysE was used as the bacterial host for protein expression.

Plasmids

The plasmid (pBET) for over-expression of I-BasI was generated by PCR amplification of Bastille DNA using primers BIorf-U2 and BIorf-D. The PCR product was digested with NdeI and ligated into the NdeI site of pAii17 (14).

Expression and preparation of I-BasI protein extracts

Cells were grown in LB supplemented with ampicillin (50 µg/ml) at 37°C to A₆₀₀ = 0.6 and expression was induced by addition of IPTG to a final concentration of 1 mM. Incubation was continued at 37°C for 3 h. Cells were harvested by centrifugation at 6000 g for 20 min, and resuspended in ice-cold 50 mM Tris–HCl (pH 7.2), 1 mM EDTA, 1 mM PMSF, 2 µg/ml leupeptin, 200 mM KCl at a concentration of 6 ml/g cells. The resuspended cells were sonicated to complete lysis and centrifuged at 12 000 g for 1 h. The protein was present in the pellet fraction. The pellet was washed with chilled deionized H₂O, resuspended in 1 ml/g (cells) 6 M guanidine hydrochloride, renatured by dialyzing twice against 100× vol of 50 mM potassium phosphate (pH 7.2), 100 mM NaCl, 1 mM DTT and stored in 10% glycerol at −80°C.

Endonuclease assay with extracts from cells expressing I-BasI

PCR-generated DNA fragments, radiolabeled at their 5′ termini, were used as substrates. The intron-minus Bastille DNA substrate was generated by reverse transcription of Bastille RNA isolated 10 min post-infection with primer S2373 and subsequent PCR amplification with primers S2325 and S2373. The intron-plus Bastille DNA substrate was PCR-generated by amplification of Bastille DNA with primers S2327 and S2373. The intron-minus SPO1 DNA substrate was generated by PCR amplification of plasmid pHGO1ΔI with primers 1 and 2 of Goodrich-Blair and Shub (12). Labeled PCR products were purified with QIAquick spin PCR purification. Reactions were performed with 5 × 10³ to 3 × 10⁴ c.p.m. of labeled DNA substrate in a 5 µl reaction volume in 50 mM Tris–HCl (pH 7.9), 10 mM MgCl₂, 100 mM NaCl, 1 mM DTT with 2 µl protein extract. The reactions were allowed to proceed for 10 min at 30°C and terminated by the addition of 2 µl 95% formamide, 20 mM EDTA, 0.05% bromphenol blue and 0.05% xylene cyanol. Reaction products were separated on a 5% denaturing polyacrylamide gel and visualized by autoradiography.

Isolation of Bastille RNA

Bacillus thuringiensis cells were grown at 37°C to an OD₅₄₀ of 0.4 (~5 × 10⁷ cells/ml) and infected with ~5 phages/cell. Cells (25 ml) were harvested by centrifugation at 5000 g for 10 min at 4°C and washed twice with 10 mM Tris–HCl (pH 7.5 at 4°C), 100 µg/ml chloramphenicol. Cells were resuspended in 100 µl 10 mM Tris–HCl (pH 8.0), 1 mM EDTA, 50 mg/ml lysozyme (Sigma) and incubated for 5 min at room temperature. RNA was isolated with the RNeasy Kit (QIAGEN) as explained in the manufacturer's protocol for isolation of total RNA from bacteria. In vitro labeling of RNA with [α-³²P]GTP was performed according to Reinhold-Hurek and Shub (15).

Isolation of Bastille phage DNA

Bacillus thuringiensis cells were grown to an OD₅₄₀ of ~0.4, infected at a multiplicity of ~0.1 phage/cell, and incubation was continued until lysis was complete. Phages were precipitated in 10% PEG 8000 and phage particles were further purified by centrifugation through a CsCl step gradient according to the bacteriophage λ purification protocol in Sambrook et al. (16). After removal of CsCl by dialysis against 50 mM Tris–HCl (pH 8.0) and 1 mM EDTA, the phage DNA was extracted with phenol.

Bastille DNA Southern hybridization

Restriction enzyme digests of Bastille DNA were separated on a 1% agarose gel and vacuum blotted onto a positively charged nylon membrane (Hybond-N+, Amersham) by alkaline transfer. The Bastille DNA blot was probed with radiolabeled phage Twort orf142 intron DNA (17). The probe was generated by random primed labeling of a 992-bp PCR product, derived from amplification of Twort DNA using primers 4 and 5 from Landthaler and Shub (17). Hybridization was carried out in 6× SSC, 5× Denhardt's solution, 0.1% SDS, 200 µg/ml herring sperm DNA. Four washes at 50°C in 2× SSC, 0.1% SDS were followed by a single wash in 0.2× SSC, 0.1% SDS at 50°C.
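The stepwise drop in SSC concentration during hybridization and washing above is what raises stringency at a fixed 50°C. The short sketch below makes the salt steps concrete. It assumes the common 20× SSC stock recipe (3 M NaCl, 0.3 M trisodium citrate), which is not stated in the text, so the absolute numbers are illustrative only.

```python
# Assumed composition of 1x SSC, derived from the common 20x stock
# (3 M NaCl, 0.3 M trisodium citrate). The recipe is an assumption,
# not something specified in the protocol text.
NACL_MM_PER_X = 150.0     # mM NaCl in 1x SSC (assumed)
CITRATE_MM_PER_X = 15.0   # mM trisodium citrate in 1x SSC (assumed)


def sodium_mm(ssc_fold: float) -> float:
    """Approximate total Na+ (mM) contributed by SSC at the given fold concentration."""
    per_x = NACL_MM_PER_X + 3 * CITRATE_MM_PER_X  # three Na+ per citrate
    return round(ssc_fold * per_x, 1)


# SSC concentrations taken from the Southern hybridization protocol.
steps = [("hybridization", 6.0), ("washes 1-4", 2.0), ("final wash", 0.2)]
table = {name: sodium_mm(fold) for name, fold in steps}

for name, na in table.items():
    print(f"{name:>14}: {na:7.1f} mM Na+")
```

Under these assumptions the final wash contains roughly a tenth of the sodium of the preceding washes, which is why it removes weakly matched probe while the temperature stays constant.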
Mapping of I-BasI cleavage sites

Intron-minus Bastille and SPO1 DNA substrates were generated as described above with ³²P-labeled bottom strand primers. Labeled PCR products were incubated with 1/10 vol of I-BasI protein extract in 50 mM Tris–HCl (pH 7.9), 10 mM MgCl₂, 100 mM NaCl, 1 mM DTT, phenol extracted and separated on a denaturing polyacrylamide gel. A sequence ladder was generated by a cycle sequencing reaction with the end-labeled primer used to make the DNA substrate.

Oligonucleotides

BIorf-U2, 5′-TGGAGGTACCATATGTTTCAAGAAGAG (868–894); BIorf-D, 5′-TGTGGTGCATATGTTATTTTTTACTTAC (complement: 1432–1459); S2325, 5′-GAGAATTACCCAGAACA (510–526); S2327, 5′-TACACCACATTACTAGA (1451–1467); S2373, 5′-CTAATGCCATTACACGGGAC (complement: 1668–1687). The underlined sequences indicate introduced NdeI restriction sites.

RESULTS

The discovery of a self-splicing group I intron in the thymidylate synthase gene of Bacillus phage β22 (18) prompted us to screen other, morphologically identical phages for the presence of group I introns by in vitro labeling of RNA from infected cells with [α-³²P]GTP. Since the 3′ OH of GTP is the nucleophile in the first transesterification step of group I splicing, the GTP will be added at the 5′ end of the excised introns. Using this method, we have described previously the presence of at least five group I introns in the Staphylococcus phage Twort (13,17). Another phage included in this screen was Bastille, infecting B.thuringiensis. GTP-labeling of RNA, isolated from B.thuringiensis 10 or 20 min after infection with Bastille, resulted in an end-labeled RNA product of ~850 nt (Fig. 1), indicating the presence of a self-splicing group I intron in this phage genome.

Figure 1. GTP-labeling of Bastille RNA. Five micrograms of RNA, isolated from B.thuringiensis before (0) and at times indicated after infection with Bastille, was deproteinized and incubated with [α-³²P]GTP under self-splicing conditions. Labeled RNA was separated by electrophoresis on a 4% acrylamide–8 M urea gel. Sizes of a ³²P-labeled HaeIII digest of ΦX174 DNA (M) are indicated to the left. Size of the GTP-labeled RNA is indicated on the right.

In an attempt to clone the Bastille intron, a DNA fragment containing the three Twort orf142 introns (17) was used as a probe for Southern hybridization of restriction enzyme digested Bastille DNA. Two hybridizing fragments (0.7 and 1.0 kb), identified from NsiI-digested DNA, were isolated from a genomic plasmid mini-library and sequenced on both strands. The junction at the NsiI site was confirmed by sequencing this region on the I-BasI expression plasmid, pBET. Analysis of the sequence (GenBank accession no. AY256517) revealed the presence of a group I intron 853 nt in length (Supplementary Material, Fig. S1). A BLAST search of the protein database (19) showed that the interrupted coding sequence was highly similar to DNA polymerase genes of Bacillus phages SPO1 and SP82, which are also interrupted by self-splicing group I introns. Amino acid sequence alignment of the DNA polymerases of these phages further showed that the introns in Bastille, SPO1 and SP82 are inserted at the homologous site (Fig. 2). Amino acids at the site of intron insertion, corresponding to a region of the palm subdomain that interacts with the template strand (20), are highly conserved in enzymes related to DNA polymerase I.

Figure 2. DNA polymerase I amino acid sequence alignment. The alignment was generated with CLUSTAL W 1.8 (34). GenBank accession nos as follows: Bastille (AAO93094), SPO1 (P30314), SP82 (S53691), E.coli (P00582), B.subtilis (Z99188). Conserved residues are shaded, using BOXSHADE with default parameters (http://www.ch.embnet.org/software/BOX_form.html). Numbers in brackets indicate numbers of residues in front of and following the aligned sequence. For SP82 and Bastille DNA polymerase genes, only partial sequence information was available. Downward-pointing arrow indicates intron insertion site.

Intron sequence and structure

Figure 3 shows the putative secondary structure of the Bastille intron, which follows the group I consensus pairings P1 through P9, with dispensable pairing P2 missing (21). Based on the presence of extra nucleotides between P3 and P7, and conserved nucleotides in and adjacent to P7, the intron belongs to subgroup IA (22), within which most bacteriophage members comprise a distinct subgroup, IA2. Despite being inserted at a homologous site, the Bastille intron has significant differences in its secondary structure from the intron in SPO1 (and SP82). The Bastille intron lacks secondary structures P3.1 and P3.2, non-conserved structure elements that are inserted in the SPO1/SP82 introns between P3 and P4. The Bastille intron has only a single pairing element (P7.1) between P7 and P3, characteristic of the IA1 subfamily, rather than two elements (P7.1 and P7.2) that are typical of most phage introns (22). Finally, the Bastille intron has, typical for phage introns, a 7 bp P9 (terminating in a tetraloop) and a stem–loop P9.1, whereas the SPO1/SP82 introns have a single P9 with an elongated base-paired stem.

The Bastille intron encodes a nicking DNA endonuclease

Like most other phage introns, the Bastille intron has extra nucleotides inserted into the terminal loop of a conserved secondary structure element. In this case an open reading frame (ORF) of 188 codons (Supplementary Material, Fig. S1, residues 880–1443), which is preceded by a ribosome-binding site, is inserted into the loop of P8.
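The coordinates quoted for the primers and for the intronic ORF can be cross-checked with a few lines of arithmetic. The sketch below is a reader-side sanity check, not part of the published analysis. It takes the primer sequences and positions exactly as listed under Oligonucleotides, and assumes (as a simplification) that each primer maps base-for-base onto the Fig. S1 numbering even where mismatches were introduced to create the NdeI site. The helper names are hypothetical.

```python
NDEI_SITE = "CATATG"  # NdeI recognition sequence

# Primer name -> (sequence 5'->3', first coordinate, last coordinate),
# copied from the Oligonucleotides list.
PRIMERS = {
    "BIorf-U2": ("TGGAGGTACCATATGTTTCAAGAAGAG", 868, 894),
    "BIorf-D": ("TGTGGTGCATATGTTATTTTTTACTTAC", 1432, 1459),
    "S2325": ("GAGAATTACCCAGAACA", 510, 526),
    "S2327": ("TACACCACATTACTAGA", 1451, 1467),
    "S2373": ("CTAATGCCATTACACGGGAC", 1668, 1687),
}


def span_length(start: int, end: int) -> int:
    """Number of positions in an inclusive coordinate range."""
    return end - start + 1


def check_primers(primers: dict) -> dict:
    """Compare each primer's length with its coordinate span and locate any NdeI site."""
    report = {}
    for name, (seq, start, end) in primers.items():
        report[name] = {
            "length_matches": len(seq) == span_length(start, end),
            "ndei_offset": seq.find(NDEI_SITE),  # -1 when no site is present
        }
    return report


def codon_count(start: int, end: int) -> int:
    """Codons in an inclusive range; raises if the range is not a multiple of three."""
    length = span_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"range {start}-{end} is not a whole number of codons")
    return length // 3


report = check_primers(PRIMERS)
orf_codons = codon_count(880, 1443)

# The ATG inside CATATG starts 3 nt into the site; for the top-strand primer
# BIorf-U2 this should coincide with the first ORF position.
u2_start = PRIMERS["BIorf-U2"][1]
atg_position = u2_start + report["BIorf-U2"]["ndei_offset"] + 3

for name, row in report.items():
    print(name, row)
print("ORF codons:", orf_codons, "| ATG position from BIorf-U2:", atg_position)
```

Under the stated assumption, residues 880–1443 correspond to 188 codons, and the ATG embedded in the NdeI site of BIorf-U2 falls on position 880, the first residue of the ORF.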
The loop of P8 is the same location where the SPO1 and SP82 introns encode endonucleases I-HmuI and I-HmuII, respectively. In BLASTP searches (19) the Bastille intronic ORF was most similar to I-HmuI (E = 3e−24), I-HmuII (E = 4e−17) and several other phage-encoded proteins with the conserved H-N-H homing endonuclease motif. An alignment of the amino acid sequences is shown in Figure 4. A reverse position-specific BLAST (19) further indicated similarity to two conserved domains in the NCBI Conserved Domain Database (CDD) (23): the H-N-H motif (pfam01844) and, interestingly, a domain described as intron-encoded nuclease repeat motif (IENR1: smart00497), possibly with a helix–turn–helix motif (HTH). In addition to H-N-H proteins, several GIY-YIG endonucleases have been identified as containing the conserved IENR1 domain. To further investigate the distribution of the conserved domain, protein database searches were employed with a motif search tool (MAST) (24) using an alignment comprising the IENR1 of phage H-N-H proteins shown in Figure 4. Among proteins containing the IENR1 motif, a weak similarity (E = 1.8) was identified to the C-terminal portion of I-BmoI, a group I intron-encoded homing endonuclease of the GIY-YIG family encoded by the thymidylate synthase intron in Bacillus mojavensis (25).

Figure 3. Secondary structures of Bastille (left) and SPO1 (right) intron. Exon sequences are in lower case and intron sequences in upper case letters. Arrows indicate 5′ and 3′ splice sites (ss). Conserved structural elements P1 through P9 are shown. Complementarity to the 3′ end of 16S ribosomal RNA, and start and stop codons of intronic ORFs are boxed. Numbering for Bastille intron corresponds to Figure S1 (GenBank accession no. AY256517) and for SPO1 intron to accession no. M37686.

Figure 4. Amino acid sequence alignment of I-BasI with related phage H-N-H DNA endonucleases and C-termini of GIY-YIG endonucleases. Alignment was generated with CLUSTAL W 1.8 (34) using amino acid sequences with GenBank accession nos as follows: I-BasI (AAO93095), I-HmuI (AAA64536), I-HmuII (AAA56884), phage phi31 (orf6: CAC04164), phage bL170 (e11: AAC27227), phage r1t (orf41: AAB18716), phage β22 (L31962:935–1126), I-BmoI (AAK09365), I-TevI (P13299). IENR1 consensus sequence was obtained from the NCBI conserved domain database, CDD (23). Conserved residues are shaded, using BOXSHADE with default parameters (http://www.ch.embnet.org/software/BOX_form.html). Numbers in brackets indicate numbers of residues preceding the aligned sequence. Letters below the alignment indicate HNH motif. Protein secondary structure assignment of I-TevI sequence (TEV2D) and predicted structure of IENR1 (IENR2D) (35) are shown (H = helix, E = strand).

To address the question of whether the Bastille intron ORF encodes a functional endonuclease, the protein was expressed in E.coli for DNA endonuclease assays. Intron-encoded DNA endonucleases, in general, recognize and cleave the intronless version of their cognate genes. Thus, a DNA fragment, differentially end-labeled on either strand and containing part of the intron-minus Bastille DNA polymerase gene around the intron insertion site, was used as a substrate. Figure 5 shows that protein from cells expressing the intronic ORF has endonucleolytic activity, cleaving the intronless Bastille DNA polymerase gene. No activity was detected using protein derived from cells harboring the expression plasmid without insert. Based on this activity the intron-encoded ORF was designated I-BasI. Like I-HmuI and I-HmuII, I-BasI is a strand-specific endonuclease that introduces a nick in the template strand of the target DNA. No cleavage of the coding strand was detected under conditions in which the template strand was completely turned into product. Precise mapping of the cleavage sites on the Bastille and SPO1 substrates places the sites on the template strand 3 nt downstream of the intron insertion site (Fig. 6). Interestingly, I-BasI and I-HmuI have identical cleavage sites, with I-HmuI also nicking the coding strand 3 nt downstream of the intron insertion site [this places the cleavage site 1 nt further 3′ than reported previously in (12)]. The fact that both enzymes cleave at the same position suggests that they bind homologous stretches of their respective DNA polymerase genes.

Figure 5. DNA endonuclease assay. Differentially end-labeled PCR products (either top or bottom strand, indicated by asterisk) were generated by amplification of the intron-minus and intron-plus Bastille, and the intron-minus SPO1 DNA polymerase gene as indicated. The labeled PCR products were incubated with protein from cells expressing the Bastille intronic ORF (2), from cells harboring pAii17 (1) and no added protein (0). Reaction products were separated in denaturing polyacrylamide gels.

Since I-HmuI also cleaves the intron-plus gene (12), a labeled DNA fragment containing the intron–exon II boundary was incubated with I-BasI (Fig. 5). However, unlike I-HmuI, I-BasI only cleaved the intronless gene. In a separate experiment (not shown) I-BasI was unable to cleave internally labeled PCR products that spanned the 5′ and 3′ splice sites, respectively. Surprisingly, while unable to nick the intron-containing Bastille sequence, I-BasI was able to cleave the intron-minus SPO1 gene despite considerable differences in the nucleotide sequence surrounding the cleavage site (Fig. 6C).

Figure 6. I-BasI and I-HmuI cleavage site mapping. PCR products, end-labeled on the template strand, were generated from Bastille and SPO1 intronless DNA polymerase genes. DNA was incubated with (A) protein from cells expressing I-BasI (2), from cells harboring pAii17 (1), and no added protein (0) or (B) purified I-HmuI (H) (M. Landthaler and D. A. Shub, unpublished). Sequencing ladders generated with the same end-labeled primers used in the PCR reactions are resolved next to the cleavage products. (C) Nucleotide sequence alignments near the cleavage site. Template strands of intronless Bastille and SPO1 DNA polymerase genes (E1-E2) and intron-plus Bastille gene (I-E2) are aligned around the 3′ splice site. Identities to the Bastille intronless sequence are indicated by dots. Intron insertion site (IIS) and cleavage site (CS) are indicated by arrows.

DISCUSSION

This work describes a self-splicing group I intron and intronic DNA endonuclease in the genome of B.thuringiensis phage Bastille. The intron is inserted in the DNA polymerase gene at the homologous position as the introns in SPO1, SP82 and Φe, all closely related phages infecting B.subtilis and containing the modified base hydroxymethyluracil (HMU) in place of thymine in their DNA (26). Bastille is morphologically distinct from the HMU phages (27) and its DNA is not extensively modified (our unpublished observation), suggesting that these Bacillus phages are unrelated. Like other group I introns (2,13), the insertion site is located in a highly conserved region of functional importance within the coding sequence (Fig. 2).

All four DNA polymerase introns have insertions in the loop of pairing element P8 that encode DNA endonucleases of the H-N-H family. The Bastille DNA polymerase and its intron-encoded endonuclease, I-BasI, are most similar to the SPO1 DNA polymerase and intron endonuclease I-HmuI, respectively, as indicated by being each other's best match in the protein database. However, the structurally conserved portions of the introns differ substantially in sequence and secondary structure, arguing that the intron sequences do not share a recent common ancestor. Interestingly, the closest matches to the Bastille intron in a BLASTN search are the introns in Staphylococcus phage Twort (13,17).

A BLASTP search of the protein database using the I-BasI amino acid sequence revealed, in addition to I-HmuI, similarity to the intron-encoded endonuclease I-HmuII and several additional intronic and free-standing phage ORFs. The similarity among these proteins is most pronounced in the N-terminal part, whereas the C-termini have diverged considerably. The high conservation of the N-terminal region, which includes the H-N-H motif (which has been suggested to form the active site of these endonucleases), led to the proposal that these phage endonucleases have a two-domain structure (13,25) analogous to that of I-TevI, with N-terminal catalytic and C-terminal DNA-binding domains (7). Database searches with the I-BasI sequence further indicated a similarity of the proposed C-terminal DNA-binding domain to a 53 amino acid consensus sequence described as IENR1 with a potential HTH (23). Surprisingly, more rigorous database searches identified a sequence similarity of the IENR1 motif to the C-terminal DNA-binding domain of B.mojavensis thyA intronic DNA endonuclease I-BmoI (25), which is highly similar to the structurally well-characterized phage endonuclease I-TevI (8,9). Sequence comparison of the C-termini of I-BmoI and I-TevI with the phage H-N-H endonucleases suggests that the IENR1 motif would make up a HTH subdomain with globular ββααβ fold, which is consistent with the secondary structure prediction of the IENR1 consensus sequence (Fig. 4).

In the crystal structure of the I-TevI C-terminus bound to DNA, the ββααβ fold represents a small globular DNA-binding module. Unlike traditional HTH domains, the HTH DNA-binding module of I-TevI contacts the major groove via phosphate backbone and hydrophobic interactions rather than base-specific contacts (8). The globular fold and the potential versatility in contacting DNA sequences in a site-specific and sequence-tolerant manner suggest that the IENR1 motif is well suited to be fused to a catalytic cartridge like the H-N-H motif.

Similarity between proteins of the GIY-YIG and H-N-H families has been noted previously. The phage T4 protein MobD, a putative mobile element with an H-N-H motif at its N-terminus (28), can be aligned with another T4 protein, SegF, a member of the GIY-YIG family, over about 90 amino acids at their C-termini (29). The relatedness of H-N-H and GIY-YIG endonucleases in distinct sequence elements indicates the modular nature of these proteins. The shuffling of modules of protein structure, for example in the DNA-binding domain, could provide a mode for homing endonucleases to acquire new DNA substrate specificities.

While shuffling of DNA-binding modules is one potential pathway to generating new DNA recognition specificity, the phage intron-encoded endonucleases I-BasI, I-HmuI and I-HmuII provide an interesting example of how highly similar endonucleases have obtained distinct DNA substrate specificities without obvious domain shuffling. I-HmuI and I-HmuII have been shown previously to generate a nick in both intron-containing and intronless DNA, an activity and substrate specificity until then unseen for homing endonucleases, with each enzyme showing a preference for the DNA polymerase gene of the other phage (12). I-HmuII cleaves a site unrelated to the I-HmuI and I-BasI cleavage sites, 52 bp downstream of the intron insertion site, in a region of sequence heterogeneity with SPO1 DNA (12). In contrast to the unusual properties of these enzymes, I-BasI has the DNA substrate specificity of typical homing endonucleases. It cleaves the intronless Bastille DNA polymerase gene and, rather than cleaving the cognate intron-containing gene, it nicks the intronless SPO1 sequence despite nine differences in the 25-bp binding region (Fig. 4; M. Landthaler and D. A. Shub, unpublished data). Tolerance for cleaving homing sites with sequence variations has been described for several intron-encoded endonucleases: I-TevI (10), I-CreI (30,31), I-SceI (32) and I-PpoI (31,33).

The differences in DNA target specificities of I-BasI, I-HmuI and I-HmuII are likely a consequence of diverse evolutionary constraints. Goodrich-Blair and Shub (12) proposed that the unusual specificities of I-HmuI and I-HmuII might have arisen due to selective pressure to promote intron replacement rather than simple homing, due perhaps to a dearth of unoccupied homing sites. The existence of the closely related I-BasI, with properties expected for a typical homing endonuclease, may provide an interesting model to study how highly similar intronic endonucleases evolved to recognize and cleave distinct DNA substrates.
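Statements such as "nine differences in the 25-bp binding region" rest on a position-by-position comparison of pre-aligned target sites, displayed with identities shown as dots. A minimal sketch of that comparison follows. The two 25-nt sequences are invented placeholders for illustration only and are NOT the Bastille or SPO1 target sites, and the function names are hypothetical.

```python
def dot_alignment(reference: str, other: str) -> str:
    """Render `other` against `reference`, writing '.' wherever the bases are identical."""
    if len(reference) != len(other):
        raise ValueError("sequences must be pre-aligned to equal length")
    return "".join("." if r == o else o for r, o in zip(reference, other))


def count_differences(reference: str, other: str) -> int:
    """Number of aligned positions at which the two sequences differ."""
    if len(reference) != len(other):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(r != o for r, o in zip(reference, other))


# Hypothetical 25-nt sequences, NOT taken from the paper.
reference_site = "ACGTACGTACGTACGTACGTACGTA"
variant_site = "ACGAACGTTCGTACCTACGTAGGTA"

dots = dot_alignment(reference_site, variant_site)
diffs = count_differences(reference_site, variant_site)

print(reference_site)
print(dots)
print(f"{diffs} differences in {len(reference_site)} positions")
```

With real target sequences substituted for the placeholders, the same two functions would give the dot display and the difference count for any window around the cleavage site.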


SUPPLEMENTARY MATERIAL

Supplementary Material is available at NAR Online.

ACKNOWLEDGEMENTS

We thank David Edgell and Rick Bonocora for critical reading of the manuscript. This work was supported by grant GM37746 from the National Institutes of Health.

REFERENCES

1. Lambowitz,A.M. and Belfort,M. (1993) Introns as mobile genetic elements. Annu. Rev. Biochem., 62, 587–622.
2. Edgell,D.R., Belfort,M. and Shub,D.A. (2000) Barriers to intron promiscuity in bacteria. J. Bacteriol., 182, 5281–5289.
3. Belfort,M. and Roberts,R.J. (1997) Homing endonucleases: keeping the house in order. Nucleic Acids Res., 25, 3379–3388.
4. Chevalier,B.S. and Stoddard,B.L. (2001) Homing endonucleases: structural and functional insight into the catalysts of intron/intein mobility. Nucleic Acids Res., 29, 3757–3774.
5. Dalgaard,J.Z., Klar,A.J., Moser,M.J., Holley,W.R., Chatterjee,A. and Mian,I.S. (1997) Statistical modeling and analysis of the LAGLIDADG family of site-specific endonucleases and identification of an intein that encodes a site-specific endonuclease of the HNH family. Nucleic Acids Res., 25, 4626–4638.
6. Kowalski,J.C., Belfort,M., Stapleton,M.A., Holpert,M., Dansereau,J.T., Pietrokovski,S., Baxter,S.M. and Derbyshire,V. (1999) Configuration of the catalytic GIY-YIG domain of intron endonuclease I-TevI: coincidence of computational and molecular findings. Nucleic Acids Res., 27, 2115–2125.
7. Derbyshire,V., Kowalski,J.C., Dansereau,J.T., Hauer,C.R. and Belfort,M. (1997) Two-domain structure of the td intron-encoded endonuclease I-TevI correlates with the two-domain configuration of the homing site. J. Mol. Biol., 265, 494–506.
8. VanRoey,P., Waddling,C.A., Fox,K.M., Belfort,M. and Derbyshire,V. (2001) Intertwined structure of the DNA-binding domain of intron endonuclease I-TevI with its substrate. EMBO J., 20, 3631–3677.
9. VanRoey,P., Meehan,L., Kowalski,J.C., Belfort,M. and Derbyshire,V. (2002) Catalytic domain structure and hypothesis for function of GIY-YIG intron endonuclease I-TevI. Nature Struct. Biol., 9, 806–811.
10. Bryk,M., Quirk,S.M., Mueller,J.E., Loizos,N., Lawrence,C. and Belfort,M. (1993) The td intron endonuclease I-TevI makes extensive sequence-tolerant contacts across the minor groove of its DNA target. EMBO J., 12, 4040–4041.
11. Eddy,S.R. and Gold,L. (1991) The phage T4 nrdB intron: a deletion mutant of a version found in the wild. Genes Dev., 5, 1032–1041.
12. Goodrich-Blair,H. and Shub,D.A. (1996) Beyond homing: competition between intron endonucleases confers a selective advantage on flanking genetic markers. Cell, 84, 211–221.
13. Landthaler,M., Begley,U., Lau,N.C. and Shub,D.A. (2002) Two self-splicing group I introns in the ribonucleotide reductase large subunit gene of Staphylococcus aureus phage Twort. Nucleic Acids Res., 30, 1935–1943.
14. Perler,F.B., Comb,D.G., Jack,W.E., Moran,L.S., Qiang,B., Kucera,R.B., Benner,J., Slatko,B.E., Nwankwo,P.O., Hepstead,S.K., Carlow,C.K.S. and Jannasch,H. (1992) Intervening sequences in an archaea DNA polymerase gene. Proc. Natl Acad. Sci. USA, 89, 5577–5581.
15. Reinhold-Hurek,B. and Shub,D.A. (1993) Experimental approaches for detecting self-splicing group I introns. In Zimmer,E.A., White,T.J., Cann,R.L. and Wilson,A.C. (eds), Methods Enzymology. Academic Press, San Diego, CA, Vol. 224, pp. 491–502.
16. Sambrook,J., Fritsch,E.F. and Maniatis,T. (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
17. Landthaler,M. and Shub,D.A. (1999) Unexpected abundance of self-splicing introns in the genome of bacteriophage Twort: introns in multiple genes, a single gene with three introns and exon skipping by group I ribozymes. Proc. Natl Acad. Sci. USA, 96, 7005–7010.
18. Bechhofer,D.H., Hue,K.K. and Shub,D.A. (1994) An intron in the thymidylate synthase gene of Bacillus bacteriophage β22: evidence for independent evolution of a gene, its group I intron and the intron open reading frame. Proc. Natl Acad. Sci. USA, 91, 11669–11673.
19. Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
20. Patel,P.H., Suzuki,M., Adman,E., Shinkai,A. and Loeb,L.A. (2001) Prokaryotic DNA polymerase I: evolution, structure and 'base flipping' mechanism for nucleotide selection. J. Mol. Biol., 308, 823–837.
21. Cech,T.R., Damberger,S.H. and Gutell,R.R. (1994) Representation of the secondary and tertiary structure of group I introns. Nature Struct. Biol., 1, 273–280.
22. Michel,F. and Westhof,E. (1990) Modeling of the three-dimensional architecture of group I catalytic introns based on comparative sequence analysis. J. Mol. Biol., 216, 585–610.
23. Marchler-Bauer,A., Anderson,J.B., DeWeese-Scott,C., Fedorova,N.D., Geer,L.Y., He,S., Hurwitz,D.I., Jackson,J.D., Jacobs,A.R., Lanczycki,C.J., Liebert,C.A., Liu,C., Madej,T., Marchler,G.H., Mazumder,R., Nikolskaya,A.N., Panchenko,A.R., Rao,B.S., Shoemaker,B.A., Simonyan,V., Song,J.S., Thiessen,P.A., Vasudevan,S., Wang,Y., Yamashita,R.A., Yin,J.J. and Bryant,S.H. (2003) CDD: a curated Entrez database of conserved domain alignments. Nucleic Acids Res., 31, 381–387.
24. Bailey,T.L. and Gribskov,M. (1998) Combining evidence using p-values: application to sequence homology searches. Bioinformatics, 14, 48–54.
25. Edgell,D.R. and Shub,D.A. (2001) Related homing endonucleases I-BmoI and I-TevI use different strategies to cleave homologous recognition sites. Proc. Natl Acad. Sci. USA, 98, 7898–7903.
26. Goodrich-Blair,H. and Shub,D.A. (1994) The DNA polymerase genes of several HMU-bacteriophages have similar group I introns with highly divergent open reading frames. Nucleic Acids Res., 22, 3715–3721.
27. Ackermann,H.-W. and DuBow,M.S. (1987) Viruses of Prokaryotes. Volume II: Natural Groups of Bacteriophages. CRC Press, Boca Raton, FL.
28. Kutter,E., Gachechiladze,K., Poglazov,A., Marusich,E., Shneider,M., Aronsson,P., Napuli,A., Porter,D. and Mesyanzhinov,V. (1995) Evolution of T4-related phages. Virus Genes, 11, 285–297.
29. Mosig,G., Colowick,N.E. and Pietz,B.C. (1998) Several new bacteriophage T4 genes, mapped by sequencing deletion endpoints between genes 56 (dCTPase) and dda (a DNA-dependent ATPase-helicase) modulate transcription. Gene, 223, 143–155.
30. Durrenberger,F. and Rochaix,J.D. (1993) Characterization of the cleavage site and the recognition sequence of the I-CreI DNA endonuclease encoded by the chloroplast ribosomal intron of Chlamydomonas reinhardtii. Mol. Gen. Genet., 236, 409–414.
31. Argast,G.M., Stephens,K.M., Emond,M.J. and Monnat,R.J.,Jr (1998) I-PpoI and I-CreI homing site sequence degeneracy determined by random mutagenesis and sequential in vitro enrichment. J. Mol. Biol., 280, 345–353.
32. Colleaux,L., D'Auriol,L., Galibert,F. and Dujon,B. (1988) Recognition and cleavage site of the intron-encoded omega transposase. Proc. Natl Acad. Sci. USA, 85, 6022–6026.
33. Wittmayer,P.K., McKenzie,J.L. and Raines,R.T. (1998) Degenerate DNA recognition by I-PpoI endonuclease. Gene, 206, 11–21.
34. Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673–4680.
35. Jones,D.T. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol., 292, 195–202.