© 2000 Oxford University Press

Nucleic Acids Research, 2000, Vol. 28, No. 17 3361–3369

Requirements for double-strand cleavage by chimeric restriction enzymes with zinc finger DNA-recognition domains

Jeff Smith1,2, Marina Bibikova3, Frank G. Whitby3, A. R. Reddy1,4, Srinivasan Chandrasegaran1 and Dana Carroll3,*

1Department of Environmental Health Sciences, The Johns Hopkins University School of Hygiene and Public Health, 615 North Wolfe Street, Baltimore, MD 21205, USA, 2Department of Biophysics and Biophysical Chemistry, The Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA, 3Department of Biochemistry, University of Utah School of Medicine, 50 North Medical Drive, Salt Lake City, UT 84132, USA and 4Department of Biological Sciences, Pondicherry University, Pondicherry 605014, India

Received April 12, 2000; Revised June 20, 2000; Accepted July 3, 2000

ABSTRACT

This study concerns chimeric restriction enzymes that are hybrids between a zinc finger DNA-binding domain and the non-specific DNA-cleavage domain from the natural restriction enzyme FokI. Because of the flexibility of DNA recognition by zinc fingers, these enzymes are potential tools for cleaving DNA at arbitrarily selected sequences. Efficient double-strand cleavage by the chimeric nucleases requires two binding sites in close proximity. When cuts were mapped on the DNA strands, it was found that they occur in pairs separated by ∼4 bp with a 5′ overhang, as for native FokI. Furthermore, amino acid changes in the dimer interface of the cleavage domain abolished activity. These results reflect a requirement for dimerization of the cleavage domain. The dependence of cleavage efficiency on the distance between two inverted binding sites was determined and both upper and lower limits were defined. Two different zinc finger combinations binding to non-identical sites also supported specific cleavage. Molecular modeling was employed to gain insight into the precise location of the cut sites. These results define requirements for effective targets of chimeric nucleases and will guide the design of novel specificities for directed DNA cleavage in vitro and in vivo.

INTRODUCTION

Site-specific endonucleases are powerful tools for the manipulation of DNA sequences. Naturally occurring restriction enzymes have played a central role in the cloning and mapping of genes since their original isolation roughly three decades ago. Type II enzymes able to specifically cleave more than 140 different sites are now available commercially (1). Despite

their diversity, these endonucleases have limited utility because their recognition sites are rather short (8 bp or less) and their specificity is not easily altered. The class of homing nucleases or meganucleases (2) recognizes longer sequences (∼20 bp), but shares the limitation of having rigid sequence requirements. For some applications it would be desirable to have enzymes that recognize specific sequences with good discrimination, but can also be manipulated to bind new, arbitrarily selected sequences.

We have developed a class of chimeric nucleases based on the linkage of a zinc finger DNA-binding domain to the DNA-cleavage domain (FN) from the Type IIs restriction enzyme FokI (3–6). Similar hybrids link DNA-binding domains from natural and synthetic transcription factors to this or other non-specific cleavage domains (7–10). In these constructs, DNA cleavage is directed to sites recognized by the binding domains, thus proving the feasibility of manipulating the target specificity.

The Cys2His2 zinc fingers are of particular interest in this regard. Each individual finger contacts primarily three consecutive base pairs of DNA in a modular fashion (11,12; Fig. 1). By manipulating the number of fingers and the nature of critical amino acid residues that contact DNA directly, binding domains with novel specificities can be evolved and selected (13–21). In principle, a very broad range of DNA sequences can serve as specific recognition targets for zinc finger proteins. Chimeric nucleases with several different specificities based on zinc finger recognition have already been constructed and characterized (3,6,8,9). In the present work, we examine in more detail the requirements for efficient DNA cleavage by two of these zinc finger–FN chimeras.
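To put the site-length limitation noted above in rough numerical terms, the sketch below estimates how often one specific recognition site of a given length would occur by chance. It assumes a random sequence with equal frequencies of the four bases, which real genomes do not have, and the function name is ours. The 9 bp case corresponds to a three-finger domain (three fingers, 3 bp each) and the 18 bp case to a pair of such sites.

```python
def expected_spacing_bp(site_length: int) -> int:
    """Average distance (bp) between chance occurrences of one specific
    site of the given length, assuming equal frequencies of A, C, G, T."""
    if site_length < 0:
        raise ValueError("site_length must be non-negative")
    return 4 ** site_length

for label, n in [("8 bp restriction site", 8),
                 ("9 bp three-finger site", 9),
                 ("two 9 bp sites (18 bp)", 18)]:
    print(f"{label}: one occurrence per {expected_spacing_bp(n):,} bp")
```

Under these simplifying assumptions the expected spacing grows fourfold with every added base pair, which is why longer, designable recognition sequences are attractive.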
Both Zif-QQR-FN (QQR) (22) and Zif-∆QNK-FN (QNK) (6) have the general structure diagrammed in Figure 1, with the three finger DNA-binding domain at the N-terminus connected by a peptide linker to the nuclease domain at the C-terminus. Because of differences in several key residues in the middle finger, they recognize related, but distinct, sites: 5′-GGG GAA GAA for QQR (22,23) and 5′-GGG GCG GAA for QNK (6,24). Studies of these two chimeric nucleases were pursued in parallel, using similar, but not identical, procedures and substrates. Both enzymes require two copies of the recognition site in close proximity to effect efficient double-strand cleavage, reflecting a requirement for dimerization of the cleavage domain. While natural FokI (25) must also dimerize, the need for neighboring paired binding sites is unique to the chimeric nucleases. A consequence of this requirement is that the chimeric enzymes have very high target specificity, since two designated 9 bp sequences must be bound. The results presented here will guide the future design of chimeric nucleases directed to specific targets. One potential application of these enzymes is site-specific cleavage of DNA in vivo with the goal of evaluating double-strand break repair or stimulating targeted recombination. The latter prospect is addressed in a separate study (Bibikova et al., manuscript in preparation).

Figure 1. Schematic diagram of the chimeric nuclease. The three zinc fingers are shown in ribbon representation (12). The residues that provide the primary specificity-determining interactions with the DNA bases, at positions –1, 3 and 6 relative to the start of the α-helix of each finger, are indicated next to the bases they contact. The FokI endonuclease domain is C-terminal to the zinc fingers and separated from them by a 15 amino acid linker [(G4S)3; G, glycine, S, serine]. The specific sequences illustrated are for QQR (22,23).

*To whom correspondence should be addressed. Tel: +1 801 581 5977; Fax: +1 801 581 7959; Email: [email protected] The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

MATERIALS AND METHODS

DNA substrates

For most QNK substrates, double-stranded oligodeoxyribonucleotides having the recognition site 5′-GGG GCG GAA were cloned into the BamHI site of pUC18. The parent plasmid for QQR substrates was pRW4 (26) and oligodeoxyribonucleotides containing the recognition site 5′-GGG GAA GAA were inserted into the unique XhoI site. QNK plasmids were transformed into Escherichia coli DH5α, QQR plasmids into E.coli XL-1 Blue and DNAs from individual colonies were characterized by DNA sequence analysis. The exact sequences of the inserts are given in Figure 5. Plasmid DNAs were purified using Qiagen columns (Qiagen, Valencia, CA).

The names of the plasmids reflect the structure of the inserts. pKS has a single QNK site, while pKT8 has two sites 8 bp apart in tail-to-tail inverted orientation, i.e. the G ends of the recognition site face each other. Similarly, pKT14, pKT28 and pKT48 carry inverted sites with the indicated separations. pQS carries a single copy of the QQR site; the pQTn series has tail-to-tail inverted sites with n bp between them; pQH10 has head-to-head inverted repeats 10 bp apart; pQD10 carries direct repeats separated by 10 bp.

Enzymes

Zif-QQR-FN and Zif-∆QNK-FN were prepared as previously described (6). Briefly, the coding sequence for the chimeric nuclease was cloned into pET15b, so that it carries a His6 tag at its N-terminus, and was propagated in BL21 (DE3) cells that overproduce E.coli DNA ligase from a pACYC184 derivative. Expression of the nuclease was initiated by addition of IPTG to 0.7 mM to cells growing at 22°C in LB medium plus ampicillin, tetracycline and 100 µM ZnCl2. Harvested cells were disrupted by sonication or by passage twice through a French press and the clarified extract was passed over a His-bind column. The enzyme was eluted with 0.4 M imidazole and purified further on a heparin–Sepharose, then a gel filtration column (S-100 HR or Superdex-75). It was stored at –80°C in 40% glycerol (10% glycerol in some cases), 20 mM Tris, pH 7.9, 10 mM β-mercaptoethanol, 100 µM ZnCl2. In vitro reactions were typically performed in 20 µl containing 10 mM Tris, pH 8.5, 50 mM NaCl, 1 mM DTT, 100 µM ZnCl2, 50 µg/ml BSA, 100 µg/ml tRNA. QQR reactions used 50 ng of substrate DNA that had been linearized by PvuII digestion; QNK substrates were linearized with ScaI (Fig. 6) or with SspI (Figs 3a and 7) and used at 100 ng/reaction. Enzyme was added, followed by preincubation for 30 min at room temperature. MgCl2 was added to a final concentration of 10 mM and incubation was continued for 1 h at room temperature. Cleavage was monitored by electrophoresis in 1% agarose gels. Dimer interface mutants of QNK were constructed by PCR with primers that incorporate the desired mutations. For D483A, the forward primer was d(CAATTGGCCAAGCAGCTGAAATGCAACGATATGTCGAAGAAAATCAAACACG); the corresponding primer for R485D was d(CAATTGGCCAAGCAGATGAAATGCAAGATTATGTCGAAGAAAATCAAACACG). Each was used with the reverse primer d(TAGGATCCTCATTAAAAGTTTATCTCGCCGTTATT) from the C-terminus of the FN coding sequence. 
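As a quick consistency check on the two mutagenic forward primers listed above, the following sketch reports the positions at which they differ. The sequences are copied from the text; the variable and function names are ours, and the comparison says nothing about the wild-type sequence, which is not given here.

```python
# Forward primers as printed in the text (5'->3').
D483A_FWD = "CAATTGGCCAAGCAGCTGAAATGCAACGATATGTCGAAGAAAATCAAACACG"
R485D_FWD = "CAATTGGCCAAGCAGATGAAATGCAAGATTATGTCGAAGAAAATCAAACACG"

def mismatches(a: str, b: str) -> list:
    """Return (1-based position, base in a, base in b) for each difference."""
    if len(a) != len(b):
        raise ValueError("primers must be the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

for pos, x, y in mismatches(D483A_FWD, R485D_FWD):
    print(f"position {pos}: {x} in D483A primer, {y} in R485D primer")
```

The two 52 nt primers differ only at position 16 and at positions 27 to 29, which is consistent with each primer carrying one short, localized change.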
The resulting PCR products were cleaved with MscI, gel purified and used as reverse primers in a second round of PCR that included d(GAAGATCTTCGATCCCGCGAAATTAA), from the vector N-terminal to the QNK coding sequence, as the forward primer. The final PCR products were digested with NdeI and BamHI, gel purified and cloned into pET15b. Identities of individual clones were confirmed by DNA sequencing and the proteins were expressed and purified as described above.

Mapping cut sites

To label one strand of the QNK substrates, plasmids were cut at one end of the pUC18 polylinker with EcoRI. The DNA was treated with calf intestinal alkaline phosphatase (New England BioLabs, Beverly, MA) and then with T4 polynucleotide kinase (Boehringer Mannheim, Indianapolis, IN) and [γ-32P]ATP (Amersham Life Sciences, Arlington Heights, IL). After heat inactivation of the kinase, the DNA was digested with HindIII, which cuts at the other end of the polylinker. The resulting small fragment was purified from a 3% low melting point agarose gel. To label the other strand, the order of HindIII and EcoRI digests was reversed. One quarter of the labeled sample was subjected to each of the Maxam–Gilbert G and G+A reactions (27) and the remaining two quarters were used in reactions with


or without QNK. After phenol/chloroform extraction and ethanol precipitation, the products were separated by electrophoresis in a 10% polyacrylamide sequencing gel.

For QQR reactions, a DNA fragment of ∼400 bp was amplified by PCR from each of the plasmid substrates. The primers, d(CAGGTAGATGACGACCATCAGG) and d(GGAATGGACGATATCCCGCAAG), correspond to sequences in pRW4 flanking the insertion site. This fragment was gel purified with a Qiaex II gel extraction kit (Qiagen) and used as a template for a second PCR, using the internal primers d(GGTTGGCATGGATTGTAGGCG) and d(TGTTAGATTTCATACACGGTGCC), to generate a fragment of ∼200 bp. To label each strand separately, one of the latter primers was treated with T4 polynucleotide kinase and [γ-32P]ATP. The PCR products were purified with a QIAquick PCR purification kit (Qiagen), then treated with QQR. Denatured reaction products were separated by electrophoresis in 6% polyacrylamide sequencing gels. Dideoxy sequencing reactions were performed for each substrate using a dsDNA cycle sequencing kit (Gibco BRL, Gaithersburg, MD) and the same labeled primers as for PCR. These were run on the same gels in lanes immediately adjacent to the nuclease cleavage products.

Molecular modeling

Coordinates for the zinc fingers were taken from the co-crystal structure of the DNA-binding domain of QNK bound to DNA (Protein Data Bank accession no. 1MEY) (24). Coordinates for the FokI cleavage domain dimer include residues 387–579 from the structure of the protein alone (Protein Data Bank accession no. 2FOK) (28). The cleavage domain dimer was docked to B-form DNA by eye, using the published model (28) as a guide. The zinc finger domains were placed at various positions along the DNA by aligning backbone phosphates from consecutive residues in the co-crystal structure (24) with corresponding phosphates on B-form DNA. All alignments were performed with the graphics program O and figures were prepared from these data using MolScript.
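For orientation only, the geometry of idealized B-form DNA can be sketched numerically. The rise (3.4 Å per bp) and helical repeat (10.5 bp per turn) used below are textbook approximations that we assume for illustration; they are not parameters reported in this study, and the function is not the docking procedure carried out in O.

```python
RISE_ANGSTROM_PER_BP = 3.4   # assumed idealized B-form rise per base pair
BP_PER_TURN = 10.5           # assumed idealized helical repeat

def helical_offset(n_bp: float) -> tuple:
    """Axial distance (Å) and rotation about the helix axis (degrees,
    reduced to the range 0-360) between two positions n_bp apart."""
    distance = n_bp * RISE_ANGSTROM_PER_BP
    rotation = (n_bp * 360.0 / BP_PER_TURN) % 360.0
    return distance, rotation

# A few illustrative separations between binding sites.
for n in (6, 12, 35, 40):
    d, rot = helical_offset(n)
    print(f"{n:2d} bp: {d:6.1f} Å along the axis, {rot:5.1f}° around it")
```

A calculation of this kind shows only how far apart, and on which face of the helix, two bound domains would sit on ideal DNA; it ignores protein structure and DNA distortion entirely.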
RESULTS

Binding site requirements for double-strand cleavage

Previous work with several zinc finger chimeric nucleases, including QQR, showed that they make cuts primarily to the left side of their recognition sequences, as depicted in Figure 1 (22). This was the expected location, given the orientation of the zinc fingers on the DNA and the structure of the chimeric protein. Some cleavage occurred on both strands, but the mapping of the sites was performed on denatured DNA and the efficiency of double-strand cleavage was not determined (22). Therefore, we focused our attention on the production of double-strand breaks.

We constructed and analyzed a collection of specifically designed plasmid substrates with variable numbers and orientations of the canonical recognition site for QQR. These were linearized and treated with QQR. At enzyme:substrate ratios close to 1, in order to achieve double-strand cleavage it was necessary to have at least two copies of the target oligonucleotide (Fig. 2a). A single copy of the recognition sequence (pQS) did not support cleavage. With 10 bp between paired sites, both tail-to-tail inverted repeats (pQT10) and direct repeats

Figure 2. Substrate specificity of QQR. (a) Substrates with various binding site dispositions. pQS has a single copy of the canonical recognition site, indicated by the arrow. The remaining DNAs have two sites in tail-to-tail inverted (pQT10), head-to-head inverted (pQH10) and direct repeat (pQD10) orientations. The vector is pRW4 without an insert. Samples of DNA (0.7 nM, corresponding to 1.4 nM recognition sites in the cases with paired sites) were incubated without enzyme (–) or with QQR at 1.0 (a), 1.5 (b) or 3.0 nM (c). The locations of the 5.6 kb linear substrate DNAs and the 3.6 and 2.0 kb fragments expected from cleavage at the target site are indicated to the right of the figure. (b) Cleavage at higher enzyme concentrations. The substrates were PCR fragments from a single site plasmid (QS) and one with two inverted sites (QT16); DNA concentration ∼20 nM. QQR concentrations were 0 (–), 3.5 (1), 7 (2), 17.5 (3), 35 (4) and 50 nM (5). The locations of the substrate (S) and expected product (P) bands are indicated to the right. Faster migrating fragments are from cleavage at secondary sites. The Stds lane in each panel contains linear size standards.

(pQD10) were effectively cut, while head-to-head inverted repeats (pQH10) were cleaved much less efficiently. Observed double-strand breaks mapped to the expected sites (Fig. 2a and data not shown).

At substantially higher enzyme:substrate ratios, both QQR and QNK made targeted cuts in DNAs that carried a single copy of the recognition site. In the comparison shown in Figure 2b, QS carries a single site, while QT16 has two in tail-to-tail orientation 16 bp apart. The DNAs were PCR fragments of ∼200 bp, identical to those used for mapping reactions (see below). QT16 was cleaved at all QQR concentrations tested and cleavage was essentially complete at an approximately 1:1 ratio of enzyme to sites (lane 4). In contrast, QS required ∼10-fold more enzyme to achieve comparable levels of cleavage (QS in lane 4 versus QT16 in lane 1), and this corresponds to a 20-fold higher ratio of enzyme to recognition sites. At the highest enzyme concentration used (lane 5), other sites began to be cleaved, perhaps reflecting binding of QQR to more distantly related sequences.

Influence of target site separation on cleavage efficiency

Paired inverted sites in the tail-to-tail orientation showed efficient double-strand cleavage when the sites were 10 or 16 bp apart (Fig. 2). To determine the upper and lower limits on distances that would allow cleavage, we examined a series of substrates for each chimeric nuclease in which variable amounts of essentially random DNA sequence were inserted between the recognition sites. For QNK, separations of 8, 14,


Figure 3. Dependence of cleavage on separation between inverted sites. (a) QNK substrates with a single copy of the recognition site (pKS) or with 8, 14, 28 and 48 bp separations between tail-to-tail inverted sites, as indicated. Reactions contained 2.5 nM DNA and 10 nM enzyme. (b) QQR substrates with the separations indicated between inverted sites. Reactions and designations as in Figure 2a, with 0.7 nM DNA and either no enzyme (–) or QQR at 1.0 (a), 1.6 (b) or 5.0 nM (d). The band between 5.6 and 3.6 kb in the samples labeled 4 is an artifact of this particular plasmid preparation.

28 and 48 bp were tested (Fig. 3a). Under conditions that did not support cleavage at a single site (pKS), the 8, 14 and 28 bp separations allowed double-strand cleavage, while the 48 bp separation did not. For QQR, we tested a larger collection of different separations, as shown in Figure 3b. When the paired sites were 4 bp apart, very little double-strand cleavage was observed and that only at the highest enzyme input. A separation of 6 bp led to good cleavage with QQR and this remained true for all distances tested up to 35 bp. The substrate with a separation of 40 bp, however, was essentially not cleaved. Thus, the upper limit for effective site separations is between 35 and 40 bp, in agreement with the observations for QNK.

Mapping cut sites on DNA strands

In principle, the requirement for two binding sites to achieve double-strand cleavage could reflect either of two underlying phenomena. (i) Each individual bound chimeric molecule might make an independent single-strand cut close to its binding site and two such cuts in proximity would be necessary to produce a double-strand break. In this view the upper limit on the distance between effective paired sites would be determined by the stability of the DNA duplex between nicks on the two strands. (ii) The cleavage domain of the chimeric nuclease might have to dimerize in order to act as an effective nuclease and when it does concerted breaks would be made in the two

strands. Natural FokI dimerizes to cleave DNA (25) and it is reasonable to suspect that the cleavage domains in the context of the chimeric nuclease would do the same. In this case, the upper limit on effective site separation would reflect the maximum extension achievable by the peptide linker between the binding and cleavage domains. We distinguished these possibilities by mapping the cut sites for QNK and QQR on a wide range of substrates at single nucleotide resolution. Model (i) predicts that single-strand cuts will be produced in fixed positions relative to each recognition site and that their locations will move apart as the distance between the sites is increased. Model (ii) predicts that cuts in the two strands will always be paired and, like FokI, they should produce a 5′ overhang of 4 bp. To map the cuts made by QNK, a fragment carrying the paired sites, the intervening sequence and ∼50 bp of pUC18 was labeled on either end with 32P as described in Materials and Methods. After digestion with the enzyme, products were compared to G and G+A sequencing reactions of the same fragment (Fig. 4a). Maxam–Gilbert chemistry removes the designated base and leaves the preceding 3′-phosphate, while the chimeric nuclease leaves a 3′-hydroxyl. Both these properties increase the mobility of the Maxam–Gilbert fragments, so the alignment with the QNK products was adjusted by about 1.5 bands to identify the exact site of cleavage. With 8 bp between QNK sites (KT8), strong cuts were seen on both strands between the sites: a single cut on one strand, a strong and a secondary cut on the other strand (Fig. 4a). When mapped on the DNA sequence, the major cuts are 4 bp apart and result in a 5′ overhang (Fig. 5a). With KT14, five or six relatively strong cuts were made on each strand (Figs 4a and 5a). 
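The prediction of model (ii) discussed above, paired cuts that leave a 4 bp 5′ overhang, can be expressed as a small check on mapped cut positions. This is a sketch under our own coordinate convention (each cut is given as the number of base pairs lying to the left of the nick, with the top strand written 5′ to 3′ from left to right); the function names and the example positions are hypothetical and are not data taken from the figures.

```python
def classify_stagger(cut_top: int, cut_bottom: int) -> tuple:
    """Classify a pair of nicks by the ends they leave.

    Both arguments count base pairs lying to the left of the nick, with
    the top strand written 5'->3' from left to right.
    """
    stagger = abs(cut_bottom - cut_top)
    if cut_top < cut_bottom:
        kind = "5' overhang"   # bottom-strand nick lies further right
    elif cut_top > cut_bottom:
        kind = "3' overhang"   # top-strand nick lies further right
    else:
        kind = "blunt"
    return kind, stagger

def matches_dimer_model(cut_top: int, cut_bottom: int, expected: int = 4) -> bool:
    """True if the pair of cuts gives a 5' overhang of the expected length."""
    return classify_stagger(cut_top, cut_bottom) == ("5' overhang", expected)

print(classify_stagger(10, 14))      # hypothetical cut positions
print(matches_dimer_model(10, 14))
```

As a sanity check on the convention, EcoRI cuts G^AATTC on both strands, which corresponds to cut_top = 1 and cut_bottom = 5 within its site and is classified here as a 4 bp 5′ overhang.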
When mapped they overlap considerably, but may be interpreted as three clusters of paired cuts staggered by ∼4 bp, one near the middle of the intervening sequence and one near each end. With KT28, again a single strong cleavage site was seen on both strands near the middle of the space between binding sites with a 4 bp 5′ stagger. Minor bands were visible in all cases, indicating that the cut locations were not rigidly determined. Cuts were also mapped on a DNA carrying a single recognition site for QNK (KS), using a high concentration of enzyme (Fig. 4a). Two groups of cuts were seen on each strand, similar to results obtained previously with other zinc finger chimeras (9). These cuts assemble on the DNA sequence into two clusters centered ∼4 and 13 bp from the 5′-end of the binding site (Fig. 5a). There is a general 5′ stagger in each cluster, although the distances between the cuts are not restricted to 4 bp. Similar locations were seen with KT48 at high QNK concentrations (Fig. 5a). Also shown in Figure 5a are mapped cuts in two QNK substrates that were determined independently by the procedure described for QQR below. The major cuts in KT8′ are farther apart than seen in KT8, perhaps due to sequence preference of the FokI cleavage domain (see Discussion). KT12 showed paired, centered, strong cuts separated by a 4 bp 5′ stagger, plus one minor cut reflecting a 3 bp stagger. To map cuts on QQR substrates, a PCR fragment of ∼200 bp from each plasmid was labeled on either end and reaction products were analyzed in parallel with dideoxy sequencing reactions on the same DNAs, using the same primers. At moderate QQR concentration essentially no nicks were


Figure 4. Mapping cut sites on DNA strands. (a) QNK substrates. Lanes G and G+A contain Maxam–Gilbert sequencing reaction products of the end-labeled DNAs. Adjacent lanes have the same fragments (∼40 nM) treated without enzyme (–) or with QNK at 10 (+) or 100 nM (++). (b) QQR substrates. In each set, samples of a DNA fragment, labeled on one strand and treated with the nuclease, were run beside dideoxy sequencing reactions (GATC) initiated from a primer labeled at exactly the same position. DNAs (4 nM) were incubated without enzyme (–) or with QQR at 1.0 (a) or 3.0 nM (c). In both panels arrows indicate the positions and orientations of the 9 bp recognition sites.

produced in the vicinity of a single copy of the recognition site (not shown). At the same concentration single strong cuts were made in both strands between sites separated by 12 bp (QT12). When mapped onto the DNA sequence, these strong cuts were precisely 4 bp apart with a 5′ stagger (Fig. 5b). With QT16, cuts were made near one end or the other of the intervening sequence and the most prominent cuts occurred in pairs with a 4 bp 5′ stagger. QT30 provided the only case in which the strongest cuts were clearly separated by