

The X-ray structure of a cobalamin biosynthetic enzyme, cobalt-precorrin-4 methyltransferase

Heidi L. Schubert1, Keith S. Wilson1, Evelyne Raux2, Sarah C. Woodcock2 and Martin J. Warren2

Biosynthesis of the corrin ring of vitamin B12 requires the action of six S-adenosyl-L-methionine (AdoMet) dependent transmethylases, closely related in sequence. The first X-ray structure of one of these, cobalt-precorrin-4 transmethylase, CbiF, from Bacillus megaterium, has been determined to a resolution of 2.4 Å. CbiF contains two α/β domains forming a trough in which S-adenosyl-L-homocysteine (AdoHcy) binds. The location of AdoHcy, together with a number of conserved residues, helps define the precorrin binding site. A second crystal form determined at 3.1 Å resolution highlights the flexibility of two loops around this site. CbiF employs a unique mode of AdoHcy binding and represents a new class of transmethylase.

The biosynthesis of nature's most complex cofactor1 comprises ~30 enzyme-mediated steps2–5, a requirement that approximates to about 1% of the proteins in an average bacterial genome6,7. This complex pathway can be separated into three distinct parts. The CobI pathway results in the synthesis of the corrin ring component, cobinamide, from the ubiquitous tetrapyrrole primogenitor uroporphyrinogen III (uro'gen III). The CobII pathway results in the synthesis of the lower axial ligand, dimethylbenzimidazole (DMB). The CobIII pathway results in the assembly of the final coenzyme from cobinamide, DMB and phosphoribosyl (Fig. 1)6.

In parallel to what is observed in heme biosynthesis8, the CobI segment of cobalamin synthesis contains independent aerobic and anaerobic branches. These CobI pathways differ in their timing of cobalt insertion and in their requirement for molecular oxygen9,10. In the well studied aerobic pathway of Pseudomonas denitrificans, cobalt is chelated into hydrogenobyrinic acid a,c-diamide to generate cob(II)yrinic acid a,c-diamide, a late corrin pathway intermediate4,9. In contrast, the anaerobic pathways of Salmonella typhimurium and Bacillus megaterium chelate cobalt at a much earlier intermediate, precorrin-2 (Fig. 1)11. Although all the intermediates of the aerobic CobI pathway have been elucidated, few intermediates on the anaerobic route are known. Many of the aerobic (Cob) enzymes share a high degree of similarity with the anaerobic (Cbi) enzymes, suggesting that although independent, the two pathways are broadly similar (Fig. 1).

One of the unique features of cobalamin biosynthesis is the addition of eight S-adenosyl-L-methionine (AdoMet) methyl groups to the tetrapyrrole framework during corrin construction. The methyl groups are added by the action of six separate transmethylases, which are more closely related to one another than to other known sequences, suggesting that they have evolved from a common ancestral methylase.
The aerobic/anaerobic methylases for the relevant carbons, listed in order of pathway addition7, are CobA/CysG at C2 and C7, CobI/CbiL at C20, CobJ/CbiH at C17, CobM/CbiF at C11, the aerobic CobF at C1, and CobL/CbiE at C5 and C15. Thus the structure elucidation of any one of them not only provides insight into the mechanism of precorrin methylation but also provides a model on which to base the structure of the other precorrin methylases.

Fig. 1 The vitamin B12 biosynthetic pathway from uro'gen III, highlighting the structures of precorrin-4 and -5 in the anaerobic and aerobic pathways. The anaerobic cobalt-precorrin-4 methylase (CbiF) is most closely related to its aerobic homologue precorrin-4 methylase (CobM), but is also similar to other tetrapyrrole methylases such as CysG and CobA. Side chain notation: A = acetate, P = propionate.

1Protein Structure Group, Department of Chemistry, University of York, Heslington, York YO1 5DD, UK. 2Department of Molecular Genetics, Institute of Ophthalmology, University College London, 11-43 Bath Street, London EC1V 9EL, UK.

Correspondence should be addressed to K.S.W. email: [email protected]

nature structural biology • volume 5 number 7 • july 1998

Based on its homology with the aerobic C11 transmethylase, CobM, CbiF is assumed to methylate C11 of cobalt-precorrin-4 during the anaerobic biosynthesis of cobalamin, generating precorrin-5. The structure of its substrate, cobalt-precorrin-4, has only recently been determined and is quite different from that of the aerobic precorrin-4 (ref. 12).

A His-tagged form of CbiF from B. megaterium was cloned, expressed and purified, and the X-ray structure of crystals grown in 1.2 M phosphate was refined to an R-factor of 20.4% at a resolution of 2.4 Å. Cleavage of the His-tag allowed crystals of a phosphate-free form to be produced, and this structure has been refined at a resolution of 3.1 Å to an R-factor of 19.3%.

Structure of CbiF in phosphate

CbiF is composed of two α/β domains linked by a single coil, forming a kidney shaped molecule (Fig. 2a,b). Both domains contain a five stranded β-sheet flanked by four α-helices, but there is no topological similarity between the two domains. The entire structure follows a β-α repeating pattern with the single exception of a β-hairpin late in the C-terminal domain (β8-β9). The parallel β-sheet in the N-terminal domain follows a 32415 topology, a topology seen before only in the six-stranded sheet of fructose permease subunit IIb13 (324156). The C-terminal domain contains a mixed sheet with 12534 topology. This strand order has been previously observed in protein tyrosine phosphatases, but with differences in the direction of two of the strands.

Although there is only a monomer in the asymmetric unit, both biochemical and structural evidence suggests that the enzyme exists as a dimer14. The nearest crystallographic monomer shares a substantial 31% of its surface area with the original molecule (QUANTA15). Indeed, owing to the relative twist of the two domains, the two C-terminal five-stranded sheets combine to generate a ten-stranded β-sheet (Fig. 2c). There are 36 direct protein–protein hydrogen bonds in the dimer interface. Of the 22 residues involved, six form hydrogen bonds through their side chain atoms (Fig. 3a,b) and are completely conserved among precorrin-4 transmethylases.

Fig. 2 S-Adenosyl-L-homocysteine (AdoHcy) bound to CbiF. a, Two α/β domains form a kidney shaped protein with a proposed substrate binding trough in the center where AdoHcy is bound. AdoHcy is shown in ball-and-stick representation with carbons in yellow, oxygens in red, nitrogens in blue and sulfurs in green. b, Stereo Cα trace of the CbiF monomer with every tenth residue labeled. c, In the interface with the nearest crystallographic neighbor 31% of the molecular surface is buried to generate the functional dimer. The β-sheets of the C-terminal domains are continuous within the dimer and the active sites are oriented in opposite directions38,39.

Two high density features were ascribed to inorganic phosphate ions, reflecting the presence of 1.2 M Na/KPO4 in the crystallization medium. One phosphate lies within the proposed substrate binding site below the AdoHcy molecule within the N-terminal domain, forming hydrogen bonds with three waters, the amide of Thr 101 and the imidazole side chain of His 100. The fourth oxygen of this phosphate does not form any hydrogen bonds and lies 5.5 Å below the sulphur atom of AdoHcy. The second phosphate is involved in crystallographic lattice contacts, lying between the N-terminus (residues 17–20) and residues 161–162 within a loop of the C-terminal domain (β6-αF).


Phosphate-free crystal form

The His-tag was removed from the recombinant CbiF and the resulting protein crystallized in the absence of phosphate in a new space group with tighter crystal packing. The overall fold remains the same. Although the resolution is limited to 3.1 Å, several significant changes are nevertheless evident, particularly in the conformations of two surface loops. As expected, there are no bound phosphates. The r.m.s. deviation between the Cα atoms of the two forms is 0.66 Å.

Conserved residues

An amino acid sequence alignment between four CbiF (anaerobic cobalt-precorrin-4 transmethylase) and three CobM (aerobic precorrin-4 transmethylase) sequences highlights a number of conserved regions between 'precorrin-4' transmethylases (Fig. 3a)14. Thirty-eight residues are identical in all seven proteins and can be grouped by function. Nine are involved in the binding of AdoHcy, including a glycine rich region GAGPG (residues 27–31), Gly 102, Asp 103, Ala 135 and Leu 184. A further six, Ala 53, Ser 55, Leu 79, Arg 98, Glu 112 and Gln 113, make up a putative precorrin binding site. The dimer interface involves six additional conserved residues, Thr 37, Glu 143, Leu 144, Gln 151, Thr 156 and Arg 157. Finally, the remaining 17 lie either in the hydrophobic core or in several tight turns (Fig. 3a,b).

Fig. 3 Sequence alignment of CysG, CobM and CbiF. a, Alignment of CbiF with the aerobic precorrin-4 transmethylase, CobM, and the methylase domain of sirohaem synthase, CysG. The highlighted residues are completely conserved in all known 'precorrin-4 transmethylase' (CobM/CbiF) sequences and have been colored according to function: red, AdoMet/AdoHcy binding; gold, tetrapyrrole binding site; purple, dimerization; green, structural core. b, Structural representation of the functional role of conserved residues. The Cα trace is colored by conservation among the CbiF methylases. Dark blue: seven out of seven sequences conserved; white: completely unconserved38. Amino acid side chains colored as in Fig. 3a.

S-adenosyl-L-homocysteine binding site

AdoHcy is bound in one pocket of a large trough between the N- and C-terminal domains. The ligand lies at the carboxyl end of the parallel β-sheet in the N-terminal domain and slightly behind the last loop of the C-terminal domain, αH-β10. Residues dispersed throughout the polypeptide chain contribute both main and side chain atoms to the binding (Fig. 4a,b). AdoHcy bound to CbiF is kinked between the sulfur atom and the sugar ring to place both the homocysteine backbone and the adenosine ring into pockets of the trough, similar to a two pronged plug in a socket.

The binding of AdoHcy to CbiF is quite different from that found in other AdoMet-binding proteins. In the DNA and catechol transmethylases the AdoMet is in an extended conformation with an O4S-C4S-C5S-SD torsion angle of 173° (ref. 16). In contrast, for AdoHcy bound to CbiF this angle is 82° (Fig. 4b). This distorted conformation could well assist presentation of the methyl group to the bulky substrate, as it would project into the precorrin binding site. Unlike the hydrophobic packing between Phe/Tyr residues and the nucleotide ligand in the DNA methylases, CbiF contains no large hydrophobic residues forming van der Waals interactions with the adenine ring. On one side of the ring, the Cβ of Ser 132 packs against the bridge carbons at a distance of 3.4 Å. The other side is loosely packed (4.5–5.0 Å) against the methylene carbons of Gln 240. There are a total of 15 hydrogen bonds between CbiF and AdoHcy (Fig. 4a).
Two support the adenosine ring: the carbonyl oxygen of conserved Pro 30 lies 2.8 Å from N6, and the amide nitrogen of partially conserved Ala 213 shares a hydrogen with N1 at a distance of 2.8 Å. The sugar hydroxyls are within hydrogen bonding distance (2.6–3.3 Å) of the amide nitrogen of conserved Leu 184, partially conserved Ala 241, and two solvent molecules, Wat 519 and Wat 509. The homocysteine portion of the ligand participates in eight hydrogen bonds. Conserved Asp 103 forms hydrogen bonds from both its amide nitrogen (2.7 Å) and carbonyl oxygen (2.8 Å) to the terminal carboxyl and amine nitrogen of homocysteine respectively. There are hydrogen bonds between the homocysteine moiety and the partially conserved Thr 131 and Ser 132, through the side chain hydroxyls (Ser 132 OG is 2.6 Å from AdoHcy O1 and Thr 131 OG is 2.9 Å from AdoHcy O2). Ser 132 forms a hydrogen bond through its amide nitrogen at a distance of 2.9 Å from AdoHcy O1. There are three additional hydrogen bonds to the homocysteine amine from the carbonyl of partially conserved Thr 101 (3.2 Å), the carbonyl of Met 106 (2.6 Å) and a solvent molecule (2.9 Å).

Fig. 4 The AdoHcy binding pocket. a, Hydrogen bonds (2.6–3.3 Å) from surrounding residues are shown. The 82° torsion angle can be seen between O4S-C4S-C5S-SD (see (b) for a detail). The red lattice represents unbiased 1σ density from an Fo - Fc map generated before AdoHcy was added to the model. Hydrogen bonds from Thr 101 O to the homocysteine amine and the loose hydrophobic packing between the adenosine ring and Gln 240 have been removed for clarity. b, Alignment of the AdoHcy and AdoMet ligands of HhaI, TaqI (yellow) and CbiF (white) highlighting the 82° torsion angle of O4S-C4S-C5S-SD. c, The electrostatic surface of CbiF highlights a deep groove between the N- and C-terminal domains just below the AdoHcy binding site (blue, positive; red, negative)40.

CbiF only crystallized in the presence of exogenous AdoMet or AdoHcy, although the derived structure contains only AdoHcy. Previously, despite AdoMet's reputation for being kinetically unstable, an RNA methyltransferase structure containing bound AdoMet has been determined even though the molecule was not present in the crystallization medium17. Enzymes such as this RNA methyltransferase appear to preferentially bind and stabilize AdoMet over AdoHcy. In contrast, CobA, a uro'gen III methylase, binds AdoHcy 20 times more tightly than AdoMet despite the small difference in structure (CH3+)18. Calorimetric studies indicate that CbiF also preferentially binds the product AdoHcy (data not shown). The reactive methyl group can be physically accommodated in the current CbiF model, and the AdoMet complex is expected to be essentially equivalent to the AdoHcy-bound structure. Together these data suggest that CbiF may promote the breakdown of AdoMet by binding it in a conformation which favors displacement of the methyl group from the sulphonium ion. In principle, AdoHcy could be acting as a product inhibitor of the corrin biosynthetic transmethylases, preventing excessive production of vitamin B12 and depletion of the C1 pool.
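The contrast drawn in this section between the extended (173°) and kinked (82°) ligand conformations rests on a single torsion angle. As a minimal sketch of how such an angle can be evaluated from four atomic positions, the Python function below computes a signed dihedral about the central bond. The function name and the coordinates are assumptions made for illustration: the four points are hypothetical and constructed to give 82°; they are not atoms taken from the CbiF model.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane p0-p1-p2
    n2 = _cross(b2, b3)  # normal to the plane p1-p2-p3
    x = _dot(n1, n2)
    y = _dot(b2, _cross(n1, n2)) / math.sqrt(_dot(b2, b2))
    return math.degrees(math.atan2(y, x))


# Hypothetical positions standing in for O4S, C4S, C5S and SD,
# placed so that the torsion about C4S-C5S is 82 degrees.
theta = math.radians(82.0)
o4 = (1.0, 0.0, 0.0)
c4 = (0.0, 0.0, 0.0)
c5 = (0.0, 0.0, 1.0)
sd = (math.cos(theta), math.sin(theta), 1.0)

kinked = torsion_angle(o4, c4, c5, sd)
print(f"torsion = {kinked:.1f} degrees")
```

An angle near 180° from the same function would correspond to the extended arrangement described for the DNA and catechol transmethylases.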

Precorrin binding and catalysis

The most likely binding site for cobalt-precorrin-4 is the large trough in the N-terminal domain (Fig. 4c). Several loops make up the walls of the trough. Loop β2-αB contains conserved residues Ala 53 and Ser 55 and, together with a second loop (β3-αC) containing conserved Leu 79, composes the left side of the trough. The right side of the trough is formed by loop β4-αD and helix αD containing conserved residues Glu 112 and Gln 113. The lower surface is lined by residues in strand β4. Several side chains (Arg 98, His 100, Gln 113, Thr 74, Thr 101, Arg 157, Gln 240, Lys 239) surround this binding site and are available for interactions with the eight carboxyl side chains of the substrate.

As a substrate, cobalt-precorrin-4 is expected to bind less tightly to CbiF than tetrapyrrole-derivatized cofactors to their cognate enzymes. In addition, the latter directly ligate the central metal ion; for example, a histidine in methionine synthase ligates the cobalt ion of methylcobalamin19. In CbiF, no residue is positioned to act as a cobalt ligand. The nearest feasible cobalt-binding residue is the non-conserved His 100, which lies at the bottom of the substrate trough on strand β4. For His 100 to act as a ligand, cobalt-precorrin-4 would have to bind in a perpendicular orientation with respect to the plane of the trough. The presence of the metal ion is not essential for transmethylation, as observed for CbiF from S. typhimurium, which can methylate precorrin-3 in the absence of a central cobalt, albeit with low efficiency20.

The structure of CbiF leaves few options for cobalt-precorrin-4 binding. Carbon 11 on ring C must lie within 3–4 Å of the methyl group for direct methyl transfer. Maximal enzyme–substrate contact requires rings A and B to be inserted into the trough, making salt bridges and hydrogen bonds with many surrounding residues.
Residues 53–57 and 73–80 on the left-hand side of the trough have the highest temperature factors in the molecule (between 50 and 70 Å²), indicating structural flexibility. It is possible that a conformational change occurs on binding of substrate, the enzyme accepting the tetrapyrrole in an 'induced fit' mechanism. This is supported by the differences observed between the two crystal forms. In the 2.4 Å structure, phosphate is bound at the bottom of the precorrin trough. In the 3.1 Å structure, the two flexible loops (53–57 and 73–80) move to take up the position previously occupied by the phosphate and reduce the width of the trough (Fig. 5). The Asp 54 Cα moves by up to 6 Å. Its side chain moves by 9 Å, pointing away from the proposed active site in the presence of bound phosphate, but pointing into it in the absence of phosphate. These movements may well be of functional significance.

Fig. 5 Comparison of CbiF in the presence and absence of phosphate. Loops β2-αB and β3-αC occupy alternate conformations in the two crystal forms. CbiF in phosphate buffer is shown in green (phosphate in red) and the phosphate-free form in yellow. The 1σ 2Fo - Fc density of the 3.1 Å phosphate-free model is shown in blue. In the 2.4 Å structure a phosphate lies in the precorrin binding site, but in its absence these two loops reorient to decrease the overall width of the active site, and replace the phosphate. The Asp 54 Cα moves 6 Å and the residue flips 180°, moving the side chain by 9 Å.

Assuming that AdoMet binds in the same conformation as AdoHcy, the position of the methyl group of AdoMet in the active site would allow the transfer of the methyl group to the corrin ring. It is known that the transmethylations at C2, C7 and C12 proceed with overall inversion of symmetry in accord with a direct SN2 displacement of the methyl group from AdoMet. To facilitate this reaction the ring carbon must be activated to function as a nucleophile. Though the cobalt ion may assist by acting as an electron donor to carbon 11, it is expected that the aerobic and anaerobic transmethylases have identical catalytic mechanisms due to their high degree of similarity. The lack of highly conserved charged residues surrounding the binding site suggests that catalysis is probably dominated by the lability of AdoMet and the proximity and orientation of the precorrin rather than a general acid/base mechanism.

The alluring prospect of forming an enzyme–substrate complex is technically challenging because the anaerobic intermediates are labile and oxygen sensitive2. Several attempts have been made to form an enzyme–ligand complex. Crystals soaked in cyanocobalamin and cobyric acid did effect a color change (red or pink) in the soaked crystal, but three-dimensional structures derived from these crystals did not contain density indicative of bound ligand (data not shown). The color change may reflect non-specific binding sites of the ligand in the lattice or binding at very low occupancy. The recently described structure of cobalt-precorrin-4 (ref. 12), as well as the identification of the enzymes leading to its formation, will hopefully assist in this difficult task.

Comparison of CysG and CbiF

Of all the cobalamin biosynthetic transmethylases, E. coli CysG is the best characterized. E. coli CysG is a trifunctional enzyme that catalyses uro'gen III transmethylation at positions C2 and C7, NAD+ mediated dehydrogenation and ferrochelation in the production of siroheme and vitamin B12. Experimental investigations into the catalytic mechanism of the transmethylase activity of CysG suggest that the enzyme binds AdoMet covalently. However, similar investigations with B. megaterium CbiF revealed that it binds AdoMet less tightly and non-covalently21. Neither an Fo - Fc map calculated prior to addition of AdoHcy to the model (Fig. 4a) nor the refined electron density indicates any covalent bond between AdoHcy and CbiF. If, as seems likely, the transmethylase region of CysG has a similar structure to CbiF, then the most likely residues for covalent bond formation in CysG would be those homologous to Thr 131 and Ser 132 in CbiF.

Several residues in CysG have been mutated to probe the AdoMet and tetrapyrrole binding sites22: two within the glycine rich AdoMet binding region (Gly 224 and Asp 227), three within the putative substrate binding site (Arg 298, Asp 303 and Arg 309) and two additional conserved charged residues (Asp 454 and Lys 473). Each mutant was tested for its ability to bind AdoMet and to rescue a cysG deletion strain of E. coli21. The mutagenesis results can be rationalized by alignment of the CysG sequence to the CbiF structure, giving structural insight into the effect of the disruptions.

Two CysG mutants, Gly224Ala and Arg298Leu, could not bind AdoMet. The residue equivalent to CysG-Gly 224 in CbiF is Gly 29, which forms part of the GAGPG motif common to all the cobalamin biosynthetic transmethylases. The flexibility of this glycine is crucial for forming the tight turn underneath the nucleotide, and changing this residue to Ala would disrupt the fold. The residue equivalent to CysG-Arg 298 in CbiF is Arg 98, which is not in direct contact with the AdoMet binding site, indicating that a larger structural distortion must be responsible for the lack of AdoMet binding in the CysG mutant. Though the Asp303Ala mutation did not disrupt AdoMet binding, the corresponding residue in CbiF, Asp 103, is involved in direct hydrogen bonds to AdoHcy through its peptide backbone. Presumably, these were not disrupted by mutation of the side chain.

Two CysG mutants which bind AdoMet (Asp248Ala and Arg309Leu) showed a reduced catalytic efficiency, possibly due to interference with substrate binding. The corresponding residues in CbiF, Asp54 and Ala109, lie on loops β2-αB and β4-αD, which frame the deep trough in the molecule and help define the precorrin-4 binding site. In support of this hypothesis, CbiF-Asp54 is the residue which moved into the precorrin binding site in the phosphate-free structure. The additional mutants (CysG Asp227 and Lys270, which correspond to CbiF Asp32 and Lys73) had no observable effect on AdoMet binding or catalysis. This is consistent with these residues being distant from the active site.

CbiF: a new class of methylase

Other AdoMet dependent transmethylases such as HhaI, TaqI and catechol O-methyltransferase share a similar α/β tertiary fold and it was proposed that many (if not all) have a common catalytic domain structure23. Despite the fact that CbiF also has an α/β based bi-lobal architecture containing a parallel β-sheet, its overall topology and the manner in which it binds AdoHcy make it radically different (Fig. 6a). The α/β structure of the domains can be superimposed in two ways. Optimal topological superposition of the N-terminal domains using DALI24 results in the AdoMet binding sites sitting at opposite ends of the molecule (Fig. 6b): TaqI (PDB code25 2ADM), r.m.s. deviation 3.6 Å over 69 residues; HhaI (PDB code 3MHT), r.m.s. deviation 2.7 Å over 60 residues. An alternate alignment, where the β-sheet is flipped by 180°, overlaps the ligand binding sites but the topological similarity is even lower (Fig. 6b).

The most common class of protein domain structure is α/β and such domains can always be aligned to some extent. DALI24 suggests significant similarity of CbiF with 122 α/β proteins. Indeed, the best alignment is with the GTPase fragment of the signal sequence recognition protein from Thermus aquaticus, Ffh26 (PDB code 1FFH), with an r.m.s. deviation of 3.0 Å over 79 residues, rather than with the HhaI, TaqI and catechol O-methyltransferases. Thus CbiF represents a new class of small molecule transmethylase.

Fig. 6 CbiF topology and alignments to the DNA transmethylase HhaI. a, A topology diagram of CbiF is shown with the numbering of secondary structural elements and conserved residues highlighted. The CbiF N-terminal domain is aligned with the DNA transmethylase HhaI in accordance with the output of DALI (shown as colored segments). Additional secondary structural elements which are not used in the alignments but lie in similar three-dimensional space are shown with hashed lines. The ligand binding sites sit at opposite ends of the aligned domains. b, A schematic diagram of the topological superposition between CbiF and HhaI, showing the two possible arrangements. The shaded ellipsoids represent the AdoMet/Hcy ligand, triangles represent strands and circles represent helices.
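The r.m.s. deviations quoted in this comparison summarise how far matched Cα atoms lie from one another once two structures have been superimposed. The sketch below shows only that final metric, assuming the residue pairing and the superposition have already been carried out; it is not a reimplementation of DALI, and the coordinates are hypothetical values chosen for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units) between two
    equal-length lists of already superimposed, paired 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical Calpha positions (in Angstrom) for three matched residues.
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.5, 0.0), (3.8, -0.5, 0.0), (7.6, 0.5, 0.0)]

deviation = rmsd(model_a, model_b)
print(f"r.m.s. deviation = {deviation:.2f} A over {len(model_a)} residues")
```

Because the value depends on how many residues are paired, a deviation is only meaningful alongside the residue count, which is why each figure above is reported together with the number of aligned residues.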

A survey of the current protein data base25 indicates that the domain folds are predominantly α/β for enzymes which use the AdoHcy and AdoMet ligands as substrates, but α-helical in nature for proteins which use the ligands in regulatory functions27. The exception may be the activation domain of methionine synthase, a primarily helical domain that binds AdoMet. This domain functions primarily to store AdoMet for periodic reactivation of the cobalamin cofactor and also functions to present the ligand for enzymatic turnover28.

Early thoughts on metabolic enzyme evolution propounded the idea that enzymes within a pathway may be structurally related. This was based on the concept that an enzyme already has a natural recognition pocket for its product and that slight modification of the enzyme through gene duplication would allow it to undertake a new reaction using its old product as substrate. For systems such as the glycolytic pathway, this has proved not to be true. However, the corrin biosynthetic transmethylases require re-evaluation of this concept. Here six members of a single pathway have indeed evolved from a single ancestor and, while retaining the same overall fold, have incorporated small changes allowing recognition of different substrates at a number of points along the pathway.

Methods

Purification and crystallization. CbiF was purified as described14 by affinity chromatography using a Pharmacia nickel chelating sepharose and an N-terminal His-tag. The pure protein was dialyzed against 20 mM sodium acetate, pH 5.6, 100 mM NaCl and concentrated to 16–20 mg ml⁻¹ for crystallization. The protein was incubated with 5 mM AdoMet prior to crystallization and then mixed in equal volumes with 1.0–1.2 M Na/KPO4, 0.1 M HEPES, pH 7.5, and 4% dioxane and equilibrated over the same solution. Crystals do not form in the absence of ligand, though they do form in the presence of AdoHcy. There is one molecule in the asymmetric unit, giving a VM of 3.4 and 64% solvent. The crystals were shock frozen at 120 K for X-ray data collection in a cryosolvent containing the crystallization precipitants plus 30% glycerol. Both native and heavy atom derivative data were collected at the Daresbury Synchrotron Radiation Source beamline 9.6 at a wavelength of 0.91 Å. Data were indexed and scaled with DENZO and SCALEPACK (Table 1)29. The heavy atom derivative was obtained by soaking native CbiF crystals in precipitant containing 1 mM methyl-mercury chloride (MeHgCl) for one hour prior to data collection.
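The VM and solvent content quoted above can be checked approximately from the native unit cell in Table 1. The sketch below is an illustration under stated assumptions rather than the calculation actually performed: it takes six asymmetric units per cell for the trigonal space group with one molecule in each, approximates the molecular mass as 110 Da per residue for the 278-residue recombinant protein (the text does not state the mass), and uses the standard Matthews relation in which the protein fraction is 1.23/VM. All function names are hypothetical.

```python
import math


def trigonal_cell_volume(a, c):
    """Volume (A^3) of a cell with a = b, alpha = beta = 90, gamma = 120."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(cell_volume, molecules_per_cell, mass_da):
    """Matthews coefficient VM in A^3 per Da."""
    return cell_volume / (molecules_per_cell * mass_da)


def solvent_fraction(vm):
    """Solvent fraction from VM, using the usual 1.23/VM protein fraction."""
    return 1.0 - 1.23 / vm


# Native cell from Table 1; the mass is an ASSUMED estimate, not a source value.
volume = trigonal_cell_volume(80.70, 109.58)
assumed_mass = 278 * 110.0  # roughly 30.6 kDa
vm = matthews_coefficient(volume, 6, assumed_mass)

print(f"VM = {vm:.1f} A^3/Da, solvent = {100 * solvent_fraction(vm):.0f}%")
```

With these assumptions the estimate lands close to the reported VM of 3.4; the small difference in solvent content comes from the approximate mass, since a VM of exactly 3.4 corresponds to about 64% solvent.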

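Table 1 Data statistics

| | Native | MeHgCl | Cleaved |
|---|---|---|---|
| Space group | P3₁21 | P3₁21 | P3₂21 |
| a = b (Å) | 80.70 | 80.58 | 80.04 |
| c (Å) | 109.58 | 109.54 | 77.96 |
| Resolution range (Å) | 20–2.4 | 20–2.4 | 20–3.1 |
| Unique reflections | 16,620 | 16,375 | 5,523 |
| Completeness (%), overall (final shell) | 99.8 (100.0) | 97.9 (91.0) | 95.3 (93.7) |
| Rmerge1 (final shell) | 0.039 (0.122) | 0.051 (0.284) | 0.103 (0.489) |
| Phasing power2 (centric / acentric) | | 1.05 / 1.35 | |
| FOM3: SIRAS / solvent flattened | | 0.402 / 0.799 | |
| R-factor4 (Rfree) | 0.204 (0.252) | – | 0.193 (0.283) |
| Resolution range, refinement (Å) | 20–2.4 | | 20–3.1 |
| RMS deviation from ideality, bonds (Å) | 0.014 | | 0.011 |
| RMS deviation from ideality, angles (°) | 2.3 | | 2.3 |
| Average B-factor (Å²), main chain | 35.8 | | 34.1 |
| Average B-factor (Å²), side chain | 40.6 | | 39.7 |
| Average B-factor (Å²), solvent | 49.0 | | |
| Average B-factor (Å²), phosphate | 45.3 | | |
| Average B-factor (Å²), SAH | 22.7 | | 22.2 |

1 $R_{\mathrm{merge}} = \sum_{hkl}\sum_i |I_i - \langle I \rangle| \,/\, \sum_{hkl}\sum_i I_i$
2 Phasing power = $F_H$ / lack of closure
3 FOM = figure of merit: $(\langle\cos\phi\rangle^2 + \langle\sin\phi\rangle^2)^{1/2}$
4 R-factor = $\sum_{hkl} \big|\,|F_{\mathrm{obs}}| - k|F_{\mathrm{calc}}|\,\big| \,/\, \sum_{hkl} |F_{\mathrm{obs}}|$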
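The agreement statistics defined in footnotes 1 and 4 of Table 1 are simple ratios of sums, and a small sketch may make them concrete. The functions below follow those two definitions directly; the function names, the dictionary layout for repeated measurements and the toy numbers are assumptions for illustration and do not reproduce how SCALEPACK or REFMAC compute or report these values.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Sum of | |Fobs| - k|Fcalc| | divided by the sum of |Fobs|."""
    numerator = sum(abs(abs(fo) - scale * abs(fc))
                    for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_merge(observations):
    """Rmerge for a dict mapping each reflection (hkl) to the list of its
    repeated intensity measurements."""
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Toy values only; these are not data from the CbiF crystals.
toy_r = r_factor([100.0, 50.0], [90.0, 55.0])
toy_rmerge = r_merge({(1, 0, 0): [10.0, 12.0], (0, 1, 0): [20.0, 20.0]})

print(f"R-factor = {toy_r:.3f}, Rmerge = {toy_rmerge:.3f}")
```

Rfree is the same R-factor expression evaluated only over the reflections set aside from refinement (5% of the data for the native structure).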

Structure determination. The structure was solved by single isomorphous replacement with anomalous scattering (SIRAS) at 2.4 Å. The mercury derivative was identified (Riso = 0.179) using the CCP4 suite of programs and refined in MLPHARE30 (Table 1). The excellent resultant phases were further enhanced by solvent flattening31 using a solvent content of 70%, a value above the calculated solvent percentage. The high solvent content did not cause a disjointed protein surface mask and was justified by the large increase in the figure of merit from 0.402 (SIRAS) to 0.799 (flattened) (Table 1). The density modified maps enabled building of 67% of the residues with the program 'O'32. The full model was completed after one round of refinement using X-PLOR (8–2.4 Å)33. Refinement was completed with REFMAC (resolution 20–2.4 Å) using a bulk solvent correction34. The R-factor is 20.4% (Rfree 25.2% for 5% of the data). The model contains 135 waters, two phosphate ions and an AdoHcy ligand. There are 278 residues in the recombinant protein; the first twenty contain six histidines and a thrombin cleavage site (M1GSSHHHHHH SSGLVPRGSH M21). The natural protein starts at residue Met 21. Residues 13–251 are visible in the density, but the N-terminal His-tag and the C-terminal 27 residues are disordered. The side chain of one residue, Asp 123, has been assigned an alternate conformation with half site occupancy. The overall B-factor for main chain atoms is 35.8 Å², but residues at the N-terminus (13–17), loop β2-αB (53–57), loop β3-αC (73–80) and the C-terminal residue (251) have B-factors of over 50 Å². The high B-factors may be a result of the loose crystal packing, high solvent content and the lack of order of 39 terminal residues (14% of amino acids). The overall B-value estimated from the Wilson plot for the data is 40.8 Å².
Eighty-eight percent of residues are in the most favored regions of the Ramachandran plot35, with only one residue (Leu 56) in a generously allowed region as defined in PROCHECK36.

Thrombin cleavage leads to a different crystal form. The N-terminal His-tag was cleaved off CbiF using thrombin, after overnight incubation at 30 °C in 70 mM Tris, pH 8.5, 100 mM NaCl and 2.5 mM CaCl2. The cleaved CbiF was purified by gel filtration chromatography and concentrated in the storage buffer. The cleaved protein did not crystallize in the high phosphate medium used for the full length His-tagged molecule. However, crystals did grow from 25% monomethylether polyethylene glycol (2,000 Mr), 200 mM MgCl2 and 100 mM Tris buffer, pH 8.5. Data were collected to 3.1 Å on a Rigaku R-AXIS imaging plate using a conventional copper rotating anode as X-ray source (λ = 1.54 Å) (Table 1). A single unambiguous molecular replacement solution was determined for the new crystal form using AMoRe37. This phosphate-free structure was refined to an R-factor of 19.3% and an Rfree of 28.3% at 3.1 Å. Residues 18–251 are visible in the density; no waters have been modeled. Several surface side chains and the loop containing residues 161–175 have poor density.

Coordinates. The coordinates for both structures have been deposited in the Brookhaven Protein Data Bank with the codes 1CBF (His-tagged form at 2.4 Å) and 2CBF (second crystal form at 3.1 Å with His-tag cleaved).

Acknowledgments We gratefully acknowledge funding from the National Institutes of Health, the Wellcome Trust and the Biotechnology and Biological Sciences Research Council. We thank the Central Laboratory of the Research Council and the staff of the Daresbury Laboratory for the provision of synchrotron radiation facilities and the BBSRC for support of such usage through the Rolling Project Mode Time allocation to York.

Received 23 February, 1998; accepted 27 May, 1998.


1. Stubbe, J. Binding site revealed for Nature’s most beautiful cofactor. Science 266, 1663 (1994).
2. Scott, I. How nature synthesizes vitamin B12 - a survey of the last four billion years. Angew. Chem. 32, 1223–1376 (1993).
3. Battersby, A.R. How nature builds the pigments of life - The conquest of vitamin B12. Science 264, 1551–1557 (1994).
4. Blanche, F., et al. Vitamin B12 - How the problem of biosynthesis was solved. Angew. Chem. Int. Ed. Engl. 32, 1651–1653 (1995).
5. Debussche, L., Thibaut, D., Cameron, B., Crouzet, J. & Blanche, F. Biosynthesis of the corrin macrocycle of coenzyme-B12 in Pseudomonas denitrificans. J. Bacteriol. 175, 7430–7440 (1993).
6. Lawrence, J.G. & Roth, J.R. The cobalamin (coenzyme B12) biosynthetic genes of Escherichia coli. J. Bacteriol. 177, 6371–6380 (1995).
7. Roth, J.R., Lawrence, J.G., Rubenfield, M., Kieffer-Higgins, S. & Church, G.M. Characterization of the cobalamin (vitamin B12) biosynthetic genes of Salmonella typhimurium. J. Bacteriol. 175, 3303–3316 (1993).
8. Jordan, P.M. Highlights in haem biosynthesis. Curr. Opin. Struc. Biol. 4, 902–911 (1994).
9. Blanche, F., et al. Parallels and decisive differences in vitamin B12 biosyntheses. Angew. Chem. Int. Ed. Engl. 32, 1651–1653 (1993).
10. Raux, E., et al. Salmonella typhimurium cobalamin (vitamin B12) biosynthetic genes: Functional studies in S. typhimurium and Escherichia coli. J. Bacteriol. 178, 753–767 (1996).
11. Raux, E., Thermes, C., Heathcote, P., Rambach, A. & Warren, M.J. A role for the Salmonella typhimurium cbiK in cobalamin (vitamin B12) and siroheme biosynthesis. J. Bacteriol. 179, 3203–3212 (1997).
12. Scott, I.A., et al. Biosynthesis of vitamin B12: Factor IV, a new intermediate in the anaerobic pathway. Proc. Natl. Acad. Sci. USA 93, 14316–14319 (1996).
13. Martin-Verstraete, I., Debarbouille, M., Klier, A. & Rapoport, G. Levanase operon of Bacillus subtilis includes a fructose-specific phosphotransferase system regulating the expression of the operon. J. Mol. Biol. 214, 657–669 (1990).
14. Raux, E., Woodcock, S.C., Schubert, H.L., Wilson, K.S. & Warren, M.J. Cobalamin (vitamin B12) biosynthesis; Cloning, expression and crystallisation of the Bacillus megaterium S-adenosyl-L-methionine dependent cobalt-precorrin-4 transmethylase CbiF. Euro. J. Bacteriol. in the press (1998).
15. Oldfield, T.J. Real space refinement as a tool for model building. CCP4 Study Weekend: Macromolecular refinement (Dodson, E.J., Moore, M.H., Ralph, A. & Bailey, S., eds) 67–74 (SERC Daresbury Laboratory, Warrington, UK; 1996).
16. Malone, T., Blumenthal, R.M. & Cheng, X. Structure-guided analysis reveals nine sequence motifs conserved among DNA amino-methyltransferases, and suggests a catalytic mechanism for these enzymes. J. Mol. Biol. 253, 618–632 (1995).
17. Hodel, A.E., Gershon, P.D., Shi, X. & Quiocho, F.A. The 1.85 Å structure of Vaccinia protein VP39: A bifunctional enzyme that participates in the modification of both mRNA ends. Cell 85, 247–256 (1996).
18. Blanche, F., Debussche, L., Thibaut, D., Crouzet, J. & Cameron, B. Purification and characterization of S-adenosyl-L-methionine: uroporphyrinogen III methyltransferase from Pseudomonas denitrificans. J. Bacteriol. 171, 4222–4231 (1989).
19. Drennan, C.L., Huang, S., Drummond, J.T., Matthews, R.G. & Ludwig, M.L. How a protein binds B12: A 3.0 Å X-ray structure of B12-binding domains of methionine synthase. Science 266, 1669–1674 (1994).
20. Roessner, C.A., et al. Expression of 9 Salmonella typhimurium enzymes for cobalamide synthesis. FEBS Lett. 301, 73–78 (1992).
21. Woodcock, S.C. & Warren, M.J. Evidence for a covalent intermediate in the S-adenosyl-L-methionine-dependent transmethylation reaction caused by sirohaem synthase. Biochem. J. 313, 415–421 (1996).
22. Woodcock, S.C., et al. The contribution of the CysGA and CysGB domains of sirohaem synthase (CysG) towards cobalamin (vitamin B12) biosynthesis. Biochem. J. 330, 121–129 (1998).
23. Schluckebier, G., O’Gara, M., Saenger, W. & Cheng, X. Universal catalytic domain structure of AdoMet-dependent methyltransferases. J. Mol. Biol. 247, 16–20 (1995).
24. Holm, L. & Sander, C. Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233, 123–138 (1993).
25. Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, E.J., Brice, M.D., Rogers, J.K., Kennard, O., Shimanouchi, T. & Tasumi, M. The protein data bank: a computer-based archival file for macromolecular structures. J. Mol. Biol. 112, 535–542 (1977).
26. Freymann, D.M., Keenan, R.J., Stroud, R.M. & Walter, P. The structure of the conserved GTPase domain of the signal recognition particle. Nature 385, 361–365 (1997).
27. Orengo, C.A., Michie, A.D., Jones, S., Jones, D.T., Swindells, M.B. & Thornton, J.M. CATH - a hierarchic classification of protein domain structures. Structure 5, 1093–1108 (1997).
28. Dixon, M.M., Huang, S., Matthews, R.G. & Ludwig, M. The structure of the C-terminal domain of methionine synthase: presenting S-adenosylmethionine for reductive methylation of B12. Structure 4, 1263–1275 (1996).
29. Otwinowski, Z. Processing of X-ray diffraction data collected in oscillation mode. Meth. Enz. 276, 307–326 (1991).
30. Otwinowski, Z. Maximum likelihood refinement of heavy atom parameters. Proceedings of the CCP4 Study Weekend (Wolf, W., Evans, P.R. & Leslie, A.G.W., eds) 80–88 (SERC Daresbury Laboratory, Warrington, UK; 1991).
31. Cowtan, K. in CCP4 & ESF-EACBM Newsletter on Protein Crystallography 34–38 (1994).
32. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. Improved methods for building protein models in electron density maps and location of errors in these models. Acta Crystallogr. A 47, 110–119 (1991).
33. Brunger, A.T. X-PLOR Version 3.1: A system for X-ray Crystallography and NMR (Yale University Press, New Haven, Connecticut, USA; 1992).
34. Murshudov, G.N., Vagin, A.A. & Dodson, E.J. Refinement of macromolecular structures by the maximum likelihood method. Acta Crystallogr. D 53, 240–255 (1997).
35. Ramachandran, S. Conformations of polypeptides and proteins. Adv. Prot. Chem. 23, 283–437 (1968).
36. Laskowski, R.A., MacArthur, M.W., Moss, D.S. & Thornton, J.M. PROCHECK - a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26, 283–291 (1993).
37. Navaza, J. AMoRe - an automated package for molecular replacement. Acta Crystallogr. A 50, 157–163 (1994).
38. Esnouf, R.M. An extensively modified version of MolScript that includes greatly enhanced coloring capabilities. J. Mol. Graph. 15, 133–138 (1997).
39. Kraulis, P.J. MOLSCRIPT - a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24, 946–950 (1991).
40. Nicholls, A., Sharp, K.A. & Honig, B. Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 11, 281–296 (1991).

nature structural biology • volume 5 number 7 • july 1998