Protein dynamics determined by backbone ... - Semantic Scholar

4 downloads 0 Views 267KB Size Report
decomposed into independent, collective motions using a molecular dynamics simulations were performed with a quasi-harmonic method (Teeter and Case, ...
Protein Engineering vol.10 no.4 pp.373–380, 1997

Protein dynamics determined by backbone conformation and atom packing

Junichi Higo1,2 and Hideaki Umeyama Department of Physical Chemistry, School of Pharmaceutical Sciences, Kitasato University, 5-9-1 Shirokane, Minato-ku, Tokyo, 108 Japan 1Present

address: Biomolecular Engineering Research Institute (BERI), 6-2-3, Furuedai, Suita, Osaka, 565 Japan 2To

whom correspondence should be addressed

To study the factors determining the collective motions in thermal, conformational fluctuations of a globular protein, molecular dynamics simulations were performed with a backbone model and an atomic-level model. In the backbone model, only the Cα atoms were explicitly treated with two types of pairwise interactions assigned between the Cα atoms: atom-packing interactions to take into account the effect of tight atom packing in the protein interior and chain-restoring interactions to maintain the backbone around the native conformation. A quasi-harmonic method was used to decompose the overall fluctuations into independent, collective modes. The modes assigned to large conformational fluctuations showed a good correlation between the backbone and atomic-level models. From this study, it was suggested that the collective modes were motions in which a protein fluctuates, keeping the tertiary structure around the native one and avoiding backbone overlap and, hence, rough aspects of the collective modes can be derived without details of the atomic interactions. The backbone model is useful in obtaining the overall backbone motions of a protein without heavy simulations, even though the simulation starts from a poorly determined conformation of experiments and in sampling main chain conformations, from which the side chain conformations may be predicted. Keywords: backbone fluctuation/backbone model/collective mode/molecular dynamics simulation/quasi-harmonic simulation

Introduction A protein has a native conformation and an instantaneous conformation fluctuates thermally around this native conformation. In the protein interior, the constituent atoms are tightly packed and the interactions between the atoms are complicated. The study of the thermal fluctuations is important not only for explaining the biological function of a protein, but also for understanding the complicated phenomena of condensed matter of finite size. In computational work, the thermal, conformational fluctuations of a globular protein were decomposed into collective motions (Brooks and Karplus, 1983; Go¯ et al., 1983; Levitt et al., 1985; Teeter and Case, 1990; Horiuchi and Go¯, 1991; Ichiye and Karplus, 1991; Kitao et al., 1991; Amadei et al., 1993). In normal mode analysis, the fluctuations are expressed by a linear combination of normal modes (Brooks and Karplus, 1983; Go¯ et al., 1983; Levitt et al., 1985). There, the lower © Oxford University Press

the vibrational frequency of a normal mode, the larger the contribution of the mode to the overall fluctuation and a small number of normal modes with low frequencies dominated the overall fluctuation. A collective-variable Monte Carlo simulation was developed using the normal modes as independent variables to generate the conformational changes, where a larger step size was taken along a lower-frequency normal mode (Noguti and Go¯, 1985). The conformational fluctuations obtained from a molecular dynamics (MD) simulation were decomposed into independent, collective motions using a quasi-harmonic method (Teeter and Case, 1990; Kitao et al., 1991; Amadei et al., 1993), where a collective motion was characterized by an eigen vector and an eigen value and the larger the eigen value, the larger the conformational fluctuation along the eigen vector. Low-frequency vibrational modes obtained from the normal mode analysis were well correlated with the large eigen value collective modes obtained from the quasi-harmonic method (Teeter and Case, 1990; Ichiye and Karplus, 1991; Kitao et al., 1991; Horiuchi and Go¯, 1991). A backbone model was introduced to study the low-frequency vibrational modes of a protein (Levy et al., 1984). There, the Cα atoms in the backbone were explicitly treated neglecting the other atoms and harmonic, pairwise interactions were applied to the nearest, second nearest and third nearest pairs of Cα atoms along the backbone. The remaining pairs further than the third nearest were not included in the potential energy, even though these Cα atoms were close to each other in the tertiary structure. The force constants in the harmonic, pairwise interactions were modulated and were expected to yield the same vibrational frequencies as observed in an atomic-level MD simulation. However, the low frequencies obtained from the backbone model were considerably lower than those from the atomic-level MD simulation. The introduction of a simplified model is useful in deriving general and important aspects from complicated phenomena in systems. The model of Levy et al. (1984) made a protein tractable without heavy calculations during the middle of 1980. Although the computer power is increasing, this technique is still useful for large systems in obtaining the rough aspects of protein dynamics from a simple calculation and in understanding the essential factors determining the thermal motions in protein. The study of backbone dynamics is, nowadays, also useful in generating a number of main chain conformations without a heavy simulation, even though the simulation starts from a poorly determined conformation of experiments. Main chain sampling is also useful in the prediction of side chain conformations (Kono and Doi, 1994; Tanimura et al., 1994; Va´squez, 1996). To extract the essential factors from the complicated motions in a protein, we used the backbone model. Instead of neglecting the exact, atom-pairwise interactions, we introduced two types of interactions between the Cα atoms: chain-restoring and atompacking interactions. The former are essentially equivalent to the interaction used by Levy et al. (1984) but the latter were 373

J.Higo and H.Umeyama

not included in their model. We performed MD simulations on both the backbone model and on an atomic-level model. Using a quasi-harmonic method, the thermal, conformational fluctuations from the simulation trajectories were decomposed into independent, collective modes. We compared the collective modes obtained from the backbone model with those obtained from the atomic-level model. Adopting the collective modes as independent variables, the conformational space was divided into subspaces. The subspaces obtained from the backbone model were projected onto those obtained from the atomic-level model. We discussed the role of the backbone conformation and atom packing in the protein dynamics. Materials and methods Backbone model In the backbone model, a protein was represented by the Cα atoms while neglecting the other atoms. Then, a Cα atom corresponds to an amino acid residue and the Cα atoms were connected by bonds forming the protein backbone. We introduced two types of interactions between the Cα atoms: chain-restoring and atom-packing interactions. The former was to maintain the protein backbone around the native conformation and the latter was to take into account the effect of tightly packed atoms in the protein interior. The chainrestoring energy was given as Ecr 5 E1–2 1 E1–3 1 E1–4

(1)

E1–2 5 c1–2 Σ (ri, i11 – r0i, i11)2

(2)

E1–3 5 c1–3 Σ (ri, i12 – r0i, i12)2

(3)

E1–4 5 c1–4 Σ (ri, i13 – r0i, i13)2

(4)

where

and

Here, ri,j is the distance between the ith and jth Cα atoms and r0i,j is the corresponding distance in the native conformation. The summations in Equations 2–4 were taken over all of the nearest, second nearest and third nearest pairs along the backbone respectively. The contribution coefficients, c1–2, c1–3 and c1–4, were set at 300.0 kJ/mol.A2 (5 71.7 kcal/mol.A2; a discussion on this parameter setting is given later). The terms E1–2, E1–3 and E1–4 are introduced to keep the bond lengths, bond angles and torsional angles around those of the native backbone respectively. The function E1–4 has two local minima for each torsional angle: if there is a torsional angle of θ in the native conformation (here, θ Þ 180°), the E1–4 has another local minimum at 2θ. The reason for using Equation 4 instead of a single-minimum function is the simplicity of the function form. For the atom-packing interaction, we used a 24-12 type function. We used one of two kinds of functions for this interaction alternatively, tight packing (Etp) or loose packing (Elp), as follows: Etp 5 ctp Σ [(r0i,j/ri,j)24 – (r0i,j/ri,j)12]

(5)

Elp 5 clp Σ [(3.7/ri,j)24 – (3.7/ri,j)12]

(6)

The summation for Equation 5 was taken over the pairs of Cα atoms for those where the distances were shorter than 10 Å in the native conformation, where the pairs taken in the summation of Equations 2–4 were excluded in the sum of Equation 5, 374

even though the distances for those pairs were ,10 Å. Once the pairs were chosen, these pairs were not altered during the MD simulation (i.e. fixing pair-list). The summation for Equation 6 was taken over all the pairs of Cα atoms (also excluding the pairs taken for Equations 2–4). The contribution coefficients, ctp and clp, were set to 3.0 kJ/mol (5 0.717 kcal/ mol; a discussion on this parameter setting is given later). We also tested a 12-6 type function for the atom-packing interaction. However, we do not mention the results from the 126 type function, because they were similar to those from the 24-12 type function. In Equation 5, when the distance between two Cα atoms becomes shorter than that of the native conformation, a repulsive force acts. Thus, Equation 5 expresses the effect of the tightly packed atoms in the protein interior. In Equation 6, a diameter of 3.7 Å (i.e. approximately the minimum distance between adjacent Cα atoms in proteins) was assigned to the Cα atoms. Therefore, Equation 6 represents a looser atom packing than Equation 5. We classified the backbone model into five submodels (submodels 1–5), where the energy terms were included differently. The switching of terms is given in Table I. Submodel 4 (E 5 E1–2 1 E1–3 1 E1–4) is essentially equivalent to the model by Levy et al. (1984). Molecular dynamics simulation We performed MD simulations on the backbone and atomiclevel models. All the atoms were explicitly treated in the atomic-level model. The molecule for the simulation was a small globular protein BPTI (Marquart et al., 1983; PDB entry 4PTI) of 58 amino acid residues taken from the Brookhaven Protein Data Bank (Bernstein et al., 1977). The atomic-level simulation [100 ps for each simulation, time step 0.5 fs, constant-temperature simulation by Berendsen et al. (1984), in vacuo, no cut-off operation for the atom pairwise, non-bonded interactions and AMBER energy parameters (Weiner et al., 1984)] was performed at 50 and 300 K, using the program package APRICOT (Yoneda and Umeyama, 1992). We call these simulations the ‘50 K simulation’ and ‘300 K simulation’ respectively and call the conformational fluctuations obtained from the trajectories the ‘50 K fluctuation’ and ‘300 K fluctuation’ respectively. Each simulation followed an equilibrium simulation (10 ps), which started from an energy-minimized conformation near the X-ray conformation. Snapshots were stored every 100 steps of simulation. The Cα atoms picked up from the trajectory were least-squares fitted to those in the X-ray conformation and the average conformation of the backbone was calculated at each temperature. The reason for using only the Cα atoms in the fitting is to produce a correspondence with the treatment of the backbone model. The 50 K simulation was performed to obtain a trajectory in which the conformation approximately fluctuated in a harmonic potential surface and the 300 K simulation to obtain a trajectory at room temperature (the effect of the in vacuum treatment is discussed later). The backbone MD simulations were started from the 50 K average conformation obtained from the atomic-level simulation. The mass of the Cα atoms was set at 100 g/mol, which means that all the amino acid residues had the same mass (a discussion is given later). The time step of the simulation was set to 1.0 fs. This time step was confirmed to be small enough by a test simulation (microcanonical MD simulation on submodel 1, with the initial temperature set at 300 K),

Backbone and atom packing determining protein dynamics

, (qi – ,qi.) (qj – ,qj.) . 5 (T/2) Σk λk vki vkj (8)

Table I. Switching of energy terms in the backbone submodels

where the relative error of the total energy was ~1310–4 during 1000 steps. For each backbone submodel, a sampling simulation was performed (500 ps at 300 K) after an equilibrium simulation (100 ps at 300 K). Snapshots were stored every 100 steps. The molecular translation and rotation were reset every 1000 steps. Analysis of trajectory We decomposed the thermal, conformational fluctuations into independent, collective modes using a quasi-harmonic method (Teeter and Case, 1990; Kitao et al., 1991; Amadei et al., 1993). An analysis of the thermal fluctuations was performed by comparing the collective modes obtained from a backbone simulation with those obtained from an atomic-level simulation. The quasi-harmonic procedure was performed by diagonalizing a variance–covariance matrix, each element of the matrix being written in the form of a correlation calculated from a trajectory.

where T is the temperature and vki is the ith element of the kth eigen vector. The summation was taken over k. A term λkvkivkj represents the contribution of the kth collective mode, for which the motion is parallel to the kth eigen vector, to the overall correlation. The root mean square (r.m.s.) fluctuation of the ith Cα atom (i.e. ,∆r 2i.1/2) can be calculated from either the simulation trajectory directly or using Equation 8. A set of eigen vectors calculated from a trajectory apparently constructs a 3n-D space. Six eigen vectors of λ 5 0, however, do not contribute to the protein-conformational change. Thus, the eigen vectors essentially construct a (3n 2 6)-D space. Any conformational change can be described by a linear combination of the eigen vectors. The eigen vectors were numbered in descending order of the eigen value in this study. We divided the conformational space into five subspaces: subspace 1 was constructed by the first to fortieth eigen vectors, subspace 2 by the forty-first to eightieth, subspace 3 by the eighty-first to one hundred and twentieth, subspace 4 by the one hundred and twenty-first to one hundred and sixtieth and subspace 5 by the one hundred and sixty-first to one hundred and sixty-eighth eigen vectors. The six eigen vectors of λ 5 0 were excluded from the subspaces. The number of eigen vectors in subspace 1 (i.e. 40) was determined so that the fluctuations within the subspace dominate the overall fluctuations by 90% (results shown later). Subspaces 2–4 were defined by also taking 40 eigen vectors and subspace 5 by the eight eigen vectors remaining. The contribution of subspace I to the overall fluctuation was given as follows:

Cij 5 , (qi – ,qi.) (qj – ,qj.) .

QI 5 SI/Sall

Submodel

1 2 3 4 5

Energy term Etp

Elp

E1–2 1 E1–3

E1–4

On On Off Off Off

Off Off On Off On

On On On On On

On Off On On Off

(7)

Here, Cij is the (i,j)th element of the matrix, ,. represents an average over the trajectory that was obtained from either the backbone or atomic-level simulation and qi is the x, y or z coordinate of a Cα atom, i.e. [q1, ..., q3n] 5 [x1, y1, z1, x2, y2, z2, ..., xn, yn, zn], where n is the number of residues (i.e. 58). The matrix (size 3n 3 3n) is symmetrical (Cij 5 Cji). By diagonalizing the matrix, a set of eigen vectors [v1, ..., v3n] was obtained, where an eigen vector corresponds to an independent, collective mode. These vectors, which were orthogonal to one another, were normalized (i.e. vi · vj 5 δij). The coordinates used for the trajectory average in Equation 7 were those from the conformation for the Cα atoms which were least-squares fitted to the X-ray conformation. The physical meaning of the quasi-harmonic procedure is explained as follows. Given a simulation trajectory on a harmonic potential surface, the kth eigen vector vk corresponds to the kth principal axis on the harmonic surface. A quantity λk–1/2 (λk is the kth eigen value) becomes the energy curvature along the kth principal axis if the trajectory length is long enough. In general, the quasi-harmonic method is equivalent to fitting a conformational, probability distribution function, which was obtained from a trajectory on a Gaussian distribution. For a simulation trajectory of finite length, the eigen values were positive (except for six eigen values of zero, which resulted from the fix of the molecular translation and orientation by the least-square fitting). The larger an eigen value, the broader the distribution along the eigen vector (i.e. the conformational fluctuation along the eigen vector is large). In the quasi-harmonic method, the correlation between qi and qj (i.e. Cij) was given as

(9)

Here SI 5 Σm Σk(I) λk vkm2 and Sall 5 Σm Σk(all) λk vkm2

(10)

The term Σkλkvki2 in Equation 10 was taken from Equation 8 with a setting i 5 j 5 m. The summation Σk(I) was taken over the eigen vectors within subspace I and Σk(all) over all the eigen vectors. The comparison of the fluctuations between the backbone and atomic-level models was performed by a projection between the subspaces. Given subspace I which was obtained from a backbone submodel and subspace J which was obtained from the atomic-level model, the projection PIJ was introduced to estimate how much subspace I was constructed by the eigen vectors in subspace J: PIJ 5 (1/NI) Σi(I) Σj(J) | ui (BB) · vj (AL) |2

(11)

Here, ui(BB) is the ith eigen vector in subspace I (the term BB was assigned to a vector obtained from a backbone submodel) and vj(AL) is the jth eigen vector in subspace J (the term AL was assigned to a vector obtained from the atomic-level model). The first and second summations were taken over the eigen vectors within subspaces I and J respectively and NI is the number of eigen vectors included in subspace I (i.e. NI 5 40 for subspaces 1–4 and NI 5 8 for subspace 5). Because a set of eigen vectors (from the atomiclevel or backbone model) constructs a complete set, PIJ satisfies the following normalization form: ΣJ PIJ 5 1

(12)

The summation was taken over the subspace index J (J 5 1–5). 375

J.Higo and H.Umeyama

The larger the value of PIJ, the more similar are the fluctuations within subspace I (from the backbone model) to those within subspace J (from the atomic-level model), i.e. the overlap between subspaces I and J is large. Results Root mean square fluctuations The root mean square deviation (r.m.s.d.) averaged over the Cα was 1.126 Å between the X-ray and energy-minimized conformations, 0.781 Å between the X-ray and 50 K average conformations and 1.435 Å between the energy-minimized and 50 K average conformations. The 50 K average one was closer to the X-ray one than the energy-minimized one was. Figure 1 shows the r.m.s. fluctuations of the Cα atoms from the backbone simulations and Figure 2 those from the atomiclevel simulations. The averages of the r.m.s. fluctuations were 0.182 Å (for submodel 1), 0.270 (submodel 2), 4.275 (submodel 3), 6.269 (submodel 4) and 7.657 (submodel 5). The fluctuation increased greatly when changing the interaction from Etp to Elp. Note that the scale axes in Figure 1(a)–(e) are different from each other, though the figures show a similar pattern. For the atomic-level simulation, the r.m.s. fluctuation averaged over the Cα atoms was 0.328 Å at 50 K and 1.016 Å at 300 K. In the backbone simulations, there were four regions (regions 1–4; see Figure 1) of relatively large r.m.s. fluctuations. For submodels 3 and 5, the largest fluctuation was at the Nterminus and then the fluctuation of region 1 was positioned on a slope (Figure 1c and e), whereas region 1 also had a large fluctuation. For submodel 3, region 4 had an exceptionally small fluctuation (Figure 1c). In the atomic-level simulations, these four regions also had large fluctuations (Figure 2). The remarkable difference between the 50 and 300 K fluctuations of the atomic-level simulation was shown in region 1, where the fluctuation was largest at 300 K, though the largest was in region 2 at 50 K. In Figure 3, the distance from the molecular centre to each Cα atom is shown for the 50 K average conformation (the molecular centre was calculated only from the positions of the Cα atoms). The four regions specified above were far from the molecular centre and, thus, on the molecular surface. In Figure 2c the r.m.s. fluctuations estimated from the X-ray B-factor are shown, where regions 1–3 also showed large fluctuations, while region 4 did not. The contributions QI (Equation 9) of each subspace to the overall fluctuation are shown in Table II. In any model, Q1 was ~90% or more and, in particular, ~99% for submodels 3–5. If the conformational energy surface is harmonic, the mean square fluctuation ,∆ri2. is proportional to the temperature and satisfies a relation ,∆ri2.300 K/,∆ri2.50 K 5 6.0. However, the ratio was 9.6 (5 1.0162/0.3282) from our atomiclevel simulations. This difference in the ratio can be a measure of the effective anharmonicity in the conformational energy surface (a discussion is given later). Subspace projection The subspace projection (Equation 11) is given in Tables III– VII. For any backbone submodel, P11 was the largest in P1J (where J 5 1–5) and P22 was the largest in P2J, except for submodel 5. If the eigen vectors were randomly oriented (i.e. if the frame of 168 coordinate axes was randomly oriented), the value of |ui(BB) · vj(AL)|2 in equation (11) would become 1/168 5 5.95310–3 on average (where 168 5 3n 2 6) and the diagonal elements PJJ would become 0.238 (5 23.8%) for 376

Fig. 1. The root mean square fluctuations of the Cα atoms from the backbone simulations. (a) Submodel 1, (b) submodel 2, (c) submodel 3, (d) submodel 4 and (e) submodel 5. The broken bars in (a) represent the four regions of large fluctuations: region 1 from the eleventh to eighteenth residues, region 2 from the twenty-fifth to twenty-eighth, region 3 from the thirty-seventh to fortieth and region 4 from the forty-sixth to forty-eighth. The fluctuations were calculated directly from trajectories, without using Equation 8.

J 5 1–4 and 0.048 for J 5 5. These values are smaller than the P11 and P22 in Tables III–VII. Histograms of the eigen values are shown in Figure 4 for the backbone model and in Figure 5 for the atomic-level model. The histograms were considerably different from one another (note that the scale of the vertical axis is different in the histograms). With λ–1 becoming smaller, the molecule becomes more flexible in a mode. The smallest λ–1s were 6.30 (for submodel 1), 2.00 (submodel 2), 0.0123 (submodel 3), 0.0121 (submodel 4), 0.0861 (submodel 5), 0.433 (the atomiclevel model at 50 K) and 0.0581 (at 300 K). The flexibility changed dramatically between submodels 2 and 3. Remember that the r.m.s. fluctuation also changed greatly between submodels 2 and 3. In the atomic-level model, the flexibility increased with increasing temperature. Note that if the conformational energy surface is exactly harmonic, the histogram does not depend on temperature. Discussion In recent work (Teeter and Case, 1990; Horiuchi and Go¯, 1991; Ichiye and Karplus, 1991; Kitao et al., 1991; Amadei et al., 1993), protein conformational fluctuations obtained from atomic-level simulations were decomposed into independent, collective modes, where a relatively small number of modes

Backbone and atom packing determining protein dynamics

Fig. 2. The root mean square fluctuations of the Cα atoms from the atomiclevel simulations and X-ray B-factor. (a) 300 K simulation and (b) 50 K simulation. See the caption to Figure 1 for the definition of the broken bars. The fluctuations were calculated directly from trajectories without using Equation 8. (c) X-ray B-factor.

Fig. 3. Distance from the molecular centre to each Cα atom.

Table II. The contribution QI of each subspace to the overall fluctuationa Model

Submodel 1 Submodel 2 Submodel 3 Submodel 4 Submodel 5 Atomic level (50 K) Atomic level (300 K) aThe

Subspace number 1

2

3

4

5

86.71 84.86 99.87 99.89 99.47 93.94 94.84

9.98 12.49 0.10 0.09 0.46 4.62 3.90

2.38 2.01 0.02 0.01 0.05 1.24 1.04

0.87 0.58 0.01 0.00 0.01 0.19 0.19

0.06 0.05 0.00 0.00 0.00 0.02 0.02

values are in percentages (i.e. 1003QI). The definition of QI is given in Equation 9.

dominated the fluctuations. This dynamic feature was commonly observed in various simulations with different sets of potential functions (Teeter and Case, 1990) and with different conditions (Ichiye and Karplus, 1991) such as in vacuo, in a van der Waals solvent or in a crystal field and in our results as well (Table II). Moreover, our study showed that the tight atom packing is important in keeping the backbone model as rigid as the atomic-level model: by exchanging the interaction from Etp to Elp, the flexibility increased considerably (Table II and Figure 4). This is the reason why the backbone protein by Levy et al. (1984), where the atom packing was not taken into account, was too flexible. The four regions, regions 1–4 as specified in the Results, with large fluctuations were relatively well conserved in both the backbone and atomic-level models (Figures 1 and 2). Furthermore, one may say that the pattern of fluctuation in Figure 1a and b (submodels 1 and 2) was closely similar to that in Figure 2 (the atomic-level model). However, before discussing how the submodels could produce fluctuations similar to those of the atomic-level model, we consider the effect of least-squares fitting in the calculation of the fluctuation. The least-squares fitting is equivalent to the application of the Eckart condition (Eckart, 1935), i.e. fixing the molecular centre and orientation. As a result of fitting, the r.m.s. fluctuations were defined in a body-fixed coordinate system for which the origin was positioned at the molecular centre. The least-squares fitting, commonly used to analyse simulation trajectories, is a useful technique in dividing the overall motion into internal and external motions. This procedure, however, led to the result that residues distant from the molecular centre had large fluctuations, as shown in Figures 1–3. Of course, because the protein surface is more flexible than the interior, large fluctuations are shown for these distant residues (the molecular shape was approximately globular). Consequently, we cannot state that the fluctuations from the backbone simulation were similar to those from the atomic-level simulation just from the agreement in the pattern between Figures 1 and 2. The scaler product |ui(BB) · vj(AL)| in Equation 11 represents the correlation of motion between the two collective modes that belonged to the different models: a large value of |ui(BB) · vj(AL)| indicates that the collective modes represented by ui(BB) and vj(AL) are similar. In Tables III–VII, P11 was the largest of all the P1J (where J 5 1–5) in any backbone submodel. This means that the fluctuations within subspace 1, which dominated the overall fluctuations (Table II), were relatively well correlated between the atomic-level model and the backbone submodels. Thus, we assessed the backbone submodel by P11: the larger the value of P11, the more efficient the submodel is in describing the backbone dynamics. In a comparison between the backbone and 50 K (atomic-level) simulations, submodels 1 and 2 were more efficient than the other submodels. With increasing temperature to 300 K, P11 became smaller. Thus, the fluctuations from the backbone model were better correlated to the 50 K fluctuations than to the 300 K fluctuations. The quasi-harmonic treatment is equivalent to fitting a conformational, probability distribution function from a simulation trajectory to a Gaussian distribution. Generally, the distribution at a low temperature is a better fit to the Gaussian than the distribution at a high temperature, because the harmonic approximation of the potential surface becomes more effective with decreasing temperature. When increasing the temperature 377

J.Higo and H.Umeyama

Table III. Subspace projection between the backbone submodel 1 and the atomic-level modelsa Subspace number (submodel 1)

1 2 3 4 5

Subspace number (atomic-level model) 1

2

61.4 (53.3) 26.5 (26.2) 11.5 (14.0) 0.5 (6.2) 0.2 (1.5)

21.4 35.1 32.7 10.0 3.5

(18.8) (27.2) (32.7) (19.7) (8.0)

3

4

11.1 (13.5) 22.5 (21.4) 29.8 (27.6) 31.9 (30.8) 23.7 (33.1)

5.5 14.0 22.0 46.6 59.3

5 (11.8) (20.4) (22.0) (36.4) (46.4)

0.5 (2.5) 1.9 (4.7) 4.0 (3.7) 11.0 (6.9) 13.2 (11.0)

aThe main values are the subspace projections (Equation 11) between the 50 K fluctuation of the atomic-level model and the fluctuation of submodel I (I 5 1 for this table and I 5 2 for Table IV, for instance). The values in parentheses are those between the 300 K fluctuation of the atomic-level model and the fluctuation of submodel 1. The values are in percentages (i.e. 1003PIJ).

Table IV. Subspace projection between the backbone submodel 2 and atomic-level modelsa Subspace number (submodel 2)

1 2 3 4 5 aSee

Subspace number (atomic-level model) 1

2

62.2 (49.9) 26.0 (26.5) 11.3 (14.8) 0.4 (8.2) 0.1 (2.8)

25.0 40.4 31.1 3.3 1.0

3 (25.4) (30.4) (29.1) (13.6) (7.5)

10.2 24.3 37.3 24.8 16.6

4 (13.8) (22.8) (30.7) (27.8) (24.2)

2.2 8.1 18.4 56.4 74.4

5 (9.5) (17.3) (22.1) (40.7) (52.3)

0.4 1.2 1.8 15.1 7.8

(1.4) (3.0) (3.2) (9.7) (13.1)

the footnote to Table III.

Table V. Subspace projection between the backbone submodel 3 and atomic-level modelsa Subspace number (submodel 3)

Subspace number (atomic-level model) 1

1 2 3 4 5 aSee

56.1 24.6 13.7 5.9 3.2

2 (54.4) (25.5) (13.4) (6.1) (3.2)

25.3 31.2 26.4 15.2 9.5

3 (24.8) (30.5) (24.6) (17.9) (10.6)

12.1 27.4 30.3 24.8 27.2

4 (12.1) (24.7) (30.3) (28.9) (19.6)

5.4 14.8 25.9 44.3 48.0

5 (7.9) (17.1) (27.8) (36.4) (54.0)

1.1 2.1 3.7 10.8 12.1

(0.8) (.2) (3.9) (10.6) (12.7)

the footnote to Table III.

Table VI. Subspace projection between the backbone submodel 4 and atomic-level modelsa Subspace number (submodel 4)

Subspace number (atomic-level model) 1

1 2 3 4 5 aSee

53.1 25.2 13.8 7.0 4.9

2 (52.9) (25.1) (13.4) (7.7) (4.5)

25.1 29.0 26.5 17.0 12.5

3 (25.1) (30.3) (22.8) (19.1) (13.4)

12.6 25.7 28.5 27.6 28.2

4 (12.3) (23.7) (30.1) (28.0) (29.4)

8.1 17.2 26.2 39.2 46.4

5 (8.8) (18.7) (28.5) (36.2) (38.7)

1.1 3.1 5.0 9.2 8.0

(0.8) (2.2) (5.1) (9.0) (14.0)

the footnote to Table III.

Table VII. Subspace projection between the backbone submodel 5 and atomic-level modelsa Subspace number (submodel 5)

1 2 3 4 5 aSee

378

the footnote to Table III.

Subspace number (atomic-level model) 1

2

42.1 (43.9) 27.4 (25.9) 17.5 (16.8) 11.5 (11.6) 7.5 (9.2)

21.3 25.7 24.3 22.4 2.19

3 (21.0) (25.0) (27.4) (22.5) (21.2)

17.8 22.2 26.9 26.6 32.4

4 (17.1) (23.1) (24.8) (29.3) (28.3)

15.9 21.2 26.2 30.4 31.3

5 (15.5) (22.0) (26.1) (30.0) (32.1)

2.9 3.5 5.1 7.0 6.9

(2.5) (4.1) (5.0) (6.7) (9.3)

Backbone and atom packing determining protein dynamics

Fig. 4. Histograms of the eigen values from the backbone model. λ is the eigen value. (a) Submodel 1, (b) submodel 2, (c) submodel 3, (d) submodel 4 and (e) submodel 5.

Fig. 5. Histograms of the eigen values from the atomic-level model. λ is the eigen value. (a) 300 K simulation and (b) 50 K simulation.

from 50 to 300 K in the atomic-level simulation, the molecule became flexible as shown in the ratio ,∆ri2.300 K/,∆ri2.50 K 5 9.6 . 300/50 and P11 decreased (except for submodel 5, the most flexible submodel). This means that some of the collective modes in subspaces 2–5 at 50 K were mixed and shifted into subspace 1 at 300 K and some in subspace 1 at

50 K escaped from subspace 1 at 300 K. Remember that the subspace projection was normalized (Equation 12). An interpretation of the anharmonicity included in the 300 K atomic-level simulation is difficult. Higo (1988) analysed trajectories of the Monte Carlo simulations of a small globular protein (ovomucoid third domain) in vacuo at several temperatures from 90 to 450 K (the simulation started from a relaxed conformation in vacuo, which was obtained from a long equilibrium simulation) and found that the thermal transitions between the energy minima became abruptly frequent above 250 K. Below this temperature the protein behaved approximately as an elastic body, although a few transitions were observed on the protein surface. Consistent results were obtained experimentally (Knapp et al., 1982; Parak et al., 1982; Bauminger et al., 1983; Smith et al., 1990) and computationally (Loncharich and Brooks, 1990; Smith et al., 1990; Steinbach and Brooks, 1993, 1996). Thus, the protein dynamics at 50 K are dominated by harmonic motions and the transition between the energy minima is rare whether the protein is in water or not (Steinbach and Brooks, 1993). In simulations at room temperature (Saito, 1992), which started from an energyminimized conformation near an X-ray conformation, the fluctuations in vacuo involved a conformational deformation induced from a relaxation process in vacuo and in water the fluctuations did not show such a large deformation. Even if enough relaxation was taken for in vacuo simulations, the physical property of the fluctuation at 300 K is probably different between the in water and in vacuo simulations, because the transitions between the energy minima couple with water (Steinback and Brooks, 1993, 1996). Therefore, subspace projection P11 relating to the 300 K simulation in vacuo, which possibly involved the conformational deformation, may be considerably different from the projection in water. The physical origin of the tight atom-packing interaction is easily understandable. A number of atoms, which are tightly packed in the protein interior, were neglected in the backbone model except for the Cα atoms. Thus, two Cα atoms on the main chain cannot be closer than a certain distance. The chain-restoring interaction is indirectly rationalized. A native conformation is stabilized under the balance of a large number of complicated interactions in a real system. Then the chainrestoring interaction can be regarded as an effective interaction resulting from those stabilizing interactions. The chain-restoring and atom-packing interactions were introduced independently. However, the pattern of the interaction network formed by the tight atom-packing interactions is determined by the backbone conformation. Thus, the chainrestoring and tight atom-packing interactions relate to each other. In a given tertiary structure, the backbone cannot overlap and then the backbone fluctuates in movable directions while avoiding the overlap. We presume that the combination of the movable directions determined the collective modes. The distribution of the eigen values depends on the energy contribution terms (i.e. c1–2, c1–3, c1–4, ctp and clp). We simply set c1–2, c1–3 and c1–4 to 300 kJ/mol.Å2 and ctp and clp to 3 kJ/mol in this study and the fluctuations of submodel 1 had a similar pattern to those of the atomic-level model (Figures 1a and 2a). It may be possible to modulate these contribution terms to yield a better agreement of fluctuations, but we did not because each globular protein has its own conformation and the modulated contribution terms may depend on the conformation. We also set the mass of all the 379

J.Higo and H.Umeyama

Cα atoms to the same value. We studied the protein dynamics qualitatively from a general point of view. The chain-restoring interaction (Equations 2–4) always acted on the conformation as a harmonic force, independent of temperature. On the other hand, the tight atom-packing interaction (Equations 5 and 6) was effective only when the distance between two Cα atoms was shorter than a certain value. Thus, the dynamic feature of the backbone model depends on the temperature because of the atom-packing interaction. Although the temperature was set at 300 K in the backbone model, this temperature does not correspond to 300 K in the atomiclevel model. The collective motions (i.e. the eigen vectors) were in relatively good agreement between the backbone and atomiclevel models in vacuo. The r.m.s. fluctuations in region 4, however, were not in agreement between the simulations and X-ray B-factor. Recently, York et al. (1994) performed an MD simulation of BPTI, where the water molecules were explicitly included and the protein was put in a crystal environment. The r.m.s. fluctuations they obtained were in considerable agreement with those from the X-ray B-factor. This means that the function form for atom-pairwise interactions is accurate enough to reproduce the atomic fluctuations, when the calculation is done under appropriate conditions [treatment by electrostatic interactions and the crystal emvionment were the conditions in the work of York et al. (1994)]. The importance of the accurate treatment of interactions is also shown in some other work (Saito, 1992; York et al., 1993; Oda et al., 1996). Although our approach is a convenient technique to obtain the corrective motions qualitatively and roughly, the details of the dynamics cannot be obtained because of the simplicity of the model. However, when the protein to be studied contains some experimental errors or undetermined regions or when the protein is too large to treat accurately, our method is useful. Simplified models have been introduced in chemical physics to extract general and important aspects from complicated phenomena in large systems. Our model highlighted the fact that the backbone conformation and tight atom-packing were determining factors for the collective motions that were generally observed in atomic-level calculations. Acknowledgements We gratefully acknowledge helpful discussions with Drs T.Yoneda, S.Yoneda and K.Kamiya from Kitasato University, Dr A.Kidera from Kyoto University, Dr A.Sarai from RIKEN and Dr Y.Kubota from the Advanced Technology Institute (ATI), Inc.

References Amadei,A., Linssen,A.B.M. and Berendsen,H.J.C. (1993) Proteins, 17, 412– 425. Bauminger,E.R., Cohen,S.G. Nowik,I., Ofer,S. and Yariv,J. (1983) Proc. Natl Acad. Sci. USA, 80, 736–740. Berendsen,H.J.C., Postma,J.P.M., van Gunsteren,W.F., DiNola,A. and Haak,J.R. (1984) J. Chem. Phys., 81, 3684–3690. Bernstein,F.C., Koetzle,T.F., Williams,G.J.B., Meyer,E.F., Brice,M.D., Rodgers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,M. (1977) J. Mol. Biol., 112, 535–542. Brooks,B. and Karplus,M. (1983) Proc. Natl Acad. Sci. USA, 80, 6571–6575. Eckart,C. (1935) Phys. Rev., 47, 552–558. Go¯,N., Noguti,T. and Nishikawa,T. (1983) Proc. Natl Acad. Sci. USA, 80, 3696–3700. Higo,J. (1988) Thesis, Kyushu University, Fukuoka, Japan. Horiuchi,T. and Go¯,N. (1991) Proteins, 10, 106–116. Ichiye,T. and Karplus,M. (1991) Proteins, 11, 205–217. Kitao,A., Hirata,F. and Go¯,N. (1991) Chem. Phys., 158, 447–472. Knapp,E.W., Fischer,S.F. and Parak,F. (1982) J. Phys. Chem., 86, 5042–5047.

380

Kono,H. and Doi,J. (1994) Proteins, 19, 244–255. Levitt,M., Sander,C. and Stern,P.S. (1985) J. Mol. Biol., 181, 423–447. Levy,R.M., Srinivasan,A.R., Olson,W.K. and McCammon,J.A. (1984) Biopolymers, 23, 1099–1112. Loncharich,R.J. and Brooks,B.R. (1990) J. Mol. Biol., 215, 439–455. Marquart,M., Walter,J., Deisenhofer,J., Bode,W. and Huber,R. (1983) Acta Crystallogr., Sect B, 39, 480–490. Noguti,T. and Go¯,N. (1985) Biopolymers, 24, 527–546. Oda,K., Miyagawa,H. and Kitamura,K. (1996) Mol. Simulat., 16, 167–177. Parak,F., Knapp,E.W. and Kucheida,D. (1982) J. Mol. Biol., 161, 177–194. Saito,M. (1992) Mol. Simulat., 8, 321–333. Smith,J., Kuczera,K. and Karplus,M. (1990) Proc. Natl Acad. Sci. USA, 87, 1601–1605. Steinbach,P.J. and Brooks,B.R. (1993) Proc. Natl Acad. Sci. USA, 90, 9135–9139. Steinbach,P.J. and Brooks,B.R. (1996) Proc. Natl Acad. Sci. USA, 93, 55–59. Tanimura,R., Kidera,A. and Nakamura,H. (1994) Protein Sci., 3, 2358–2365. Teeter,M.M. and Case,D.A. (1990) J. Phys. Chem., 94, 8091–8097. Va´squez,M. (1996) Curr. Opin. Struct. Biol., 6., 217–221. Weiner,S.J., Kollman,P.A., Case,D.A., Singh,U.C., Ghio,C., Alagona,G., Profeta,S.J. and Weiner,P. (1984) J. Am. Chem. Soc., 106, 765–784. Yoneda,S. and Umeyama,H. (1992) J. Chem. Phys., 97, 6730–6736. York,D.M., Darden,T.A. and Pedersen,L.G. (1993) J. Chem. Phys., 99, 8345–8348. York,D.M., Wlodawer,A., Pedersen,L.G. and Darden,T.A. (1994) Proc. Natl Acad. Sci. USA, 91, 8715–8718. Received June 27, 1996; revised December 2, 1996; accepted December 6, 1996