SS-mPMG and SS-GA: Tools for Finding Pathways ...

4 downloads 302 Views 814KB Size Report
packages are freely available from our website (http://kanaya. ... Tools for finding pathways and dynamic simulation of metabolic networks at Nara Institute of ...
Tetsuo Katsuragi1,9, Naoaki Ono1,9,*, Keiichi Yasumoto1, Md. Altaf-Ul-Amin1, Masami Y. Hirai2,3, Kansuporn Sriyudthsak2,3, Yuji Sawada2, Yui Yamashita4, Yukako Chiba4,5, Hitoshi Onouchi3,6, Toru Fujiwara3,7, Satoshi Naito4,6, Fumihide Shiraishi8 and Shigehiko Kanaya1 1

Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5, Takayama-cho, Ikoma-shi, Nara 630-0192 Japan 2 Riken Plant Science Center, 1-7-22, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045 Japan 3 JST, CREST, 4-1-8, Honcho, Kawaguchi, Saitama, 332-0012 Japan 4 Graduate School of Life Science, Hokkaido University, Kita-ku, Sapporo, 060-0810 Japan 5 Creative Research Institution, Hokkaido University, Kita-ku, Sapporo, 001-0021 Japan 6 Graduate School of Agriculture, Hokkaido University, Kita-ku, Sapporo, 060-8585 Japan 7 Faculty of Agriculture, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo, 113-8657 Japan 8 Graduate School of Bioresource and bioenvironmental Science, Kyushu University, 6-10-1 Hakozaki, Higashi-ku, Fukuoka, 812-8581 Japan 9 These authors contributed equally to this work. *Corresponding author: Fax: +81-743-72-5329; E-mail, [email protected] (Received December 3, 2012; Accepted March 29, 2013)

Metabolomics analysis tools can provide quantitative information on the concentration of metabolites in an organism. In this paper, we propose the minimum pathway model generator tool for simulating the dynamics of metabolite concentrations (SS-mPMG) and a tool for parameter estimation by genetic algorithm (SS-GA). SS-mPMG can extract a subsystem of the metabolic network from the genome-scale pathway maps to reduce the complexity of the simulation model and automatically construct a dynamic simulator to evaluate the experimentally observed behavior of metabolites. Using this tool, we show that stochastic simulation can reproduce experimentally observed dynamics of amino acid biosynthesis in Arabidopsis thaliana. In this simulation, SS-mPMG extracts the metabolic network subsystem from published databases. The parameters needed for the simulation are determined using a genetic algorithm to fit the simulation results to the experimental data. We expect that SS-mPMG and SS-GA will help researchers to create relevant metabolic networks and carry out simulations of metabolic reactions derived from metabolomics data. Keywords: Arabidopsis thaliana  Bioinformatics  Genetic algorithm  Metabolism  Stochastic simulation. Abbreviations: EC number, enzyme commission number for enzymes; GA, genetic algorithm; PCA, prinicipal component analysis; SD, standard deviation; SS-GA, stochastic simulation with a genetic algorithm; SS-mPMG, stochastic simulation with the minimum pathway model generator; TCA, tricarboxylic acid; UPLC-TQMS, ultraperformance liquid chromatography–tandem quadrupole mass spectrometry

Introduction Recent developments in experimental equipment, such as mass spectrometry, have made it possible to measure the concentrations of hundreds of chemical compounds in a single experiment (Sumner et al. 2003, Bedair et al. 2008, Sawada et al. 2009). Moreover, the development of highthroughput sampling technologies allows us to analyze the time series of those concentration profiles in detail (Buchholz et al. 2002, Sato et al. 2008). Furthermore, given that knowledge of metabolic reactions is available in metabolic pathway databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) (http://www.genome.jp/kegg/) (Kanehisa and Goto 2000) or MetaCyc (http://metacyc.org/) (Caspi et al. 2012), we can inform the model of most of the metabolic pathways in which those chemical compounds are involved. If some metabolomics data are generated experimentally, then reverse engineering the system underlying those data might be necessary for making further predictions and retrieving information about other relevant metabolites (Tomita et al. 1999, Edwards et al. 2000, Chassagnole et al. 2002, Feist et al. 2008, de Oliveira Dal’Molin et al. 2010). To achieve that goal, it is necessary to construct a model that can reproduce the experimental data. In fact, a number of studies have attempted to model metabolic systems in a bacterial cell using such data (Tomita et al. 1999, Ishii et al. 2004, Karr et al. 2012). However, even for a single cell, the number of parameters to be determined is huge, and such parameters cannot be directly determined from the experimental data. In the case of plants, it is even more difficult to model a whole cell, because the number

Plant Cell Physiol. 54(5): 728–739 (2013) doi:10.1093/pcp/pct052, available online at www.pcp.oxfordjournals.org ! The Author 2013. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: [email protected]

728

Plant Cell Physiol. 54(5): 728–739 (2013) doi:10.1093/pcp/pct052 ! The Author 2013.

Downloaded from http://pcp.oxfordjournals.org/ at Nara Institute of Science and Technology on May 15, 2013

Special Focus Issue – Regular Paper

SS-mPMG and SS-GA: Tools for Finding Pathways and Dynamic Simulation of Metabolic Networks

Tools for finding pathways and dynamic simulation of metabolic networks

Furthermore, the quantity of molecules is generally measured by relative levels in metabolomics with mass spectrometry, i.e. the time series behavior of metabolites can be measured at the metabolome scale. In this situation, we need to estimate these values to fit the experimental results for the behavior represented by the relative levels in the metabolome. To achieve this, we also implemented a simple method to estimate the model parameters based on a type of evolutionary algorithm, called a genetic algorithm (GA), within the metabolome simulator based on stochastic simulation, SS-GA. In the present study, we describe metabolome stochastic simulation software including pathway selection (SS-mPMG) and parameter estimation by GA (SS-GA) using the metabolome data of A. thaliana. We show that a simple optimization method based on an evolutionary algorithm can be applied to evaluate the parameters and that the model qualitatively reproduces the observed dynamics of intermediate metabolites. These results indicate that SS-mPMG provides a simple sequence of steps for generating a subset of the pathway network from the given genome-wide reaction list, setting the appropriate parameters for simulation, and simulating the obtained network. In addition, the GA is effective for setting the parameters needed for these simulations.

Downloaded from http://pcp.oxfordjournals.org/ at Nara Institute of Science and Technology on May 15, 2013

of metabolic reactions is far greater and the system is more complicated than in bacteria (Morgan et al. 2002, Schwender et al. 2004, Collakova et al. 2012). However, in most cases, it can be useful enough to model a part of metabolic pathways where reactions are dynamically changing. In this study, we propose a tool to extract a subsystem of metabolic networks automatically and simulate a time course of concentration of metabolites within that subsystem. Time processes of well-stirred chemical reaction systems have been simulated by solving a set of coupled ordinary differential equations. Although the deterministic formulation is sufficient in many cases, it fails to capture the inherent stochasticity in many biochemical systems formed by living cells in which the small population of a few critical metabolites can result in discrete and stochastic behavior (McAdams and Arkin 1997, Arkin et al. 1998). The dynamics of those systems can be simulated accurately using the machinery of Markov process theory involving the stochastic simulation algorithm of Gillespie (1976, 1977). Even when the numbers of metabolites are high, stochastic effects can significantly affect the dynamic behavior of certain biochemical networks (Kummer et al. 2005, Pahle et al. 2012). If it is easy to examine the dynamic behavior of metabolites based on stochastic simulations coupled with deterministic simulations, then we will be able to achieve deeper understanding of the systems in the metabolome. This requires the development of both stochastic simulation methods and deterministic simulations. Taking genome-wide or metabolome-wide simulation into consideration, we also need to develop software for selecting proper metabolic chemical equations in metabolic pathways. In the present study, we developed a stochastic simulation with a minimum pathway model generator (SS-mPMG), a program that automatically extracts a subsystem of metabolic pathways relevant to certain experiments from all the pathways of a given organism. To extract a subsystem of metabolic pathways, we assume that there are some compounds which can be regarded as start and end points of the pathways in which a user is interested. Given some metabolites as ‘start’ and ‘end’ compounds, SSmPMG can build a metabolic network by connecting the shortest pathways from the start to end compounds from the given genome-wide list of metabolic reactions. We regard the subnetwork consisting of these shortest pathways as a ‘target’ network to be evaluated, and then carry out a stochastic simulation to predict the metabolism. Because it is difficult to model the reactions of all these metabolites, we statistically selected metabolites that are involved in the observed dynamic behavior and extracted the subnetwork related to the selected components. If we know the estimated values of reaction rate constants for each reaction in the subnetwork, this system can be used to simulate the dynamics of the metabolic reactions and evaluate the model compared with experimental data. However, in general, even with this reduction in the size of the metabolic network, the number of combinations of optimal sets of parameters to fit the simulation result to the experiments is still huge.

Results and Discussion Implementation of stochastic simulation with the minimum pathway model generator (SS-mPMG) and with a genetic algorithm (SS-GA) In the present study, we describe two software packages: SS-mPMG for simulating the behavior of the metabolome using user-defined parameters based on the selection of reactions by minimum pathways, and SS-GA for fitting the parameters to experimental data and simulating the behavior of the metabolome using these parameters. Both of these software packages are freely available from our website (http://kanaya. naist.jp/mPMG). An overview of these software packages is shown in Fig. 1, and detailed instructions for using the software are given in the Supplementary data as well as on our website. This software is also utilized in other omics represented by reaction equations. As shown in Fig. 1, the software consists of three parts: (i) reaction selection by the shortest pathway; (ii) stochastic simulation; and (iii) parameter optimization by GA. We briefly summarize these three parts.

Reaction selection by the shortest pathway SS-mPMG is a tool for finding metabolic pathways related to a targeted set of metabolites and for constructing a model for simulating their behavior. As most pathway databases are huge, it is difficult to simulate all metabolites together. SS-mPMG can extract a subset of the metabolic network from the pathway database, using Dijkstra’s algorithm (Dijkstra 1959) to extract the shortest pathway from the start to end compounds so as to create a map of the network of related metabolites. For

Plant Cell Physiol. 54(5): 728–739 (2013) doi:10.1093/pcp/pct052 ! The Author 2013.

729

T. Katsuragi et al.

SS-GA Software

SS-mPMG Software

Metabolic pathways in a whole cell

Time series obtained from a mass spectrometry

(3) Parameter optimization by GA

Developing and selecting new parameters based on fitness to experimental data

A sub-network

(2) Stochastic Simulation

Simulation of user defined parameters

Simulation based on optimized parameters

Fig. 1 Metabolic network modeling approach using SS-mPMG. To reduce the complexity of the model, a subset of the metabolic network is extracted from the entire network based on the given list of start and end compounds. Kinetic parameters are optimized to reproduce the observed dynamics of metabolite concentrations.

example, when time series data of metabolites are available, the metabolites that decreased or increased the most can be regarded as start and end compounds, respectively. In SS-mPMG, the start and end metabolites are stored in two files, ‘START*.txt’ and ‘END*.txt’, and the metabolic equation file ‘REQ*.txt’ makes it possible to select the metabolic reactions having the shortest pathway based on all combinations of start and end metabolites (see Supplementary data for details).

730

Stochastic simulation based on the Gillespie algorithm The dynamics in the metabolic network can be described by stochastic simulation that emulates fluctuations in metabolic systems. SS-mPMG can predict metabolic dynamics by carrying out numerical simulation on a target metabolic network provided with initial amounts of metabolites and reaction rate constants of the reactions. The Gillespie algorithm

Plant Cell Physiol. 54(5): 728–739 (2013) doi:10.1093/pcp/pct052 ! The Author 2013.

Downloaded from http://pcp.oxfordjournals.org/ at Nara Institute of Science and Technology on May 15, 2013

(1) Reaction selection by shortest pathway

Tools for finding pathways and dynamic simulation of metabolic networks

(Gillespie 1976, Gillespie 1977) implemented in SS-mPMG consists of five steps (Fig. 2A). The simulation system is defined by Step 1. Let us consider a metabolic pathway consisting of M chemical reactions and N metabolites. The j-th reaction is represented by Equation (1): ðPreÞ Xi + . . . +nðPreÞ Rj : nðPreÞ j1 X1 + . . . +nji jN XN !

X1 + . . . + nðPostÞ Xi + . . . +nðPostÞ XN : nðPostÞ j1 ji jN

ð1Þ

hj ¼ cj ½Xi    hj ¼ cj Xi ð½Xi   1Þ=2   hj ¼ cj ½Xi  Xi0

for Rj : Xi ! Xi0

ð2aÞ

for Rj : 2Xi ! Xi0

ð2bÞ

ð2cÞ for Rj : Xi +Xi0 ! Xi00 PM The sum of the hazard functions (H ¼ j¼1 hj ) is then used to calculate the time step t, by Equation (3), and the probability of occurrence of the j-th reaction is obtained by Equation (4): t ¼ ð1=HÞlogð Þ

ð3Þ

pj ¼ hj =H

ð4Þ

Here, t is a random variable drawn from a uniform distribution on the interval [0, 1]. In Step 3, a reaction j is randomly selected from the probability distribution P(X = j) = pj. In Step 4, the molecular concentrations are revised according to the selected reaction equation as represented in Equation (5): ½Xi 0

½Xi   nðPreÞ +nðPostÞ ji ji

ð5Þ

If the [Xi]0 satisfies Ximin  [Xi]0  Ximax for all i, then the metabolite concentrations are updated and t is advanced to t + t, and the algorithm returns to Step 2; otherwise, the reaction is rejected and Step 3 is repeated to select another reaction. These steps are repeated until the event time becomes larger than the maximum time T. The advantage of Gillespie’s algorithm is that the simulation can be carried out if a system is represented by a set of reaction equations, including metabolite regulation by proteins and gene expression. For example, autoregulation (as shown in Fig. 2B) and feedback inhibition (Fig. 2C), which can be represented by a set of reaction equations, are also dynamically simulated by the present software.

To simulate the dynamics of metabolic pathways, the initial concentration of every compound and the rate constant for every metabolic reaction have to be estimated. However, in general, measuring the exact values of all those parameters experimentally is difficult. Therefore, numerical optimization methods to find the best values reproducing the observed behavior of metabolite concentrations are required. In this study, as an example, we implemented a method based on an evolutionary search algorithm because it can be applied to a wide range of problems. A GA is used for the combinatorial optimization problem, mimicking evolution in nature, as first proposed by Holland (1975). A point in the search space for the target problem is represented by a character string called a chromosome. An individual has a chromosome and a fitness value with a given evaluation function. First, the GA randomly generates a set of individuals called the population. The population is evolved by repeating evaluation, selection, crossover and mutation. Consequently, a semi-optimal solution is obtained. GAs have been applied to chemical and biological sciences (Yen et al. 1995, Morbiducci et al. 2005). Sheridan and Kearskey (1995) demonstrated that a GA can quickly select a semi-optimal subset of amines for constructing a tripeptoid library. Jones et al. (1995) used a GA to explore the full conformational flexibility of the ligand with partial flexibility of the protein. Because in our model the values to be optimized, namely the initial concentrations and reaction coefficients, are continuous, the genotype is an array of real values, and mutation is implemented as random continuous change of one or several values. These types of evolutionary algorithms are called real-valued GAs and have been applied to various optimization problems (Corcoran and Sen 1994, Huang et al. 1997, Wu et al. 2007) In general, when using mass spectrometry, it is difficult to measure the absolute metabolite concentration because the extraction and inonization efficiency may depend on the chemical properties of each metabolite. Therefore, even though the metabolite concentrations are computed explicitly in our model, the values derived cannot be directly compared with experimental results. To avoid this problem, in this study, we focused on relative changes of rate constants rather than on their absolute magnitude. In other words, despite some residual arbitrariness in the scale of metabolite concentration, the relative quantity such as the ratio of products to substrates can be properly reproduced in the dynamic model. The software developed in the present study (SS-GA) optimizes a set of the variables for the initial amounts of each metabolite and the rate constants of each reaction so as to reproduce the dynamics of metabolite concentration observed using mass spectrometers. Detailed instructions for SS-GA are given in the Supplementary data, which are also available on our website. As shown in Fig. 3, an individual’s chromosome is represented by an array of real values, namely the initial metabolite concentrations and the reaction rate constants

Plant Cell Physiol. 54(5): 728–739 (2013) doi:10.1093/pcp/pct052 ! The Author 2013.

Downloaded from http://pcp.oxfordjournals.org/ at Nara Institute of Science and Technology on May 15, 2013

and nðPostÞ are Here, Xi represents the i-th metabolite and nðPreÞ ji ji stoichiometric coefficients for substrates and products. When the j-th reaction occurs, the concentration of the i-th metab nðPreÞ . The following parameters are olite changes by nðPostÞ ji ji set by the user: the maximum time for the stochastic simulation T, the initial quantities of the metabolites XiðinitÞ , and metabolite bounds Ximin , Ximax , such that Xi 2[Ximin , Ximax ] for the i-th metabolite (i = 1, 2, . . . , N), and the reaction rate cj for Rj. In Step 2, all metabolite concentrations are fed into the variable [Xi]. In Step 3, the hazard function for the j-th reaction is calculated by Equation (2):

Parameter optimization process

731

T. Katsuragi et al.

A Step 1: System Definition (1) Max time, T (2) Intervals for individual metabolites, Ximin, Ximax (3) Initial quantities of metabolites Xi’ Xi(init) (i = 1, 2,…, N) (4) Reaction equations Rj: m(Pre)j1 X 1 + ...+ m(Pre)jN XN = m (Post) j1 X 1 +...+ m (Post) jN XN (5) Reaction rate: cj for Rj

(j=1,2,…, K)

Step 2: [Xi] X Step 3: Determining time scale and probabilities of occurrence of reactions hj= current status of j-th reaction based on cj (j=1,2,.., K) and [Xi] (i=1,2,..,N) N =1

t = -(1/H) log (T) Reaction probability pj = hi/H Step 4: Random selection of reaction equation Here a selected reaction is represented by index s. Step 5: Quantities for individual objects are revised based on selected reaction equation X’i [Xi] – m (Pre)si + m(Post)si

Ximix≤ X’i ≤ Ximax

Step 6: t

t+

T1.96 SD of the changes of all the metabolites (i.e. in the 5% rejection range), and selected them as the start or end compounds (Table 1). The metabolites chosen as start compounds were mannitol and raffinose; those chosen as end compounds were L-proline, saccharopine, phosphorylcholine, CDP-choline, Lthreonine, shikimate, 50 -methylthioadenosine, L-homocysteine and S-adenosyl-L-methionine.

To estimate the values of the 171 parameters, we optimized them to fit the simulation results to the experimental data generated using the GA. Twenty-six of the 268 detected metabolites were involved in the reactions of the selected subnetwork. Because of the difficulty in measuring the absolute concentrations of each metabolite quantitatively, we focused on relative changes from the time point 0 h, which gave us nine data points to evaluate. In total, we had 225 data points for the 26 metabolites at nine sample points, excluding nine missing values in the experimental data. To compare these data points with the observed data, we ran a simulation using a given set of the parameters, and computed the change in the amount of each metabolite compared with the initial amounts at 0 h. We then evaluated the residual error between these simulated changes and observed changes to compute the fitness of the model [Equations (7–9)]. The detailed algorithm is described in the section ‘Parameter optimization process’. Fig. 6 shows the histogram of the final fitness values of 50 independent optimization runs after 510 generations, i.e. 10 generations of the preliminary optimization and 500 generations of secondary optimization. The highest correlation between observed and simulated changes in metabolite amounts was r = 0.956. A detailed comparison of the results of the stochastic simulation based on the parameters with the highest fitness with the observed data is shown in Supplementary Fig. S1. However, as the results of the optimization were not always identical because of the stochastic evaluation of the model, we use the top 10 results (the threshold is indicated by the dashed line in Fig. 6) to evaluate reproducibility.

Metabolism in the experiment explained by the predictive simulation To understand the changes in the dynamics between the former (0–24 h) and the latter (24–84 h) phases of the metabolism, we defined the changes of reaction rate constants relative to their values in the former phase as follows: vg ¼ ðvg1 , . . . , vg171 Þ conc for for lat lat ¼ ðvconc g1 , . . . , vgN , vg1 , . . . , vgM , vg1 . . . , vgM Þ for wgj ¼ log10 ðvlat g j =vg j Þ,

ð11Þ ð12Þ

for lat where vconc g j , vg j , vg j represent the parameter values of initial concentration and rate constants in the former and latter phases, respectively, obtained from the g-th optimization run. The ratio wgj represents the change in the rate constant. We compared the ratio wgj in the results of the top 10 optimizations by using a t-test and found that it significantly changed in eight of the reactions (Table 2). Fig. 7 shows the

Plant Cell Physiol. 54(5): 728–739 (2013) doi:10.1093/pcp/pct052 ! The Author 2013.

735

T. Katsuragi et al.

mannitol

raffinose

D-mannose

sucrose

D-mannose-6-phosphate D-fructose

D-fructose-6-phosphate D-xylulose-5-phosphate fructose-1,6-bisphosphate

D-sedoheptulose-7-phosphate

dihydroxyacetone phosphate

3-phosphoglycerate 3-deoxy-D-arabino-heptulosonate-7-phosphate 3-phospho-hydroxypyruvate

3-dehydroquinate

2-phosphoglycerate

L-glutamate cystathionine

3-dehydro-shikimate phosphoenolpyruvate

alpha-ketoglutarate

cis-aconitate

3-phospho-serine

pyruvate

L-serine

citrate

shikimate isocitrate

acetyl-CoA alpha-ketoglutarate

L-lysine

oxaloacetate

L-homocysteine L-glutamate

succinyl-CoA

malate

saccharopine

L-methionine fumarate

alpha-ketoglutarate

succinate alpha-aminoadipate

L-aspartate S-adenosyl-L-methionine

L-glutamate

L-aspartyl-4-phosphate N-dimethylethanolamine phosphate

N-acetyl-L-ornithine L-aspartate-semialdehyde L-ornithine

homoserine phosphoryl-choline

O-phospho-L-homoserine

L-glutamate gamma-semialdehyde

putrescine

S-adenosyl-L-methioninamine CDP-choline

5'-methylthioadenosine

L-threonine

pyrroline 5-carboxylate L-proline

Fig. 5 The extracted metabolic network. The start and end compounds selected based on the results of the experiment are colored blue and red, respectively. The reactions whose rate constants changed significantly from the former phase (0–24 h) to the latter phase (24–84 h) are colored red (increased) and blue (decreased).

distribution of wgj for these eight reactions. To explain the observed changes in the metabolite concentrations, it was required to change the rate constants of these parameters between the former and the latter phases. These reactions are also illustrated in Fig. 5 by red/blue arrows. It is worth noting that the changes in reactions consuming mannitol

736

and raffinose are consistent with the fact that mannitol was mostly consumed in the former phase and raffinose was mostly consumed in the latter phase. This result suggests that the consumption of resource materials triggered some feedback on the activity on these reactions. In contrast, the rate constants of the remaining reactions did not

Plant Cell Physiol. 54(5): 728–739 (2013) doi:10.1093/pcp/pct052 ! The Author 2013.

Downloaded from http://pcp.oxfordjournals.org/ at Nara Institute of Science and Technology on May 15, 2013

D-erythrose-4-phosphate

D-glyceraldehyde-3-phosphate

Tools for finding pathways and dynamic simulation of metabolic networks

change significantly. This result implies that when metabolic networks adapt to environmental change, they do not change their activity in the majority of pathways, but only in very specific pathways. This is consistent with the 14

Frequency

12

experimental results that under various types of genetic and environmental perturbations, protein and metabolite levels were very stable, reflecting the robustness of metabolic networks (Ishii et al. 2007). Segre` et al. (2002) also proposed a model to predict changes in fluxes of perturbed metabolic networks by introducing the method of minimization of metabolic adjustment. These studies showed that when

10 8

Table 2 Reactions whose rate constants changed significantly

6 4 Initial fitness

0 0

0.1

0.2

0.3 0.4 Fitness

0.5

0.6

0.7

Fig. 6 Distribution of fitness values after 50 independent optimization runs. In all 50 runs, fitness increased significantly from the initial value. This result indicates that the model reproduced the changes in metabolite concentrations very well. The top 10 results (higher than the threshold shown by the dashed line) were used to analyze the changes in reaction rate constants.

6 4 2

Shikimate dehydrogenase

5.4  102

6.0  104

3

4

2.4.1.–

2.1  10

1.1.1.37

3.0

1.5  102

Malate-oxaloacetate shuttle II

2.7.7.15

8.8  101

2.8  102

Choline-phosphate cytidylyltransferase

2.5.1.6

6.3  101

3.6  102

Methionine adenosyltransferase

2

4.1  102

Sucrose degradation III

3.2.1.26

2.6  10

8.2  10

6 4

−2

0 w gj

2

10

4

6

6 4

−2

0 w gj

2

4

6

6 4 2

0 −4

−2

0 w gj

2

10

4

6

6 4 2

−4

−2

0 w gj

2

10

0 w gj

2

10

4

6

EC: 2.7.7.15 p=2.8 × 10−2

8 6 4

4

6

−6

−4

−2

0 w gj

2

4

6

EC: 3.2.1.26 p=4.1 × 10−2

8 Frequency

8

−2

0 −6

EC: 2.5.1.6 p=3.6 × 10−2

−4

2

0 −6

−6

EC: 1.1.1.37 p=1.5 × 10−2

8

2

4

0 −4

10 Frequency

8

6

2 −6

EC: 2.4.1.− p=8.2 × 10−4

EC: 1.1.1.25 p=6 × 10−4

8

Frequency

−4

Ajugose biosynthesis II

10

EC: 1.1.1.255 p=1.3 × 10−4

0 −6

Frequency

Mannitol dehydrogenase

1.1.1.25

2

0

Frequency

Threonine synthase

1.3  104

5.3  10

8 Frequency

Frequency

8

1.3  105

2

1.1.1.255

10

EC: 4.2.3.1 p=1.3 × 10−5

1.1  102

4.2.3.1

Frequency

10

Function

Mean ratio

Downloaded from http://pcp.oxfordjournals.org/ at Nara Institute of Science and Technology on May 15, 2013

2

P-value

EC number

6 4 2

0

0 −6

−4

−2

0 w gj

2

4

6

−6

−4

−2

0 w gj

2

4

6

Fig. 7 Distributions of the changes in reaction rate constants. Among the optimization results of the top 10 fitness values, the rate constants of these eight reactions changed between the former and the latter phases in the same direction, almost always.

Plant Cell Physiol. 54(5): 728–739 (2013) doi:10.1093/pcp/pct052 ! The Author 2013.

737

T. Katsuragi et al.

Materials and Methods Plant materials Arabidopsis liquid callus culture derived from accession Col-0 was prepared as described in Murota et al. (2011) with some modifications. For callus induction, minced seedlings in RM28 medium were incubated under constant light. The medium was changed once every 6 d. After the third medium transfer, the callus was cultivated with 10 mM L-lysine plus 1 mM L-threonine (treated) or without treatment (control). The callus was collected prior to L-lysine and L-threonine treatment (0 h), and at 2, 6, 12, 24, 36, 48, 60, 72, 84 and 96 h after treatment, frozen in liquid nitrogen and stored at 80 C until use.

Metabolome measurement by mass spectrometry Three sets of each sample were extracted and analyzed using UPLC–TQMS (Sawada et al. 2009). A 2 mg aliquot of fresh plant tissues was extracted using a bead shocker in buffer (500 ml, 80% MeOH) for 5 min. The extracted liquid (250 ml) was dispersed and dried up in a 96-well plate, and then redissolved in 120 ml of H2O. Finally, 1 ml of sample was subjected UPLC-TQMS. Three analytical replication were conducted.

Metabolic reaction data As the reactions connecting these metabolites are not selfevidently identified, AraCyc was used to fill the connections between them. However, as most metabolites are known by multiple names, the names used in the experimental data and those in AraCyc do not necessarily correspond. To solve this problem, we used the identifier (C_ID) in the KNApSAcK Core database (http://kanaya.naist.jp/knapsack_jsp/top.html), the CAS number assigned by the Chemical Abstract Service (a division of the American Chemical Society) and the synonyms in the MetaCyc database.

738

Supplementary data Supplementary data are available at PCP online.

Funding The National Bioscience Database Center in Japan; the Ministry of Education, Culture, Sport, Science and Technology of Japan [Grant-in-Aid for Scientific Research on Innovation Areas, ‘Biosynthetic Machinery. Deciphering and Regulation the System for Creating Structural Diversity of Bioactive metabolites (2007)’]; The Japan Science and Technology agency [CREST, Strategic International Collaborative Research Program, SICORP for JP-US Metabolomics].

References Arkin, A., Ross, J. and McAdams, H.H. (1998) Stochastic kinetic analysis of developmental pathway bifurcation in phase -infected Escherichia coli cells. Genetics 149: 1633–1648. Bedair, M. and Sumner, L.W. (2008) Current and emerging massspectrometry technologies for metabolomics. TrAC Trends Anal. Chem. 27: 238–250. Buchanan, B.B., Gruissem, W. and Jones, R.L. (2000) In Biochemistry and molecular biology of plants. American Society of Plant Physiologists, Rockville. Buchholz, A., Hurlebaus, J., Wandrey, C. and Takors, R. (2002) Metabolomics: quantification of intracellular metabolite dynamics. Biomol. Eng. 19: 5–15. Caspi, R., Altman, T., Dreher, K., Fulcher, C.A., Subhraveti, P., Keseler, I.M. et al. (2012) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 40: D742–D753. Chassagnole, C., Noisommit-Rizzi, N., Schmid, J.W., Mauch, K. and Reuss, M. (2002) Dynamic modeling of the central carbon metabolism of Escherichia coli. Biotechnol. Bioeng. 79: 53–73. Collakova, E., Yen, J.Y. and Senger, R.S. (2012) Are we ready for genome-scale modeling in plants? Plant Sci. 191–192: 53–70. Corcoran, A.L. and Sen, S. (1994) Using real-valued genetic algorithms to evolve rule sets for classification. In Proceedings of the First IEEE Conference on Evolutionary Computation. IEEE World Congress on Computational Intelligence. pp. 120–124. IEEE. De Oliveira Dal’Molin, C.G., Quek, L.-E., Palfreyman, R.W., Brumbley, S.M. and Nielsen, L.K. (2010) AraGEM, a genome-scale reconstruction of the primary metabolic network in Arabidopsis. Plant Physiol. 152: 579–589. Dijkstra, E.W. (1959) A note on two problems in connexion with graphs. Numer. Math. 1: 269–271. Edwards, J.S. and Palsson, B.O. (2000) The Escherichia coli MG1655 in silico metabolic genotype: its definition, characteristics, and capabilities. Proc. Natl Acad. Sci. USA 97: 5528–5533. Feist, A.M. and Palsson, B.Ø. (2008) The growing scope of applications of genome-scale metabolic reconstructions using Escherichia coli. Nat. Biotechnol. 26: 659–667. Gillespie, D.T. (1976) A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys. 22: 403–434.

Plant Cell Physiol. 54(5): 728–739 (2013) doi:10.1093/pcp/pct052 ! The Author 2013.

Downloaded from http://pcp.oxfordjournals.org/ at Nara Institute of Science and Technology on May 15, 2013

cells adapt to environmental change, they tend to prefer the smallest change needed to sustain their metabolic status. Our results also suggest that parameter optimization based on a GA, which searches locally for an optimum starting from the given condition, will be predictive and effective for the analysis of metabolic reaction networks. In general, it is difficult to model a large metabolic network quantitatively, even with massive amounts of metabolomics data obtained from mass spectrometry; therefore, it would be useful to extract a subnetwork in which interesting metabolites are involved in order to reduce the number of parameters to be estimated. In this study, we showed that parameter optimization based on a simple GA can reproduce the observed behavior of metabolite amounts. Along this line, more sophisticated evolutionary algorithms or other optimization methods could provide convenient tools for evaluating the rate constants of metabolic reactions and for understanding the detailed dynamics that are not directly visible in the behavior observed in the experimental results.

Tools for finding pathways and dynamic simulation of metabolic networks

ACE, a new in vitro translation system derived from Arabidopsis callus cultures. Plant Cell Physiol. 52: 1443–1453. Pahle, J., Challenger, J.D., Mendes, P. and McKane, A.J. (2012) Biochemical fluctuations, optimisation and the linear noise approximation. BMC Syst. Biol. 6: 86. Sato, S., Arita, M., Soga, T., Nishioka, T. and Tomita, M. (2008) Timeresolved metabolomics reveals metabolic modulation in rice foliage. BMC Syst. Biol. 2: 51. Sawada, Y., Akiyama, K., Sakata, A., Kuwahara, A., Otsuki, H., Sakurai, T. et al. (2009) Widely targeted metabolomics based on large-scale MS/MS data for elucidating metabolite accumulation patterns in plants. Plant Cell Physiol. 50: 37–47. Schwender, J., Ohlrogge, J. and Shachar-Hill, Y. (2004) Understanding flux in plant metabolic networks. Curr. Opin. Plant Biol. 7: 309–317. Segre`, D., Vitkup, D. and Church, G.M. (2002) Analysis of optimality in natural and perturbed metabolic networks. Proc. Natl Acad. Sci. USA 99: 15112–15117. Sheridan, R.P. and Kearsley, S.K. (1995) Using a genetic algorithm to suggest combinatorial libraries. J. Chem. Inform. Model. 35: 310–320. Sumner, L.W., Mendes, P. and Dixon, R.A. (2003) Plant metabolomics: large-scale phytochemistry in the functional genomics era. Phytochemistry 62: 817–836. Tomita, M., Hashimoto, K., Takahashi, K., Shimizu, T.S., Matsuzaki, Y., Miyoshi, F. et al. (1999) E-CELL: software environment for whole-cell simulation. Bioinformatics 15: 72–84. Wu, C.H., Tzeng, G.H., Goo, Y.J. and Fang, W.C. (2007) A real-valued genetic algorithm to optimize the parameters of support vector machine for predicting bankruptcy. Expert Syst. Appl. 32: 397–408. Yen, J., Randolph, D., Liao, J.C. and Lee, B. (1995) A hybrid approach to modeling metabolic systems using genetic algorithm and simplex method. In Proceedings of the 11th Conference on Artificial Intelligence for Applications. pp. 277–283. IEEE Conference Publications.

Plant Cell Physiol. 54(5): 728–739 (2013) doi:10.1093/pcp/pct052 ! The Author 2013.

Downloaded from http://pcp.oxfordjournals.org/ at Nara Institute of Science and Technology on May 15, 2013

Gillespie, D. (1977) Exact stochastic simulation of coupled chemical reactions. J. Phys. Chem. 81: 2340–2361. Holland, J.H. (1975) Adaptation in Natural and Artificial Systems. MIT Press, Cambridge, MA. Huang, Y.P. and Huang, C.H. (1997) Real-valued genetic algorithms for fuzzy grey prediction system. Fuzzy Sets Syst. 87: 265–276. Ishii, N., Nakahigashi, K., Baba, T., Robert, M., Soga, T., Kanai, A. et al. (2007) Multiple high-throughput analyses monitor the response of E. coli to perturbations. Science 316: 593–597. Ishii, N., Robert, M., Nakayama, Y., Kanai, A. and Tomita, M. (2004) Toward large-scale modeling of the microbial cell for computer simulation. J. Biotechnol. 113: 281–294. Jones, G., Willett, P. and Glen, R.C. (1995) Molecular recognition of receptor sites using a genetic algorithm with a description of desolvation. J. Mol. Biol. 245: 43–53. Kanehisa, M. and Goto, S. (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28: 27–30. Karr, J.R., Sanghvi, J.C., Macklin, D.N., Gutschow, M.V., Jacobs, J.M., Bolival, B. Jr et al. (2012) A whole-cell computational model predicts phenotype from genotype. Cell 150: 389–401. Kummer, U., Krajnc, B., Pahle, J., Green, A.K., Dixon, C.J. and Marhl, M. (2005) Transition from stochastic to deterministic behavior in calcium oscillations. Biophys J. 89: 1603–11. McAdams, H.H. and Arkin, A. (1997) Stochastic mechanisms in gene expression. Proc. Natl. Acad. Sci. USA 94: 814–819. Morbiducci, U., Tura, A. and Grigioni, M. (2005) Genetic algorithms for parameter estimation in mathematical modeling of glucose metabolism. Comput. Biol. Med. 35: 862–874. Morgan, J.A. and Rhodes, D. (2002) Mathematical modeling of plant metabolic pathways. Metab. Eng. 4: 80–89. Murota, K., Hagiwara-Komoda, Y., Komoda, K., Onouchi, H., Ishikawa, M. and Naito, S. (2011) Arabidopsis cell-free extract,

739