Experimental Flux Measurements on a Network Scale

Metabolic flux is a fundamental property of living organisms. In recent years, methods for measuring metabolic flux in plants on a network scale have evolved further. One major challenge in studying flux in plants is the complexity of the plant's metabolism. In particular, because parallel pathways are present in multiple cellular compartments, the core of plant central metabolism constitutes a complex network. Hence, a common problem with the reliability of contemporary results of 13C-Metabolic Flux Analysis in plants is the substantial reduction in complexity that must be imposed on the simulated networks; this simplification is partly due to limitations in computational simulations. Here, I discuss recently emerging strategies that will better address these shortcomings.


INTRODUCTION
Isotopic tracers have diverse and important uses in metabolic research. Among the various approaches to stoichiometric modeling of cell metabolism (Llaneras and Pico, 2008), 13C-Metabolic Flux Analysis (13C-MFA) is a method that combines knowledge of cell metabolism with 13C-tracer experiments to analyze the in vivo flux distribution in the network of central primary metabolism. It affords a quantitative, integrated view of core metabolism (Koschutzki et al., 2010) that unravels the in vivo function of biochemical pathways under different physiological conditions, or reveals the effect of manipulation by transgenic approaches. In plants, 13C-MFA is mostly applied to cell or tissue cultures growing heterotrophically or photoheterotrophically on 13C-labeled substrates. The increasing number of studies on different species over the last 10 to 15 years documents the development of 13C-MFA research in plants. Maize root tips, detached from germinating seeds, were first used as a model to study energy metabolism in non-photosynthetic tissues (Dieuaide-Noubhani et al., 1995; Alonso et al., 2005, 2007b). Other studies used hairy root cultures of Catharanthus roseus, the Madagascar periwinkle (Sriram et al., 2007), and cell-suspension cultures of tomato or Arabidopsis thaliana (Rontein et al., 2002; Williams et al., 2008; Masakapalli et al., 2010). Various studies focused on the distribution of flux in central metabolism in the developing seeds and embryos of rapeseed and Arabidopsis (Schwender and Ohlrogge, 2002; Schwender et al., 2003, 2004a, 2006; Junker et al., 2007; Lonien and Schwender, 2009), soybean (Sriram et al., 2004; Iyer et al., 2008; Allen et al., 2009b), sunflower (Alonso et al., 2007a), and in developing maize endosperm or embryos (Ettenhuber et al., 2005; Spielbauer et al., 2006; Alonso et al., 2010, 2011).
Several studies have also begun to assess the effect of physiological or genotypic perturbations of central metabolism (Alonso et al., 2007b; Junker et al., 2007; Iyer et al., 2008; Williams et al., 2008; Lonien and Schwender, 2009). Recent studies have begun to explore the synergy between plant 13C-MFA and the more predictive modeling approach of flux balance analysis (FBA; Williams et al., 2010; Hay and Schwender, 2011a,b).
A great deal of biological knowledge about an organism is needed to construct a model of its biochemical network. Even in the post-genomic age, the definition of metabolic networks is not straightforward. Yet the results of the analytic process critically depend upon having a realistic network (van Winden et al., 2001a; Schwender et al., 2004b; Masakapalli et al., 2010). Owing to the practical limits of computational analysis, the typical scale of a 13C-MFA network (Table 1) results from tailoring the detailed topology inferred from the literature to a smaller size, e.g., by lumping the metabolite pools present in different subcellular compartments.
This paper offers some insights into the experimental and computational modeling practices of 13C-MFA to highlight the typical assumptions built into such models, and to discuss how their construction and general reliability can be improved. I discuss modeling as it relates to applying 13CFLUX (www.13cflux.net), software used by many groups in the field. The paper is not intended to be a comprehensive review of all recent work, but rather to give my personal perspective based on the practice of experimental and computational modeling of plant central metabolism.

PRINCIPLE OF STEADY-STATE 13C-MFA
Several recent detailed reviews summarize experimental procedures and the modeling process, and discuss important biological insights that have resulted from plant 13C-MFA studies, e.g., Shachar-Hill (2008), Schwender (2008), Kruger and Ratcliffe (2009), Allen et al. (2009a). Figure 1 illustrates a general experimental workflow. Zamboni et al. (2009) gave a very detailed and useful description of 13C-MFA, including a tutorial for 13CFLUX. In short, an organism is grown on a minimal culture medium with a well-defined composition of organic and inorganic substrates. While a 13C-labeled carbon source (e.g., [1-13C]glucose) is being metabolized, 12C- and 13C-atoms are distributed throughout the organism's metabolic network. The fate of a 13C-labeled carbon position of the carbon source, or of pairs of adjacent 13C-atoms (13C-13C bond label), is traced through the network by detecting the labeling signatures of the intermediary metabolites by mass spectrometry (MS; Dauner and Sauer, 2000; Schwender and Ohlrogge, 2002) or nuclear magnetic resonance (NMR; Dieuaide-Noubhani et al., 1995; Szyperski, 1995). For the widely used approach of metabolically and isotopically stationary 13C-MFA, the essential prerequisite is that the labeling state of each metabolite attains a steady state before the cells are harvested, the metabolites extracted, and the labeling signatures analyzed (Wiechert, 2001). Information is thereby gained about intracellular fluxes through alternative pathways converging on the same metabolite (Szyperski, 1995; van Winden et al., 2001b), because different labeling signatures are generated and mixed in one metabolite at the convergent node.

Table 1 footnotes (sources): 1 Zamboni et al. (2009); 2 Schwender et al. (2006); 3 Lonien and Schwender (2009); 4 Hay and Schwender (2011a,b); 5 Poolman et al. (2009).
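The mixing of labeling signatures at a convergent node can be illustrated with a minimal sketch. It assumes only two converging pathways and a single measured carbon position; the function name and the enrichment values are hypothetical and chosen for illustration, not taken from any study cited here.

```python
def flux_fraction(x_mixed, x_path_a, x_path_b):
    """Fraction of a product pool formed via pathway A.

    Solves the mixing balance
        x_mixed = f * x_path_a + (1 - f) * x_path_b
    for f, where each x is the fractional 13C enrichment of the same
    carbon position (illustrative two-pathway assumption).
    """
    if x_path_a == x_path_b:
        # Both pathways deliver the same label: the ratio cannot be resolved
        # with this tracer, whatever the true fluxes are.
        raise ValueError("pathways give identical labeling; ratio not identifiable")
    return (x_mixed - x_path_b) / (x_path_a - x_path_b)


# Hypothetical enrichments of one carbon position (illustrative values only).
f_a = flux_fraction(x_mixed=0.30, x_path_a=0.50, x_path_b=0.10)
print(f"fraction via pathway A: {f_a:.2f}")
```

The error branch reflects why the choice of labeled substrate matters: if both routes yield the same signature, no measurement at the node can separate them.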
For example, oxaloacetate (OxA) may be labeled differentially depending on whether it is formed by the carboxylation of phosphoenolpyruvate, or from α-ketoglutarate via the reactions of the tricarboxylic acid (TCA) cycle (Figure 2B). Whether we can evaluate the flux ratio at the OxA node depends on the particular 13C-substrate label used in the culture. Other nodes of this kind are pyruvate, α-ketoglutarate, and 3-phosphoglyceric acid. Often the labeling pattern in these intermediates is not measured directly, but accessed indirectly through their anabolic products (Szyperski, 1995, 1998). Asp, for example, accumulates in protein and represents the labeling signature in OxA (Figure 2A). For studying plant flux, the analysis of protein-bound amino acids by NMR or by gas chromatography/MS (GC/MS) has emerged as standard practice (Allen and Ratcliffe, 2009).

Figure 2 caption (partial): (A) [...] (Hay and Schwender, 2011a). Arrows depict individual reactions that are formulated in bna572 by a complete reaction stoichiometry. To make the topology understandable, in most cases only one substrate-to-product transition is shown for each reaction. Thick arrows indicate that two pools are inter-converted by multiple reactions. Sets of metabolite pools that are lumped in the 13C-MFA model are highlighted in gray. (B) Representation of the network for a related 13C-MFA model (Schwender et al., 2006) showing the carbon backbones of the metabolites. Biochemical reactions are shown as carbon transitions with connecting arrows (double-headed arrows for reversible reactions). For succinate (Succ), the symmetric randomization of carbon atoms is indicated, which is accomplished in the model by two reactions (e.g., [...]

DEFINING THE MODEL BOUNDARIES
A 13C-MFA experiment allows us to explore the distribution of in vivo flux under a particular physiological condition. For all organic substrates present in the medium, such as sucrose or glutamine, uptake reactions must be defined. Furthermore, by quantitatively breaking down the cell components, we can identify the most abundant compounds that result from biosynthetic fluxes (Figure 1). This approach defines several biomass drain fluxes that are responsible for cell growth. Typically neglected are the growth demands for synthesizing a multitude of low-abundance free intermediary metabolites, as well as enzyme cofactors, pigments, and phytohormones. Including such minor compounds in the metabolic network would not significantly affect the flux distribution in central metabolism. Finally, measurements of growth kinetics can serve to scale the model fluxes relative to a specific growth rate.
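As a rough sketch of how biomass composition and growth rate combine into drain fluxes, the following example multiplies the content of each biomass component by a specific growth rate. It assumes balanced growth with constant composition; the component names, units, and numbers are hypothetical placeholders rather than measured data.

```python
def biomass_drain_fluxes(composition_mmol_per_g, growth_rate_per_h):
    """Drain flux of each biomass component in mmol per g dry weight per hour.

    Assumes balanced growth: flux = content (mmol/g DW) * growth rate (1/h).
    """
    return {
        name: content * growth_rate_per_h
        for name, content in composition_mmol_per_g.items()
    }


# Hypothetical biomass composition (illustrative values only).
composition = {
    "protein_amino_acids": 2.0,
    "lipid_acetyl_units": 10.0,
    "starch_glucose_units": 1.5,
}
for name, flux in biomass_drain_fluxes(composition, growth_rate_per_h=0.01).items():
    print(f"{name}: {flux:.3f} mmol/(g DW h)")
```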

ENCODING BIOCHEMICAL REACTIONS
In 13C-MFA, all reaction stoichiometries must be augmented by carbon transitions. Any particular biochemical reaction may be formulated as a set of carbon-atom transitions, defining how each carbon moves between the main substrates and products. For example, a textual notation following the style of Wiechert and de Graaf (1996) for the carboxylation of phosphoenolpyruvate (PEP) is

PEP (#ABC) + CO2 (#a) → OxA (#ABCa)

with A, B, and C respectively denoting carbons one, two, and three of PEP being converted into carbons one, two, and three of OxA. The carbon chain #ABC joins with #a (CO2), which becomes carbon four of OxA (Figure 2B). Co-substrates, such as ATP, phosphate, or H2O, are not considered; hence, both PEP carboxylase (EC 4.1.1.31) and PEP carboxykinase (EC 4.1.1.32) would be encoded by the above equation. In addition to carbon transitions, we must decide whether the reaction is unidirectional or bidirectional. Several plant models define the above reaction as unidirectional, assuming it to be PEP carboxylase, which reportedly is unidirectional. Reactions with a very large negative standard free-energy change can safely be assumed to be unidirectional in any organism under any condition. Yet, for reactions with a smaller standard free-energy change, a highly reliable definition of a reaction's directionality would require knowledge of organism- or tissue-specific in vivo concentrations of all enzyme substrates (Heinrich and Schuster, 1996).
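A carbon transition of this kind can be written down as a small data structure and applied to a labeled substrate. The sketch below is only an illustration of the letter notation described above, not the input format of 13CFLUX; the dictionary layout and function name are assumptions. Labeling is written as a string with "1" for 13C and "0" for 12C, one character per carbon.

```python
# Atom mapping for PEP carboxylation in the letter notation of the text:
# carbons A, B, C of PEP and carbon a of CO2 end up as carbons 1-4 of OxA.
PEP_CARBOXYLATION = {
    "substrates": {"PEP": "ABC", "CO2": "a"},
    "products": {"OxA": "ABCa"},
}


def apply_transition(reaction, substrate_labels):
    """Map substrate labeling strings ('0' = 12C, '1' = 13C) onto the products."""
    atom_label = {}
    for metabolite, atoms in reaction["substrates"].items():
        labels = substrate_labels[metabolite]
        if len(labels) != len(atoms):
            raise ValueError(f"{metabolite} needs {len(atoms)} carbon labels")
        # Remember the isotope carried by each named carbon atom.
        atom_label.update(zip(atoms, labels))
    return {
        metabolite: "".join(atom_label[atom] for atom in atoms)
        for metabolite, atoms in reaction["products"].items()
    }


# PEP labeled at carbon one plus labeled CO2 gives OxA labeled at carbons one and four.
print(apply_transition(PEP_CARBOXYLATION, {"PEP": "100", "CO2": "1"}))
# prints {'OxA': '1001'}
```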

ESSENTIAL COMPUTATIONAL ASPECTS OF 13C-MFA
Based on the definition of all reactions in the network outlined above, the modeling framework 13CFLUX automatically generates the equation systems needed to simulate the distribution of the 13C-label in the network. Labeling state variables can be encoded as the relative abundances of isotope isomers (isotopomers; Schmidt et al., 1997). Accordingly, in the above example, PEP would be represented by the fractional abundances of the eight isotopomer species #000, #100, #010, #001, #110, #011, #101, and #111 (with "1" denoting 13C, and "0" denoting 12C). Other formulations of the labeling state have also been used (e.g., Sriram et al., 2007), including cumomers and elementary metabolite units (EMUs; Antoniewicz et al., 2007). The network examples in Table 1 range between about 2000 and 4000 simulated cumomer species, while the EMU approach reduces the number of labeling state variables several-fold (Table 1); in future, this should support the simulation of much larger networks. The goal of the computational analysis is to determine the unknown fluxes that best explain the experimental data. While studies can differ considerably in their details, a general outline is given here. To determine the fluxes in the system, a search algorithm (an iterative least-squares fitting procedure) is used. Starting from an initial guess, a set of flux values is sought for which the fluxes and labeling signatures predicted by the model are closest to the experimentally determined flux and labeling measurements. Various studies repeated this optimization process between about 200 and 1000 times from different initial guesses (Allen et al., 2009b; Lonien and Schwender, 2009; Masakapalli et al., 2010) to assure an adequate exploration of the possible existence of alternative solutions. In addition, the search algorithm might converge repeatedly to flux values that represent only a local optimum. The more often the search is repeated, the more confident the modeler can be that the global optimum has been found.
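The repeated fitting from different starting points can be sketched with a deliberately small toy problem. Everything below is an assumption made for illustration: the two-node labeling model, its enrichment constants, the simple pattern-search optimizer, and the function names are hypothetical and stand in for the far larger equation systems and optimizers used in practice.

```python
import random


def simulate(fluxes):
    """Toy labeling model: two flux fractions mix label at two successive nodes."""
    f1, f2 = fluxes
    x1 = f1 * 0.50 + (1.0 - f1) * 0.10   # node 1: two sources, fixed enrichments
    x2 = f2 * x1 + (1.0 - f2) * 0.02     # node 2: node-1 label mixed with a third source
    return [x1, x2]


def ssr(fluxes, measured, sd):
    """Variance-weighted sum of squared residuals between data and simulation."""
    return sum(((m - s) / sd) ** 2 for m, s in zip(measured, simulate(fluxes)))


def local_search(start, measured, sd, step=0.25, tol=1e-6):
    """Coordinate pattern search within [0, 1]; halves the step when stuck."""
    best, best_val = list(start), ssr(start, measured, sd)
    while step > tol:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] = min(1.0, max(0.0, trial[i] + delta))
                val = ssr(trial, measured, sd)
                if val < best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step /= 2.0
    return best, best_val


def multi_start_fit(measured, sd=0.01, n_starts=50, seed=0):
    """Run the local search from random initial guesses; return runs sorted by fit."""
    rng = random.Random(seed)
    runs = [local_search([rng.random(), rng.random()], measured, sd)
            for _ in range(n_starts)]
    return sorted(runs, key=lambda run: run[1])


measured = simulate([0.3, 0.6])          # synthetic, noise-free "measurements"
best_fluxes, best_ssr = multi_start_fit(measured)[0]
print([round(f, 3) for f in best_fluxes], best_ssr)
```

Comparing the sorted runs shows whether independent starts agree on one optimum or settle at several distinct ones, which is the information the repeated optimizations are meant to provide.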
For the network definition process discussed below, it is important to note that the required computation time per optimization run increases with the network's size and complexity, as does the computational effort necessary to analyze a solution space of increasing complexity (alternative optima). For modeling four genotypes (model variants) of a larger network (Lonien and Schwender, 2009), access to cluster computing has proven indispensable.
Finally, an important part of computational analysis is to determine the statistical confidence in the obtained flux values (statistical analysis). Flux values that are not obtained consistently by multiple optimizations and have large statistical uncertainty are called "not resolved." Additionally, to assess how firmly the flux results are grounded in the experimental data, we can determine in a sensitivity analysis how the optimal flux values depend on the label measurements and other model parameters. By further considering the choice of substrate labeling, experiments can be optimized to resolve particular fluxes of interest (experimental design; Libourel et al., 2007).

NETWORK DEFINITION AND VALIDATION IN 13C-MFA
The core of a formulated network in 13C-MFA typically consists of the reactions of glycolysis, the pentose-phosphate pathway and Calvin cycle, the TCA cycle, and anaplerosis. This formulation can be based on a consensus from the biochemical literature on the plant's central metabolic pathways. For example, while the presence of mitochondrial and plastidic isoforms of pyruvate dehydrogenase in higher plants is well established (Tovar-Méndez et al., 2003), including a cytosolic isoform in a model would be unrealistic unless there were good evidence for it in the particular plant being studied. Beyond this consensus, experimental evidence from the literature or databases about a particular plant species and cell type is typically also considered.
A key to resolving fluxes of pathways organized in parallel in different compartments is to obtain compartment-specific labeling information. For example, Val, Leu, and Ile are formed from pyruvate in the plastid, i.e., their carbon chains store label information of plastidic pyruvate (Singh and Shaner, 1995). Furthermore, protein-bound Asp, Ala, and Glu, respectively, are assumed to represent cytosolic oxaloacetate, pyruvate, and α-ketoglutarate, provided that the following two assumptions are valid: (1) Asp, Ala, and Glu in the cytosol are isotopically equilibrated with their respective corresponding α-ketocarboxylic acids due to the high activity of the reversible aminotransferases, and, (2) Most of the analyzed protein is synthesized from cytosolic amino acids, i.e., in the analyzed biomass only very small fractions of proteins originated from plastidic or mitochondrial protein synthesis.
In 13C-MFA models, the intrinsic complexity of the metabolic network is often reduced extensively by lumping metabolic pools (van Winden et al., 2001b), as demonstrated for the highly connected sub-network of the TCA cycle in B. napus (Figure 2). Pools of OxA and malate (Mal), localized in the cytosol and mitochondria (Figure 2A), are lumped into one pool (OxA/Mal, Figure 2B). This combination was justified mainly by observations of labeling signatures in Asp derived from storage protein (Schwender et al., 2006). Symmetries in the labeling pattern suggested that OxA, after being formed by the carboxylation of PEP in the cytosol, undergoes a randomization attributed to the symmetry of succinate (Succ) and fumarate (Fum; Figure 2B). The equilibration of the carbon-labeling signatures of the C-4 carboxylic acids OxA, Mal, Succ, and Fum therefore presumably reflects the large fluxes that inter-convert those pools within the cytosol and mitochondria, and across the mitochondrial membrane (Figure 2A). Accordingly, for the 13C-MFA model, the complexity of the C-4 carboxylic acid inter-conversions was reduced by defining two lumped pools, i.e., OxA/Mal and Succ/Fum, and condensing the various reversible inter-conversions (Figure 2A) into one reversible reaction (vFM, Figure 2B). The consequence of this network reduction is that the net and exchange flux of vFM can be determined with good precision, although the parallel reactions in the cytosol and mitochondria cannot be resolved.
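The effect of the symmetry-based randomization on a labeling pattern can be sketched as follows. The example assumes that both orientations of the symmetric four-carbon molecule are used with equal probability, so that positional enrichments are averaged between carbons one and four and between carbons two and three; the function name and the input values are hypothetical.

```python
def scramble_symmetric(enrichment):
    """Average positional 13C enrichments over both orientations of a symmetric molecule.

    enrichment: fractional 13C enrichment of carbons 1..n, in chain order.
    Assumes the forward and reversed orientations contribute equally.
    """
    flipped = enrichment[::-1]
    return [(a + b) / 2.0 for a, b in zip(enrichment, flipped)]


# A C4 acid labeled only at carbon four (e.g., from labeled CO2) becomes
# half-labeled at carbons one and four after full randomization.
print(scramble_symmetric([0.0, 0.0, 0.0, 1.0]))
# prints [0.5, 0.0, 0.0, 0.5]
```

A labeling pattern in Asp that is symmetric in this sense is the kind of observation used above to argue for extensive exchange between the C-4 acid pools.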
Typically, the modeling process also considers whether adding or removing particular reactions in an existing model might generate a model that better fits the labeling measurements (Schwender et al., 2006; Williams et al., 2008; Lonien and Schwender, 2009; Masakapalli et al., 2010). For example, the isocitrate dehydrogenase reaction is often considered unidirectional from citrate to α-ketoglutarate. Yet, in Brassica napus (rapeseed) and soybean embryos, the labeling pattern in citrate is explained only if the model also allows for conversion of α-ketoglutarate back to citrate (Schwender et al., 2006; Allen et al., 2009b). This finding showed that, in contrast to the common assumption in the literature, the isocitrate dehydrogenase reaction (Figure 2B) must be reversible in vivo. Other observations on labeling signatures in B. napus justified the assumption that the conversions of PEP to OxA and of PEP to Pyr are irreversible in vivo (Schwender et al., 2006).
In conclusion, the topology of published 13C-MFA networks often reflects several assumptions and circumstantial experimental evidence used to justify lumped networks. Often the underlying unreduced (large-scale) network and the reduction process are not documented fully and transparently. Lumped metabolic models might depend in part on intuition, and result only partly from a transparent process of reducing network complexity.

Frontiers in Plant Science | Plant Physiology
Yet, in 13C-MFA, the resulting flux values and their interpretation critically depend upon the network's topology (van Winden et al., 2001a). In addition, once flux results are obtained, projecting the lumped metabolic models onto large-scale models involves substantial ambiguity. This means that mapping fluxes to pathways from pathway databases is problematic.
Generally, more organized, transparent, and reproducible workflows might improve model reconstruction; this is a major topic in other fields of computational biology (Dalman et al., 2010; Goecks et al., 2010; Mesirov, 2010). With this in mind, we can employ some recently published genome-scale plant metabolic models used for FBA (Table 1; Poolman et al., 2009; de Oliveira Dal'Molin et al., 2010a,b; Williams et al., 2010; Saha et al., 2011) as a reference for a less biased and more clearly defined network reconstruction in 13C-MFA. Yet, although genome-scale networks claim to be unbiased representations of the whole genome (Covert et al., 2001), they suffer from incompleteness and from the limited accuracy of gene annotation; for eukaryotes such as plants, they also reflect the very limited availability and reliability of predictions of the subcellular localization of gene products (Poolman et al., 2006; Sweetlove et al., 2008). A particular problem in deriving compartmentalized networks is that many of the functionally required intracellular transporters remain unidentified and uncharacterized. In addition, many of the known transporters have ambiguous affinities for different substrates of similar structure (Linka and Weber, 2010). Furthermore, if a whole plant genome is the template for network reconstruction, the result must be a generalized network rather than one specific to a certain cell type. Finally, despite the recent comprehensive atom mapping of an E. coli genome-scale model (Ravikirthi et al., 2011), the carbon transitions in such large plant networks cannot yet be derived straightforwardly from databases.
Consequently, deriving reliable networks from plant-genome databases would require an enormous amount of manual curation. Alternatively, well-documented "bottom-up" reconstructions of large-scale plant models based on published biochemical and tissue-specific evidence may be more useful (Table 1; Grafahrend-Belau et al., 2009; Hay and Schwender, 2011a,b). These models might be developed into large-scale 13C-MFA models. While current 13C-MFA models encompass between ∼50 and 100 reactions (Table 1), Suthers et al. (2007, 2010) modeled a large-scale E. coli network with 238 reactions. Recent advances in the mathematical formulation of isotope models, such as the EMU framework, support the simulation of such networks with substantially less computation time than presently required (Antoniewicz et al., 2007). If large-scale plant 13C-MFA models are to be simulated, certain aspects must be dealt with, as detailed for the E. coli large-scale 13C-MFA model (Suthers et al., 2007). No single optimal flux solution is obtained, and a complex analysis of the solution space is necessary; for many fluxes, a range of optimal values is obtained rather than a discrete one. This problem can be attributed largely to parallel pathways that produce redundant labeling patterns and cannot be resolved. Some redundant solutions involving parallel pathways can contain substrate cycles that are expected to be of little biological relevance; thus, Suthers et al. (2007) suggested a multi-step reduction in network size. They verified that each time metabolic pools are merged or a parallel pathway is removed, the model fit is not worsened, i.e., that simplifying the model does not introduce bias. This kind of approach could replace the more intuitive "classical" model definition of lumped 13C-MFA networks.
A further improvement of the definition of large-scale metabolic networks could lie in using quantitative analysis of the transcriptome by deep-sequencing technologies (RNA-seq; Wang et al., 2009). This technology requires having a genome sequence but should assure a more precise definition about which gene products are present in a particular cell type under specific conditions. The definition of central core metabolism would be improved, in particular since the subcellular localization of core metabolism enzymes can differ between cell types or species. For example, phosphoglyceromutase is only present in plastids of certain cell types (Stitt and ap Rees, 1979). The subcellular localization of ADP-Glucose Pyrophosphorylase differs between gramineous and non-gramineous species (Beckles et al., 2001).

CONCLUSION
In plant-specific 13C-MFA studies published to date, lumped network topologies are required. These networks represent a substantial simplification relative to the real complexity inherent in plant central metabolism. Often the validity of network simplifications has to be justified by vague assumptions or circumstantial experimental evidence. Constructing large-scale metabolic models can provide fully detailed networks, useful as a clearly defined reference point for deriving lumped 13C models. Moreover, without lumping, 13C-MFA with plant models of about 500 reactions in size should become computationally feasible, as indicated by recent microbial studies using large-scale 13C-MFA (Suthers et al., 2007, 2010). The large-scale reference models also offer the potential to develop approaches that combine FBA with 13C-MFA (Blank et al., 2005). Some explorations of the synergies between the two approaches have been reported (Williams et al., 2010; Chen et al., 2011; Hay and Schwender, 2011b). With FBA, different physiological conditions can be simulated in silico to analyze situations in which steady-state 13C-tracer experiments are impossible.
Another important goal in plant 13C-MFA is to improve the precision of the flux estimates. This can be achieved by simulating several experiments with differently 13C-labeled tracers in one flux model (Schwender et al., 2006; Alonso et al., 2007b, 2011; Junker et al., 2007; Masakapalli et al., 2010).
Furthermore, analysis of how the distribution of cellular flux changes in response to targeted perturbations can help to unravel the kinetic and regulatory controls in metabolism (Lonien and Schwender, 2009). Such approaches should be particularly promising if metabolomic, transcriptomic, and proteomic data are recorded in parallel for experimental systems that have been well established for 13C-MFA.

ACKNOWLEDGMENTS
Current funding from the U.S. Department of Energy (Division of Chemical Sciences, Geosciences, and Biosciences, Office of Basic Energy Sciences, Field Work Proposal BO-133) as well as from Bayer Bioscience is much appreciated. I would like to thank Avril Woodhead (Brookhaven National Laboratory) for English language edits.