Computation of Single-Cell Metabolite Distributions Using Mixture Models

Tonn, Mona K.; Thomas, Philipp; Barahona, Mauricio; Oyarzún, Diego A.

doi:10.3389/fcell.2020.614832

ORIGINAL RESEARCH article

Front. Cell Dev. Biol., 22 December 2020

Sec. Genome Architecture and Epigenetic Memory

Volume 8 - 2020 | https://doi.org/10.3389/fcell.2020.614832


1. Department of Mathematics, Imperial College London, London, United Kingdom
2. School of Biological Sciences, University of Edinburgh, Edinburgh, United Kingdom
3. School of Informatics, University of Edinburgh, Edinburgh, United Kingdom

Abstract

Metabolic heterogeneity is widely recognized as the next challenge in our understanding of non-genetic variation. A growing body of evidence suggests that metabolic heterogeneity may result from the inherent stochasticity of intracellular events. However, metabolism has been traditionally viewed as a purely deterministic process, on the basis that highly abundant metabolites tend to filter out stochastic phenomena. Here we bridge this gap with a general method for prediction of metabolite distributions across single cells. By exploiting the separation of time scales between enzyme expression and enzyme kinetics, our method produces estimates for metabolite distributions without the lengthy stochastic simulations that would be typically required for large metabolic models. The metabolite distributions take the form of Gaussian mixture models that are directly computable from single-cell expression data and standard deterministic models for metabolic pathways. The proposed mixture models provide a systematic method to predict the impact of biochemical parameters on metabolite distributions. Our method lays the groundwork for identifying the molecular processes that shape metabolic heterogeneity and its functional implications in disease.

1. Introduction

Non-genetic heterogeneity is a hallmark of cell physiology. Isogenic cells can display markedly different phenotypes as a result of the stochasticity of intracellular processes and fluctuations in environmental conditions. Gene expression variability, in particular, has received substantial attention thanks to robust experimental techniques for measuring transcripts and proteins at a single-cell resolution (Golding et al., 2005; Taniguchi et al., 2010). This progress has gone hand-in-hand with a large body of theoretical work on stochastic models to identify the molecular processes that affect expression heterogeneity (Swain et al., 2002; Raj and van Oudenaarden, 2008; Thomas et al., 2014; Dattani and Barahona, 2017; Tonn et al., 2019).

In contrast to gene expression, our understanding of stochastic phenomena in metabolism is still in its infancy. Traditionally, cellular metabolism has been regarded as a deterministic process on the basis that metabolites appear in large numbers that filter out stochastic phenomena (Heinemann and Zenobi, 2011). But this view is changing rapidly thanks to a growing number of single-cell measurements of metabolites and co-factors (Bennett et al., 2009; Imamura et al., 2009; Lemke and Schultz, 2011; Paige et al., 2012; Ibáñez et al., 2013; Yaginuma et al., 2014; Esaki and Masujima, 2015; Xiao et al., 2016; Mannan et al., 2017) that suggest that cell-to-cell metabolite variation is much more pervasive than previously thought. The functional implications of this heterogeneity are largely unknown but likely to be substantial given the roles of metabolism in many cellular processes, including growth (Weisse et al., 2015), gene regulation (Lempp et al., 2019), epigenetic control (Loftus and Finlay, 2016), and immunity (Reid et al., 2017). For example, metabolic heterogeneity has been linked to bacterial persistence (Radzikowski et al., 2017; Shan et al., 2017), a dormant phenotype characterized by a low metabolic activity, as well as antibiotic resistance (Deris et al., 2013) and other functional effects (Vilhena et al., 2018). In biotechnology applications, metabolic heterogeneity is widely recognized as a limiting factor on metabolite production with genetically engineered microbes (Binder et al., 2017; Schmitz et al., 2017; Liu et al., 2018).

A key challenge for quantifying metabolic variability is the difficulty in measuring cellular metabolites at a single-cell resolution (Amantonico et al., 2010; Takhaveev and Heinemann, 2018; Wehrens et al., 2018). As a result, most studies use other phenotypes as a proxy for metabolic variation, e.g., enzyme expression levels (Kotte et al., 2014; van Heerden et al., 2014), metabolic fluxes (Schreiber et al., 2016), or growth rate (Kiviet et al., 2014; Şimşek and Kim, 2018). From a computational viewpoint, the key challenge is that metabolic processes operate on two timescales: a slow timescale for expression of metabolic enzymes, and a fast timescale for enzyme catalysis. Such multiscale structure results in stiff models that are infeasible to solve with standard algorithms for stochastic simulation (Gillespie, 2007). Other strategies to accelerate stochastic simulations, such as τ-leaping (Rathinam et al., 2003), also fail to produce accurate simulation results due to the disparity in molecule numbers between enzymes and metabolites (Tonn, 2020). These challenges have motivated a number of methods to optimize stochastic simulations of metabolism (Puchałka and Kierzek, 2004; Cao et al., 2005; Labhsetwar et al., 2013; Lugagne et al., 2013; Murabito et al., 2014). Most of these methods exploit the timescale separation to accelerate simulations at the expense of some approximation error. This progress has been accompanied by a number of theoretical results on the links between molecular processes and the shape of metabolite distributions (Levine and Hwa, 2007; Oyarzún et al., 2015; Gupta et al., 2017b; Tonn et al., 2019). Yet to date there are no general methods for computing metabolite distributions that can handle inherent features of metabolic pathways such as feedback regulation, complex stoichiometries, and the high number of molecular species involved.

In this paper we present a widely applicable method for approximating single-cell metabolite distributions. Our method is founded on the timescale separation between enzyme expression and enzyme catalysis, which we employ to approximate the stationary solution of the chemical master equation. The approximate solution takes the form of mixture distributions with: (i) mixture weights that can be computed from models for gene expression or single-cell expression data, and (ii) mixture components that are directly computable from deterministic pathway models. The resulting mixture model can be employed to explore the impact of biochemical parameters on metabolite variability. We illustrate the power of the method in two exemplar systems that are core building blocks of large metabolic networks. Our theory provides a quantitative basis to draw testable hypotheses on the sources of metabolite heterogeneity, which together with the ongoing efforts in single-cell metabolite measurements, will help to re-evaluate the role of metabolism as an active source of phenotypic variation.

2. General Method for Computing Metabolite Distributions

We consider metabolic pathways composed of enzymatic reactions interconnected by sharing of metabolites as substrates or products. In general, we consider models with M metabolites P_i with i ∈ {1, 2, …, M} and N catalytic enzymes E_j with j ∈ {1, 2, …, N}. A typical enzymatic reaction has the form

$$P_i + E_j \; \underset{k_{b,j}}{\overset{k_{f,j}}{\rightleftharpoons}} \; C_j \; \underset{k_{rev,j}}{\overset{k_{cat,j}}{\rightleftharpoons}} \; P_k + E_j \tag{1}$$

where P_i and P_k are metabolites, and E_j and C_j are the free and substrate-bound forms of the enzyme. The parameters (k_{f,j}, k_{b,j}) and (k_{cat,j}, k_{rev,j}) are positive rate constants specific to the enzyme. In contrast to traditional metabolic models, where the number of enzyme molecules is assumed constant, here we explicitly model enzyme expression and enzyme catalysis as stochastic processes. Our models also account for dilution of molecular species by cell growth and consumption of the metabolite products by downstream processes.

Although in principle one can readily write a Chemical Master Equation (CME) for the marginal distribution P(P₁, P₂, …, P_M) given the pathway stoichiometry, analytical solutions of the CME are tractable only in a few special cases. To overcome this challenge, we propose a method for approximating metabolite distributions that can be applied in a wide range of metabolic models. We first note that, using the Law of Total Probability, the marginal distribution P(P₁, P₂, …, P_M) can be generally written as:

$$P(P_1, P_2, \ldots, P_M) = \sum_{E} P(P \mid E)\, P(E) \tag{2}$$

where P = (P₁, P₂, …, P_M) and E = (E₁, E₂, …, E_N) are the vectors of metabolite and enzyme abundances, respectively. Equation (2) describes the metabolite distribution in terms of fluctuations in gene expression, captured by the distribution P(E), and fluctuations in reaction catalysis, described by the conditional distribution P(P|E).

A key observation is that Equation (2) corresponds to a mixture model with weights P(E) and mixture components P(P|E). To compute the mixture weights and components, we make use of the timescale separation between gene expression and metabolism. Gene expression operates on a much slower timescale than catalysis (Cao et al., 2005; Levine and Hwa, 2007; Kuntz et al., 2013), with protein half-lives typically comparable to cell doubling times and catalysis operating in the millisecond to second range. Therefore, in the fast timescale of catalysis we can write a conservation law for the total amount of each enzyme (free and bound):

$$E_{t,j} = E_j + C_j \tag{3}$$

where E_{t,j} is the total number of enzymes E_j. Note that since our models integrate enzyme kinetics with enzyme expression, the variables E_{t,j} follow their own, independent stochastic dynamics. It is important to note that in our approach, the conservation relation in (3) holds only in the fast timescale of catalysis. This contrasts with classic deterministic models for metabolic reactions, which typically focus on the fast catalytic timescale and assume enzymes as constant model parameters (Cornish-Bowden, 2004).

As a result of the separation of timescales, the weights and components of the mixture in (2) can be computed separately. Specifically, the mixture weights P(E) can be obtained as solutions of a stochastic model for enzyme expression (Raj and van Oudenaarden, 2008), or taken from absolute single-cell measurements of enzyme expression. Such absolute measurements can be obtained from single-molecule technologies (Okumus et al., 2016), careful calibration of fluorescence data (Rosenfeld et al., 2006; Bakker and Swain, 2019), or normalization (Taniguchi et al., 2010). The mixture components P(P|E), on the other hand, can be estimated with suitable approximation techniques. For simplicity, here we choose to employ the Linear Noise Approximation (LNA), which provides a Gaussian estimate of the stationary distribution of a stochastic chemical system (van Kampen, 1992; Elf and Ehrenberg, 2003). The use of the LNA is justified on the basis that metabolites tend to appear in large numbers per cell, a key condition for the LNA to produce accurate results. However, more accurate methods to compute P(P|E) can be used if required (Andreychenko et al., 2017; Gupta et al., 2017a). Figure 1 shows a schematic of the proposed method.

Figure 1

We thus propose the following procedure for computing single-cell metabolite distributions:

1. Starting from the mixture model in Equation (2), compute the enzyme distribution P(E) from a stochastic model for gene expression, either analytically (if possible) or numerically with Gillespie's algorithm.
2. To approximate the mixture components P(P|E) with the LNA, compute the steady state solution of the deterministic rate equation for each enzyme state E:

$$\frac{dP}{dt} = S\, v(P, E) = 0 \tag{4}$$

where S is the stoichiometric matrix and v(·) is the vector of deterministic reaction rates; for ease of notation we have assumed a unit cell volume, and hence the deterministic rates are equal to the propensities of the stochastic model. Note that due to the timescale separation, Equation (4) must be solved assuming constant enzymes E, and its solution depends on the enzyme abundance, i.e., P̄ = P̄(E).
3. For each enzyme state E, compute the solution to the Lyapunov equation (Elf and Ehrenberg, 2003):

$$A\Sigma + \Sigma A^T + BB^T = 0 \tag{5}$$

where A is the Jacobian of (4) evaluated at the steady state and BB^T = S diag{v} S^T. Note that, as in (4), the solution of the Lyapunov equation depends on the enzyme state, i.e., Σ = Σ(E).
4. Following the LNA, approximate the mixture components P(P|E) as a multivariate Gaussian distribution with mean P̄(E), the steady state from step 2, and covariance matrix Σ(E).
5. Combine the weights P(E) and Gaussian components P(P|E) through the mixture model in (2).
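
Steps 2 to 4 of the procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results reported here: the function name `lna_component`, the finite-difference Jacobian, and the toy first-order pathway with rate constants `k1`, `k2`, and `kc` are all hypothetical choices made for the example. For a fixed enzyme state, the sketch finds the steady state, builds A and BB^T, and solves the Lyapunov equation with SciPy.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov
from scipy.optimize import fsolve


def lna_component(S, rates, E, p0, rel_step=1e-6):
    """Mean and covariance of the Gaussian (LNA) estimate of P(P|E).

    S      stoichiometric matrix, shape (metabolites, reactions)
    rates  function (P, E) -> reaction rate vector v(P, E)
    E      fixed enzyme abundances (constant on the fast timescale)
    p0     initial guess for the steady state
    """
    rhs = lambda P: S @ rates(P, E)           # deterministic rate equation
    p_bar = fsolve(rhs, p0)                   # steady state for this E

    # Jacobian A of the rate equation at the steady state, approximated
    # by central finite differences (an assumption of this sketch).
    n = p_bar.size
    A = np.zeros((n, n))
    for k in range(n):
        h = rel_step * max(1.0, abs(p_bar[k]))
        dp = np.zeros(n)
        dp[k] = h
        A[:, k] = (rhs(p_bar + dp) - rhs(p_bar - dp)) / (2.0 * h)

    BBt = S @ np.diag(rates(p_bar, E)) @ S.T  # BB^T = S diag{v} S^T
    # SciPy solves A X + X A^H = Q, so Q = -BB^T gives A X + X A^T + BB^T = 0.
    Sigma = solve_continuous_lyapunov(A, -BBt)
    return p_bar, Sigma


# Hypothetical toy pathway: 0 -> P1 -> P2 -> 0 with first-order rates.
k1, k2, kc = 2.0, 0.1, 0.2
S = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])
rates = lambda P, E: np.array([k1 * E[0], k2 * E[1] * P[0], kc * P[1]])

p_bar, Sigma = lna_component(S, rates, E=(10.0, 5.0), p0=np.array([1.0, 1.0]))
```

For this linear toy network the covariance equals the diagonal matrix of steady state values, which offers a simple sanity check before moving to nonlinear kinetics.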

In the next sections we illustrate the effectiveness of our method in two exemplar systems.

3. Reversible Michaelis-Menten Reaction

We first consider a stochastic model that integrates a reversible Michaelis-Menten reaction with a standard model for enzyme expression. As shown in Figure 2A, the Michaelis-Menten mechanism includes reversible binding of four species: a metabolic substrate S, a free enzyme E, a substrate-enzyme complex C and a metabolic product P. To model enzyme expression, we use the well-known two-stage scheme for transcription and translation (Thattai and van Oudenaarden, 2001; Shahrezaei and Swain, 2008) (Figure 2A). The complete set of reactions is:

The reactions in (6) correspond to a reversible Michaelis-Menten reaction as in (1), while the reactions in (7) are the two-stage model for gene expression. We include four additional first-order reactions (8) and (9) to model consumption of the metabolite product with rate constant kc, mRNA degradation with rate constant kdeg, and dilution of all model species with rate constant δ. In what follows we assume that the substrate S remains strictly constant, for example to model cases in which the substrate represents an extracellular carbon source that evolves on a much slower timescale than the cell doubling time.

Figure 2

Since the total number of enzymes can be assumed to be in quasi-stationary state on the fast timescale of the catalytic reaction (Cornish-Bowden, 2004; Tonn et al., 2019), we have that

$$E_{\text{total}} = E + C \tag{10}$$

and therefore the general mixture model in (2) can be written as:

$$P(P) = \sum_{E_{\text{total}}} P(P \mid E_{\text{total}})\, P(E_{\text{total}}) \tag{11}$$

The mixture weights P(Etotal) can be computed from the stochastic model for gene expression in (7). Under the standard assumption that mRNAs are degraded much faster than proteins (Raj and van Oudenaarden, 2008), the stationary solution of the two-stage model can be approximated by a negative binomial distribution (Shahrezaei and Swain, 2008):

$$P(E_{\text{total}} = n) = \frac{\Gamma(n+a)}{\Gamma(n+1)\,\Gamma(a)} \left(\frac{b}{1+b}\right)^{n} \left(1 - \frac{b}{1+b}\right)^{a} \tag{12}$$

where Γ is the Gamma function and the parameters are defined as the burst frequency a = ktx/δ and burst size b = ktl/kdeg.
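
As a brief illustration, the negative binomial weights can be evaluated numerically with SciPy. The sketch below assumes the standard mapping to `scipy.stats.nbinom`, with number-of-successes parameter equal to the burst frequency a and success probability 1/(1+b); the function name `enzyme_weights` and the truncation point `n_max` are hypothetical choices for the example.

```python
import numpy as np
from scipy.stats import nbinom


def enzyme_weights(a, b, n_max):
    """Negative binomial weights P(E_total = n) for n = 0, ..., n_max.

    a  burst frequency (k_tx / delta)
    b  burst size (k_tl / k_deg)
    """
    n = np.arange(n_max + 1)
    # nbinom(a, p) with p = 1/(1+b) has pmf proportional to (b/(1+b))^n.
    return n, nbinom.pmf(n, a, 1.0 / (1.0 + b))


# Illustrative values: a = 50 and b = 1 give a mean enzyme number a*b = 50.
n, w = enzyme_weights(a=50, b=1, n_max=400)
mean_enzyme = np.sum(n * w)
```

Truncating the sum at `n_max` is harmless as long as the neglected tail mass is small, which can be checked by confirming that the weights sum to one within numerical tolerance.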

To compute the mixture components P(P|Etotal) with the LNA, we write the full system of deterministic rate equations [see (35) in section 6] for the three species E, C, and P. Note that in this case, we can further reduce the rate equations by (i) using the conservation law in (10), and (ii) assuming that the binding and unbinding reactions between S and E reach equilibrium faster than the product P, a condition that generally holds in metabolic reactions. After algebraic manipulations, the reduced ODE can be written as:

where

and the parameters are K_mS = (kb+kcat)/kf and K_mP = (kb+kcat)/krev.

The mean of each mixture component is simply given by the steady state solution of (13), which we denote as P̄(Etotal). For a given enzyme abundance Etotal, the variance Σ(Etotal) of each Gaussian component is given by the solution to the Lyapunov equation in (5):

where f′ and g′ are first-order derivatives. Combining the negative binomial in (12) with the Gaussian components, we can rewrite Equation (11) to get a Gaussian mixture model for the metabolite:

where both the steady state P̄(x) and the variance Σ(x) must be computed for each value of x = Etotal in the summation. The normalization constant in (16) is

In Figure 3, we plot the mixture model (16) for realistic parameter values and compare this approximation with distributions computed from long runs of Gillespie simulations of the whole set of reactions (6)–(9). The results indicate that the mixture model provides an excellent approximation of the metabolite distribution. In the next section we test our methodology in a more complex pathway with feedback regulation.
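
The assembly of a one-dimensional Gaussian mixture from negative binomial weights can be sketched as follows. To keep the example self-contained, it does not use the reversible Michaelis-Menten model of this section; instead it assumes a hypothetical irreversible toy model in which the product is made at rate `k_prod * x` for enzyme number x and consumed at rate `k_c * P`. For that toy model the steady state is `k_prod * x / k_c`, and the Lyapunov equation with A = -k_c and BB^T = 2 k_prod x gives a variance equal to the mean. All names and parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import nbinom, norm


def metabolite_mixture(p_grid, a, b, k_prod, k_c, n_max=500):
    """Gaussian mixture density of the product on the grid p_grid."""
    # Enzyme states; x = 0 (no enzyme, no product) carries negligible
    # weight for the values used here and is skipped.
    x = np.arange(1, n_max + 1)
    w = nbinom.pmf(x, a, 1.0 / (1.0 + b))
    w = w / w.sum()                       # renormalise truncated weights

    mean = k_prod * x / k_c               # steady state for each x
    var = mean                            # LNA variance of the toy model

    # Evaluate every Gaussian component on the grid: shape (grid, states).
    comps = norm.pdf(p_grid[:, None], loc=mean[None, :],
                     scale=np.sqrt(var)[None, :])
    return comps @ w                      # weighted sum over enzyme states


p_grid = np.linspace(0.0, 400.0, 2001)
density = metabolite_mixture(p_grid, a=50, b=1, k_prod=0.04, k_c=0.02)

dp = p_grid[1] - p_grid[0]
area = density.sum() * dp                 # should be close to 1
mean_p = (p_grid * density).sum() * dp    # close to 2 * a * b = 100
```

Replacing the `mean` and `var` lines with the steady state and Lyapunov solution of a specific pathway model would give the corresponding mixture for that pathway.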

Figure 3

Table 1. Parameter values for simulations in Figure 3.

Parameter	Value	Parameter	Value
δ	0.00025 s⁻¹	kb	1,000 s⁻¹
a	{25, 50, 120}	kcat	3.6 s⁻¹
b	1	krev	0.01 s⁻¹
S	3,000 molecules	kc	0.02 s⁻¹
kf	1 × S s⁻¹

4. Pathway With End-Product Inhibition

A common regulatory motif in metabolism is end-product inhibition, in which a pathway enzyme can bind to its own substrate as well as the pathway product (see Figure 2B). The product thus sequesters enzyme molecules, which reduces the number of free enzymes available for catalysis and slows down the reaction rate. To examine the accuracy of our method in this setting, we study a fully stochastic model for a two-step pathway with noncompetitive end-product inhibition:

The two reactions in (18) and (19) are reversible Michaelis-Menten kinetics, sharing the intermediate metabolite P1 as a product and substrate, respectively. The end-product inhibition in (20) consists of reversible binding between h molecules of P2 and the first enzyme E1 into a catalytically-inactive complex E*. The remaining model reactions in (21)–(25) are analogous to the previous example in section 3: reactions in (21) and (22) describe the two-stage model for expression of both enzymes, and with reactions (23)–(25) we model first-order mRNA degradation, product consumption, and dilution by cell growth. For simplicity we also assume that both enzymes are independently expressed, but in general our method can also account for cases in which enzymes are co-expressed or co-regulated (Chubukov et al., 2014). The resulting model has two distinct pools of enzymes, which remain constant over the timescale of catalysis:

and therefore the mixture model in (2) becomes

where the summation goes through all (E_{t,1}, E_{t,2}) pairs. Since both enzymes are expressed independently, the enzyme distribution is the product of two negative binomials, P(E_{t,1}, E_{t,2}) = P(E_{t,1}) × P(E_{t,2}), each one analogous to the distribution in (12).
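
Because the two enzymes are expressed independently, the joint weights can be tabulated as an outer product of two negative binomial vectors. The following sketch assumes the same SciPy parameterisation of the negative binomial as for a single enzyme (success probability 1/(1+b)); the function name `joint_enzyme_weights` and the truncation `n_max` are hypothetical.

```python
import numpy as np
from scipy.stats import nbinom


def joint_enzyme_weights(a1, b1, a2, b2, n_max):
    """Matrix W with W[x, y] = P(E_t1 = x) * P(E_t2 = y)."""
    n = np.arange(n_max + 1)
    w1 = nbinom.pmf(n, a1, 1.0 / (1.0 + b1))   # weights for enzyme 1
    w2 = nbinom.pmf(n, a2, 1.0 / (1.0 + b2))   # weights for enzyme 2
    return np.outer(w1, w2)


# Illustrative burst parameters: a1 = a2 = 35, b1 = b2 = 1.
W = joint_enzyme_weights(35, 1, 35, 1, n_max=300)
marginal_1 = W.sum(axis=1)                     # recovers P(E_t1)
```

For co-expressed or co-regulated enzymes the outer product would no longer apply, and the matrix W would have to come from a joint expression model or from data.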

To compute the mixture components with the LNA, we use the rate equations for the reactions in (18)–(23); the full set of ODEs is listed in Equation (36) in the Methods. As in the first example, by employing the conservation laws in (26) and assuming rapid equilibrium of the complexes C1 and C2, the deterministic model can be further simplified to a 2-dimensional ODE:

where for ease of notation we have omitted the dependency on E_{t,1} and E_{t,2}. The nonlinear functions in (28) are

where θ = ksq/krsq is the product-enzyme binding constant and the remaining parameters are defined as κ_S = k_{cat,1} k_{f,1}/(k_{b,1} + k_{cat,1}), κ_1 = k_{b,1} k_{rev,1}/(k_{b,1} + k_{cat,1}), κ_2 = k_{cat,2} k_{f,2}/(k_{b,2} + k_{cat,2}), κ_3 = k_{b,2} k_{rev,2}/(k_{b,2} + k_{cat,2}), K_{m,S} = k_{cat,1}/κ_S, K_{m,1} = k_{b,1}/κ_1, K_{m,2} = k_{cat,2}/κ_2, and K_{m,3} = k_{b,2}/κ_3.

As in the previous example, the ODEs in (28) correspond to the full model (36) rewritten in terms of both metabolites, assuming that the enzyme-substrate reactions reach equilibrium on a faster timescale than catalysis. This reduced model can be readily employed to obtain approximations for the mixture components with the LNA. If we denote by P̄ = (P̄₁, P̄₂) the steady state solution of (28), we can write the Lyapunov equation as AΣ + ΣA^T + BB^T = 0 with A and BB^T given by

where f(·), g(·), and their derivatives are evaluated at the steady state solution P̄. The Gaussian components of the mixture model are then

where and |·| is the matrix determinant. After combining the joint distribution of enzymes and the components into Equation (27), we get a Gaussian mixture model for the joint marginal distribution of both metabolites:

where both the steady state P̄(x, y) and the covariance matrix Σ(x, y) need to be computed numerically for each pair (x, y) = (E_{t,1}, E_{t,2}) in the summation. The burst frequencies a_i = k_{tx,i}/δ and burst sizes b_i = k_{tl,i}/k_{deg,i} are specific to each enzyme, and the normalization constant is given by

To test the quality of the approximation, we numerically computed the mixture model in (33) for various combinations of parameter values, shown in Figure 4. We observe that the mixture model offers an excellent approximation as compared to exact Gillespie simulations of the full model (18)–(25). We note that in this case the full stochastic model has seven species and three different timescales, and therefore the runtimes of Gillespie simulations are extremely long, on the order of several hours per run.

Figure 4

Table 2. Parameter values for simulations in Figure 4.

Figure 4				Figure 4A		Figure 4B	
Parameter	Value	Parameter	Value	Parameter	Value	Parameter	Value
δ	0.00025 s⁻¹	krev,1	0.0001 s⁻¹	a₁	{35, 126, 210}	a₁	80
kdeg,1	0.2 s⁻¹	kc,1	0.00025 s⁻¹	a₂	{35, 97, 97}	a₂	80
kdeg,2	0.2 s⁻¹	kf,2	1.5 s⁻¹	b₁	1	b₁	1
S	3,000 molecules	kb,2	15,000 s⁻¹	b₂	1	b₂	1
kf,1	20 × S s⁻¹	kcat,2	150 s⁻¹	ksq	10⁻¹⁰ s⁻¹	ksq	{0, 10⁻¹⁰, 10⁻¹²} s⁻¹
kb,1	15,000 s⁻¹	krev,2	0.001 s⁻¹	krsq	1 s⁻¹	krsq	1 s⁻¹
kcat,1	22.5 s⁻¹	kc,2	0.15 s⁻¹	h	3	h	3

To further illustrate the utility of our method, we employed the mixture model to study the impact of parameter perturbations on the metabolite distributions. Without an analytical solution, such a study would require the computation of long Gillespie simulations for each combination of parameter values, which quickly become infeasible due to the long simulation time. In contrast, the mixture model provides a systematic way to rapidly evaluate the influence of model parameters on metabolite distributions. In Figure 5A we show summary statistics of the marginal P(P₁) for various combinations of average enzyme expression levels. The results suggest that expression levels can have a strong impact on the mean and coefficient of variation of the intermediate metabolite. Moreover, in Figure 5B we plot the distribution P(P₁, P₂) for combinations of bursting parameters. The results show that uncorrelated enzyme fluctuations can result in correlated metabolite distributions due to the coupling introduced by the pathway (Levine and Hwa, 2007).
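
Summary statistics such as the mean and coefficient of variation follow directly from the weights, means, and variances of a one-dimensional Gaussian mixture, without sampling. The sketch below applies the law of total variance; the function name `mixture_mean_cv` and the two-component numbers are hypothetical and serve only to show the calculation.

```python
import numpy as np


def mixture_mean_cv(weights, means, variances):
    """Mean and coefficient of variation of a 1-D Gaussian mixture."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                              # normalise the weights
    m = np.asarray(means, dtype=float)
    v = np.asarray(variances, dtype=float)

    mu = np.sum(w * m)                           # mixture mean
    # Total variance = mean of component variances
    #                + variance of component means.
    var = np.sum(w * (v + m**2)) - mu**2
    return mu, np.sqrt(var) / mu


# Two equally weighted components with means 10 and 30 and variance 4.
mu, cv = mixture_mean_cv([0.5, 0.5], [10.0, 30.0], [4.0, 4.0])
```

In this toy case most of the total variance (100 out of 104) comes from the spread of the component means, the analogue of enzyme fluctuations dominating the metabolite variability.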

Figure 5

Table 3. Parameter values for simulations in Figure 5.

Figure 5A		Figure 5B	
Parameter	Value	Parameter	Value
a₁	[10, 100]	a₁	{10, 50, 50}
a₂	[10, 100]	a₂	{50, 50, 10}
b₁	1	b₁	{5, 1, 1}
b₂	1	b₂	{1, 1, 5}
ksq	10⁻¹⁰ s⁻¹	ksq	0 s⁻¹
krsq	1 s⁻¹	krsq	1 s⁻¹
h	3	h	3

5. Discussion

Cellular metabolism has traditionally been assumed to follow deterministic dynamics. This paradigm results largely from the observation that cellular metabolites are highly abundant. However, recent data show that metabolite abundances can vary substantially across single cells (Bennett et al., 2009; Imamura et al., 2009; Lemke and Schultz, 2011; Paige et al., 2012; Ibáñez et al., 2013; Yaginuma et al., 2014; Esaki and Masujima, 2015; Xiao et al., 2016; Mannan et al., 2017). It has also been shown that expression of metabolic genes is as variable as any other component of the proteome (Taniguchi et al., 2010), and thus in principle it is plausible that such enzyme fluctuations propagate to metabolites. These observations have begun to challenge the paradigm of metabolism being a deterministic process, suggesting that metabolite fluctuations may play a role in non-genetic heterogeneity.

Here we described a new computational tool to predict the statistics of metabolite fluctuations in conjunction with gene expression. The method is based on a timescale separation argument and leads to a Gaussian mixture model for the stationary distribution of cellular metabolites. Computing distributions from this approximate model is substantially faster than through stochastic simulations, as these can be extremely slow due to the multiple timescales of metabolic pathways. Our technique can therefore be employed to efficiently explore the parameter space and predict the shape of metabolite distributions in different conditions. In earlier work we showed that the product of a single metabolic reaction can be accurately described by a Poisson mixture model (Tonn et al., 2019). That approximation allowed the discovery of previously unknown regimes for metabolite distributions, including heavy-tailed distributions and various types of bimodality and multimodality. The Poisson approximation, however, is bespoke to single reactions and not valid for more complex systems. In contrast, the Gaussian mixture model discussed here can be applied to multiple kinetic mechanisms, more complex stoichiometries, as well as post-translational regulation.

An advantage of our approach is that the mixture weights can be computed offline from stochastic models for gene expression or single-cell expression data. The model is flexible in that it can readily accommodate gene expression models of various complexity. For the sake of illustration, in our examples we used the simple two-stage model for gene expression, but other models including gene regulation can also be employed (Dattani and Barahona, 2017). Particularly relevant models are those that account for enzyme co-regulation, a widespread feature of bacterial operons (Chubukov et al., 2014), which translates into correlations between expression of different pathway enzymes and the resulting metabolite abundances. A limitation of our method is that in many cases analytic solutions of the CME are not known, particularly for large models with multiple interacting genes. In such cases, the mixture weights P(E) can be approximated through stochastic simulations (Gillespie, 2007) albeit at the expense of increased computational costs. Most recently, progress in stochastic simulation of genome-scale metabolic networks (Tourigny et al., 2020) can offer an alternative route for studying fluctuations in large metabolic models.

The effectiveness of our method relies on two conditions: the separation of timescales between enzyme expression and enzyme catalysis, and the ability of the LNA to approximate the mixture components accurately. The first condition is satisfied by the vast majority of enzymes because their kinetics operate in regimes that are orders of magnitude faster than gene expression (Chubukov et al., 2014). However, the timescale separation can fail if the metabolic substrate S, typically a carbon source, cannot be assumed to be constant. Constancy is a suitable assumption in the typical case of abundant nutrient sources with low fluctuations, but our theory would need to be extended to cases in which nutrient sources become another source of variability, e.g., under fluctuations dictated by the environment (Dattani and Barahona, 2017). The second condition breaks down when the LNA fails to provide good estimates of the mixture components (Thomas and Grima, 2015; Andreychenko et al., 2017). As explained in section 2, here we have deliberately chosen to employ the LNA because it provides a simple and rapid method to compute the mixture components, P(P|E), for a broad range of metabolic pathways. Yet in cases where its assumptions do not hold, e.g., low abundance of metabolites, the LNA step in our method can be replaced by more accurate approximations. Such alternative methods include, for example, the conditional system size expansion including terms beyond the LNA, maximum entropy reconstructions using the method of conditional moments, or the finite state projection algorithm (Andreychenko et al., 2017; Gupta et al., 2017a), all of which can be readily incorporated into our mixture model strategy. These methods rely on different assumptions and their approximation quality will vary depending on the specific model parameters; in some cases, estimates for their approximation errors can be obtained with suitable methods, as discussed in a recent review on this topic (Kuntz et al., 2020).

Although our method can account for a large class of metabolic models and post-translational regulation mechanisms, there are a number of promising extensions that would broaden its utility in light of recent experimental advances. First, here we have only considered stationary distributions of metabolites, and a number of experiments have revealed cases in which metabolic heterogeneity emerges during dynamic nutrient shifts (Kotte et al., 2014; van Heerden et al., 2014; Nikolic et al., 2017). Extensions of our method to time-dependent metabolite distributions require the computation of the time-dependent solution of the CME for the enzyme expression model (Shahrezaei and Swain, 2008; Cao and Grima, 2018). As long as the dynamics of gene expression is slow enough to preserve the time scale separation, the computation of the mixture components with the LNA or other methods remains unchanged.

Another promising extension is the inclusion of transcriptional feedback regulation, a topic that has received substantial attention in the literature (Zaslaver et al., 2004; Chubukov et al., 2012; Chaves and Oyarzún, 2019; Lempp et al., 2019). In these systems, some pathway metabolites can bind to transcription factors (TF) that control enzyme expression in the same pathway. Such regulation can be included by using the conditional LNA method (Thomas and Grima, 2015) at the expense of not being able to compute the mixture weights offline anymore. Specifically, this extension would model mixture weights through more elaborate enzyme expression models in which the metabolite-TF interactions are replaced by their conditional averages, leading to an effective feedback model that requires specialized solution methods (Holehouse et al., 2020). A particularly promising application of such extended analysis is in synthetic biology, where there is a growing interest in the interplay between stochastic fluctuations and experimentally tunable parameters of molecular circuits (Briat et al., 2016; Boada et al., 2017). In particular, the use of metabolite-responsive feedback can improve robustness of strains engineered for the production of high-value metabolites (Oyarzún and Stan, 2013; Stevens and Carothers, 2015). Early results in this area (Oyarzún et al., 2015) suggest complex dependencies between metabolite fluctuations and the tunable parameters of the feedback control system. Such analyses were purely based on lumped models for metabolite-TF binding, and hence a more detailed theory could reveal novel design strategies to mitigate metabolite heterogeneity in production strains.

A number of works have sought to find links between fluctuations across layers of cellular organization, such as gene expression, metabolism and cell growth (Kiviet et al., 2014; Kotte et al., 2014; van Heerden et al., 2014; Nikolic et al., 2017; Thomas et al., 2018). But since measurement of metabolites in single cells remains technically challenging, there is a pressing need for computational methods to predict fluctuations in cellular metabolites. Our proposed method provides a systematic approach for this task, paving the way for the generation of hypotheses on the molecular sources of metabolic heterogeneity.

6. Methods

6.1. Model Simulation

Stochastic simulations were computed with Gillespie's algorithm over long simulation times (several hours) corresponding to thousands of cell cycles. The ODE models and Lyapunov equations were solved in Matlab. In all examples, the negative binomial distribution for gene expression in (12) was computed with its continuum approximation (Gamma distribution).
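
The continuum approximation of the negative binomial can be illustrated with a short numerical comparison. The sketch assumes that the Gamma distribution has shape a and scale b, which reproduces the mean ab of the negative binomial in (12); the parameter values below are illustrative only and are not taken from the tables.

```python
import numpy as np
from scipy.stats import gamma, nbinom

# Illustrative burst frequency and burst size (hypothetical values).
a, b = 50.0, 20.0

x = np.arange(0, 5001)
nb = nbinom.pmf(x, a, 1.0 / (1.0 + b))   # discrete weights
ga = gamma.pdf(x, a, scale=b)            # continuum counterpart

nb_mean = np.sum(x * nb)                 # numerical mean of the weights
ga_mean = a * b                          # mean of Gamma(shape=a, scale=b)
max_diff = np.max(np.abs(nb - ga))       # pointwise discrepancy
```

Comparing `max_diff` with the peak height of the weights gives a quick indication of how closely the continuum form tracks the discrete distribution for a given pair (a, b).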

6.2. Deterministic Rate Equations

6.2.1. Reversible Michaelis-Menten

The full set of rate equations for the reversible reaction in (6)–(8) is:

To further reduce the above system of ODEs to Equation (13) in the main text, we can substitute the conservation relation in Equation (10), i.e. C = Etotal−E, and use the fact that the substrate-enzyme complex (C) typically equilibrates much faster than the product P, which means that dC/dt ≈ 0 in the timescale of catalysis.

6.2.2. End-Product Inhibition

The full set of rate equations for the reactions in (18)–(23) is:

As in the previous example, we can use the rapid equilibrium assumption and the conservation relations in (26), i.e., Et,1 = E1 + E* + C1 and Et,2 = E2 + C2, to simplify the 7-dimensional ODE above to the 2-dimensional system in (28) of the main text.

Statements

Data availability statement

The original contributions presented in the study are included in the article/supplementary materials; further inquiries can be directed to the corresponding author.

Author contributions

MT carried out research, model simulation, model analysis, and wrote the paper. PT and MB contributed to model analysis and paper writing. DO designed the research, model analysis, and wrote the paper. All authors contributed to the article and approved the submitted version.

Funding

This work was funded by the Human Frontier Science Program through a Young Investigator Grant (RGY0076-2015) awarded to DO, a UKRI Future Leaders Fellowship (MR/T018429/1) awarded to PT, and the EPSRC Centre for Mathematics of Precision Healthcare (EP/N014529/1) awarded to MB.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Amantonico, A., Urban, P. L., and Zenobi, R. (2010). Analytical techniques for single-cell metabolomics: state of the art and trends. Anal. Bioanal. Chem. 398, 2493–2504. doi: 10.1007/s00216-010-3850-1
2. Andreychenko, A., Bortolussi, L., Grima, R., Thomas, P., and Wolf, V. (2017). Distribution Approximations for the Chemical Master Equation: Comparison of the Method of Moments and the System Size Expansion. Springer International Publishing, 39–66.
3. Bakker, E., and Swain, P. S. (2019). Estimating numbers of intracellular molecules through analysing fluctuations in photobleaching. Sci. Rep. 9, 1–13. doi: 10.1038/s41598-019-50921-7
4. Bennett, B. D., Kimball, E. H., Gao, M., Osterhout, R., Van Dien, S. J., and Rabinowitz, J. D. (2009). Absolute metabolite concentrations and implied enzyme active site occupancy in Escherichia coli. Nat. Chem. Biol. 5, 593–599. doi: 10.1038/nchembio.186
5. Binder, D., Drepper, T., Jaeger, K. E., Delvigne, F., Wiechert, W., Kohlheyer, D., et al. (2017). Homogenizing bacterial cell factories: analysis and engineering of phenotypic heterogeneity. Metab. Eng. 42, 145–156. doi: 10.1016/j.ymben.2017.06.009
6. Boada, Y., Vignoni, A., and Picó, J. (2017). Engineered control of genetic variability reveals interplay among quorum sensing, feedback regulation, and biochemical noise. ACS Synth. Biol. 6, 1903–1912. doi: 10.1021/acssynbio.7b00087
7. Briat, C., Gupta, A., and Khammash, M. (2016). Antithetic integral feedback ensures robust perfect adaptation in noisy bimolecular networks. Cell Syst. 2, 15–26. doi: 10.1016/j.cels.2016.01.004
8. Cao, Y., Gillespie, D. T., and Petzold, L. R. (2005). Accelerated stochastic simulation of the stiff enzyme-substrate reaction. J. Chem. Phys. 123:144917. doi: 10.1063/1.2052596
9. Cao, Z., and Grima, R. (2018). Linear mapping approximation of gene regulatory networks with stochastic dynamics. Nat. Commun. 9:3305. doi: 10.1038/s41467-018-05822-0
10. Chaves, M., and Oyarzún, D. A. (2019). Dynamics of complex feedback architectures in metabolic pathways. Automatica 99, 323–332. doi: 10.1016/j.automatica.2018.10.046
11. Chubukov, V., Gerosa, L., Kochanowski, K., and Sauer, U. (2014). Coordination of microbial metabolism. Nat. Rev. Microbiol. 12, 327–340. doi: 10.1038/nrmicro3238
12. Chubukov, V., Zuleta, I. A., and Li, H. (2012). Regulatory architecture determines optimal regulation of gene expression in metabolic pathways. Proc. Natl. Acad. Sci. U.S.A. 109, 5127–5132. doi: 10.1073/pnas.1114235109
13. Cornish-Bowden, A. (2004). Fundamentals of Enzyme Kinetics, 3rd Edn. London: Portland Press Ltd.
14. Dattani, J., and Barahona, M. (2017). Stochastic models of gene transcription with upstream drives: exact solution and sample path characterization. J. R. Soc. Interface 14:20160833. doi: 10.1098/rsif.2016.0833
15. Deris, J. B., Kim, M., Zhang, Z., Okano, H., Hermsen, R., Groisman, A., et al. (2013). The innate growth bistability and fitness landscapes of antibiotic resistant bacteria. Science 342:1237435. doi: 10.1126/science.1237435
16. Elf, J., and Ehrenberg, M. (2003). Fast evaluation of fluctuations in biochemical networks with the linear noise approximation. Genome Res. 13, 2475–2484. doi: 10.1101/gr.1196503
17. Esaki, T., and Masujima, T. (2015). Fluorescence probing live single-cell mass spectrometry for direct analysis of organelle metabolism. Analyt. Sci. 31, 1211–1213. doi: 10.2116/analsci.31.1211
18. Gillespie, D. T. (2007). Approximate accelerated stochastic simulation of chemically reacting systems. J. Chem. Phys. 115, 1716–1733. doi: 10.1063/1.1378322
19. Golding, I., Paulsson, J., Zawilski, S. M., and Cox, E. C. (2005). Real-time kinetics of gene activity in individual bacteria. Cell 123, 1025–1036. doi: 10.1016/j.cell.2005.09.031
20. Gupta, A., Mikelson, J., and Khammash, M. (2017a). A finite state projection algorithm for the stationary solution of the chemical master equation. J. Chem. Phys. 147:154101. doi: 10.1063/1.5006484
21. Gupta, A., Milias-Argeitis, A., and Khammash, M. (2017b). Dynamic disorder in simple enzymatic reactions induces stochastic amplification of substrate. J. R. Soc. Interface 14, 1–29. doi: 10.1098/rsif.2017.0311
22. Heinemann, M., and Zenobi, R. (2011). Single cell metabolomics. Curr. Opin. Biotechnol. 22, 26–31. doi: 10.1016/j.copbio.2010.09.008
23. Holehouse, J., Cao, Z., and Grima, R. (2020). Stochastic modeling of auto-regulatory genetic feedback loops: a review and comparative study. Biophys. J. 118, 1517–1525. doi: 10.1016/j.bpj.2020.02.016
24. Ibáñez, A. J., Fagerer, S. R., Schmidt, A. M., Urban, P. L., Jefimovs, K., Geiger, P., et al. (2013). Mass spectrometry-based metabolomics of single yeast cells. Proc. Natl. Acad. Sci. U.S.A. 110, 8790–8794. doi: 10.1073/pnas.1209302110
25. Imamura, H., Nhat, K. P. H., Togawa, H., Saito, K., Iino, R., Kato-Yamada, Y., et al. (2009). Visualization of ATP levels inside single living cells with fluorescence resonance energy transfer-based genetically encoded indicators. Proc. Natl. Acad. Sci. U.S.A. 106, 15651–15656. doi: 10.1073/pnas.0904764106
26. Kiviet, D. J., Nghe, P., Walker, N., Boulineau, S., Sunderlikova, V., and Tans, S. J. (2014). Stochasticity of metabolism and growth at the single-cell level. Nature 514, 376–379. doi: 10.1038/nature13582
27. Kotte, O., Volkmer, B., Radzikowski, J. L., and Heinemann, M. (2014). Phenotypic bistability in Escherichia coli's central carbon metabolism. Mol. Syst. Biol. 10:736. doi: 10.15252/msb.20135022
28. Kuntz, J., Oyarzún, D. A., and Stan, G. B. V. (2013). “Model reduction of genetic-metabolic networks via time scale separation,” in A Systems Theoretic Approach to Systems and Synthetic Biology, eds V. Kulkarni, G.-B. Stan, and K. Raman (Springer), 181–210.
29. Kuntz, J., Thomas, P., Stan, G. B. V., and Barahona, M. (2020). Stationary distributions of continuous-time Markov chains: a review of theory and truncation-based approximations. SIAM Rev.
30. Labhsetwar, P., Cole, J. A., Roberts, E., Price, N. D., and Luthey-Schulten, Z. A. (2013). Heterogeneity in protein expression induces metabolic variability in a modeled Escherichia coli population. Proc. Natl. Acad. Sci. U.S.A. 110, 14006–14011. doi: 10.1073/pnas.1222569110
31. Lemke, E. A., and Schultz, C. (2011). Principles for designing fluorescent sensors and reporters. Nat. Chem. Biol. 7, 480–483. doi: 10.1038/nchembio.620
32. Lempp, M., Farke, N., Kuntz, M., Freibert, S. A., Lill, R., and Link, H. (2019). Systematic identification of metabolites controlling gene expression in E. coli. Nat. Commun. 10:4463. doi: 10.1038/s41467-019-12474-1
33. Levine, E., and Hwa, T. (2007). Stochastic fluctuations in metabolic pathways. Proc. Natl. Acad. Sci. U.S.A. 104, 9224–9229. doi: 10.1073/pnas.0610987104
34. Liu, D., Mannan, A. A., Han, Y., Oyarzún, D. A., and Zhang, F. (2018). Dynamic metabolic control: towards precision engineering of metabolism. J. Indus. Microbiol. Biotechnol. 45, 535–543. doi: 10.1007/s10295-018-2013-9
35. Loftus, R. M., and Finlay, D. K. (2016). Immunometabolism: cellular metabolism turns immune regulator. J. Biol. Chem. 291, 1–10. doi: 10.1074/jbc.R115.693903
36. Lugagne, J. B., Oyarzún, D. A., and Stan, G. B. (2013). “Stochastic simulation of enzymatic reactions under transcriptional feedback regulation,” in Proceedings of the European Control Conference (Zurich), 3646–3651.
37. Mannan, A. A., Liu, D., Zhang, F., and Oyarzún, D. A. (2017). Fundamental design principles for transcription-factor-based metabolite biosensors. ACS Synth. Biol. 6, 1851–1859. doi: 10.1021/acssynbio.7b00172
38. Murabito, E., Verma, M., Bekker, M., Bellomo, D., Westerhoff, H. V., Teusink, B., et al. (2014). Monte-Carlo modeling of the central carbon metabolism of Lactococcus lactis: insights into metabolic regulation. PLoS ONE 9:e106453. doi: 10.1371/journal.pone.0106453
39. Nikolic, N., Schreiber, F., Co, A. D., Kiviet, D. J., Bergmiller, T., Littmann, S., et al. (2017). Cell-to-cell variation and specialization in sugar metabolism in clonal bacterial populations. PLoS Genet. 13:e1007122. doi: 10.1371/journal.pgen.1007122
40. Okumus, B., Landgraf, D., Lai, G. C., Bakshi, S., Arias-Castro, J. C., Yildiz, S., et al. (2016). Mechanical slowing-down of cytoplasmic diffusion allows in vivo counting of proteins in individual cells. Nat. Commun. 7, 1–11. doi: 10.1038/ncomms12130
41. Oyarzún, D. A., Lugagne, J. B., and Stan, G. B. V. (2015). Noise propagation in synthetic gene circuits for metabolic control. ACS Synth. Biol. 4, 116–125. doi: 10.1021/sb400126a
42. Oyarzún, D. A., and Stan, G. B. V. (2013). Synthetic gene circuits for metabolic control: design trade-offs and constraints. J. R. Soc. Interface 10:20120671. doi: 10.1098/rsif.2012.0671
43. Paige, J. S., Nguyen-Duc, T., Song, W., and Jaffrey, S. R. (2012). Fluorescence imaging of cellular metabolites with RNA. Science 335:1194. doi: 10.1126/science.1218298
44. Puchałka, J., and Kierzek, A. M. (2004). Bridging the gap between stochastic and deterministic regimes in the kinetic simulations of the biochemical reaction networks. Biophys. J. 86, 1357–1372. doi: 10.1016/S0006-3495(04)74207-1
45. Radzikowski, J. L., Schramke, H., and Heinemann, M. (2017). Bacterial persistence from a system-level perspective. Curr. Opin. Biotechnol. 46, 98–105. doi: 10.1016/j.copbio.2017.02.012
46. Raj, A., and van Oudenaarden, A. (2008). Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135, 216–226. doi: 10.1016/j.cell.2008.09.050
47. Rathinam, M., Petzold, L. R., Cao, Y., and Gillespie, D. T. (2003). Stiffness in stochastic chemically reacting systems: the implicit tau-leaping method. J. Chem. Phys. 119, 12784–12794. doi: 10.1063/1.1627296
48. Reid, M. A., Dai, Z., and Locasale, J. W. (2017). The impact of cellular metabolism on chromatin dynamics and epigenetics. Nat. Cell Biol. 19, 1298–1306. doi: 10.1038/ncb3629
49. Rosenfeld, N., Perkins, T. J., Alon, U., Elowitz, M. B., and Swain, P. S. (2006). A fluctuation method to quantify in vivo fluorescence data. Biophys. J. 91, 759–766. doi: 10.1529/biophysj.105.073098
50. Schmitz, A. C., Hartline, C. J., and Zhang, F. (2017). Engineering microbial metabolite dynamics and heterogeneity. Biotechnol. J. 12:1700422. doi: 10.1002/biot.201700422
51. Schreiber, F., Littmann, S., Lavik, G., Escrig, S., Meibom, A., Kuypers, M. M. M., et al. (2016). Phenotypic heterogeneity driven by nutrient limitation promotes growth in fluctuating environments. Nat. Microbiol. 1:16055. doi: 10.1038/nmicrobiol.2016.55
52. Shahrezaei, V., and Swain, P. S. (2008). Analytical distributions for stochastic gene expression. Proc. Natl. Acad. Sci. U.S.A. 105, 17256–17261. doi: 10.1073/pnas.0803850105
53. Shan, Y., Gandt, A. B., Rowe, S. E., Deisinger, J. P., Conlon, B. P., and Lewis, K. (2017). ATP-dependent persister formation in Escherichia coli. mBio 8, 1–14. doi: 10.1128/mBio.02267-16
54. Şimşek, E., and Kim, M. (2018). The emergence of metabolic heterogeneity and diverse growth responses in isogenic bacterial cells. ISME J. 12, 1199–1209. doi: 10.1038/s41396-017-0036-2
55. Stevens, J. T., and Carothers, J. M. (2015). Designing RNA-based genetic control systems for efficient production from engineered metabolic pathways. ACS Synth. Biol. 4, 107–115. doi: 10.1021/sb400201u
56. Swain, P. S., Elowitz, M. B., and Siggia, E. D. (2002). Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc. Natl. Acad. Sci. U.S.A. 99, 12795–12800. doi: 10.1073/pnas.162041399
57. Takhaveev, V., and Heinemann, M. (2018). Metabolic heterogeneity in clonal microbial populations. Curr. Opin. Microbiol. 45, 30–38. doi: 10.1016/j.mib.2018.02.004
58. Taniguchi, Y., Choi, P. J., Li, G. W., Chen, H., Babu, M., Hearn, J., et al. (2010). Quantifying E. coli proteome and transcriptome with single-molecule sensitivity in single cells. Science 329, 533–538. doi: 10.1126/science.1188308
59. Thattai, M., and van Oudenaarden, A. (2001). Intrinsic noise in gene regulatory networks. Proc. Natl. Acad. Sci. U.S.A. 98, 8614–8619. doi: 10.1073/pnas.151588598
60. Thomas, P., and Grima, R. (2015). Approximate probability distributions of the master equation. Phys. Rev. E 92:012120. doi: 10.1103/PhysRevE.92.012120
61. Thomas, P., Popović, N., and Grima, R. (2014). Phenotypic switching in gene regulatory networks. Proc. Natl. Acad. Sci. U.S.A. 111, 6994–6999. doi: 10.1073/pnas.1400049111
62. Thomas, P., Terradot, G., Danos, V., and Weiße, A. Y. (2018). Sources, propagation and consequences of stochasticity in cellular growth. Nat. Commun. 9, 1–11. doi: 10.1038/s41467-018-06912-9
63. Tonn, M. K. (2020). Stochastic modelling and analysis of metabolic heterogeneity in single cells (Ph.D. thesis). Imperial College London, London, United Kingdom.
64. Tonn, M. K., Thomas, P., Barahona, M., and Oyarzún, D. A. (2019). Stochastic modelling reveals mechanisms of metabolic heterogeneity. Commun. Biol. 2:108. doi: 10.1038/s42003-019-0347-0
65. Tourigny, D., Goldberg, A., and Karr, J. (2020). Simulating single-cell metabolism using a stochastic flux-balance analysis algorithm. bioRxiv. doi: 10.1101/2020.05.22.110577
66. van Heerden, J. H., Wortel, M. T., Bruggeman, F. J., Heijnen, J. J., Bollen, Y. J. M., Planqué, R., et al. (2014). Lost in transition: start-up of glycolysis yields subpopulations of nongrowing cells. Science 343:1245114. doi: 10.1126/science.1245114
67. van Kampen, N. G. (1992). Stochastic Processes in Physics and Chemistry. Amsterdam: Elsevier.
68. Vilhena, C., Kaganovitch, E., Shin, J. Y., Grünberger, A., Behr, S., Kristoficova, I., et al. (2018). A single-cell view of the BtsSR/YpdAB pyruvate sensing network in Escherichia coli and its biological relevance. J. Bacteriol. 200, 1–13. doi: 10.1128/JB.00536-17
69. Wehrens, M., Büke, F., Nghe, P., and Tans, S. J. (2018). Stochasticity in cellular metabolism and growth: approaches and consequences. Curr. Opin. Syst. Biol. 8, 131–136. doi: 10.1016/j.coisb.2018.02.006
70. Weiße, A. Y., Oyarzún, D. A., Danos, V., and Swain, P. S. (2015). Mechanistic links between cellular trade-offs, gene expression, and growth. Proc. Natl. Acad. Sci. U.S.A. 112, E1038–E1047. doi: 10.1073/pnas.1416533112
71. Xiao, Y., Bowen, C. H., Liu, D., and Zhang, F. (2016). Exploiting non-genetic, cell-to-cell variation for enhanced biosynthesis. Nat. Chem. Biol. 12, 339–344. doi: 10.1038/nchembio.2046
72. Yaginuma, H., Kawai, S., Tabata, K. V., Tomiyama, K., Kakizuka, A., Komatsuzaki, T., et al. (2014). Diversity in ATP concentrations in a single bacterial cell population revealed by. Sci. Rep. 4:6522. doi: 10.1038/srep06522
73. Zaslaver, A., Mayo, A. E., Rosenberg, R., Bashkin, P., Sberro, H., Tsalyuk, M., et al. (2004). Just-in-time transcription program in metabolic pathways. Nat. Genet. 36, 486–491. doi: 10.1038/ng1348

Summary

Keywords

metabolic variability, stochastic gene expression, metabolic modeling, single-cell modeling, mixture model analysis

Citation

Tonn MK, Thomas P, Barahona M and Oyarzún DA (2020) Computation of Single-Cell Metabolite Distributions Using Mixture Models. Front. Cell Dev. Biol. 8:614832. doi: 10.3389/fcell.2020.614832

Received

07 October 2020

Accepted

26 November 2020

Published

22 December 2020

Volume

8 - 2020

Edited by

Ankur Sharma, Genome Institute of Singapore, Singapore

Reviewed by

Yogesh Goyal, University of Pennsylvania, United States; Alejandro Vignoni, Max Planck Institute of Molecular Cell Biology and Genetics, Germany

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Diego A. Oyarzún D.Oyarzun@ed.ac.uk

This article was submitted to Epigenomics and Epigenetics, a section of the journal Frontiers in Cell and Developmental Biology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
