Computer-Aided Whole-Cell Design: Taking a Holistic Approach by Integrating Synthetic With Systems Biology

Marucci, Lucia; Barberis, Matteo; Karr, Jonathan; Ray, Oliver; Race, Paul R.; de Souza Andrade, Miguel; Grierson, Claire; Hoffmann, Stefan Andreas; Landon, Sophie; Rech, Elibio; Rees-Garbutt, Joshua; Seabrook, Richard; Shaw, William; Woods, Christopher

doi:10.3389/fbioe.2020.00942

PERSPECTIVE article

Front. Bioeng. Biotechnol., 07 August 2020

Sec. Synthetic Biology

Volume 8 - 2020 | https://doi.org/10.3389/fbioe.2020.00942

Lucia Marucci (1,2,3)*
Matteo Barberis (4,5,6)†*
Jonathan Karr (7)†*
Oliver Ray (8)†*
Paul R. Race (3,9)†*
Miguel de Souza Andrade (10,11)
Claire Grierson (3,12)*
Stefan Andreas Hoffmann (13)
Sophie Landon (1,3)
Elibio Rech (10)*
Joshua Rees-Garbutt (3,12)
Richard Seabrook (14)*
William Shaw (15)
Christopher Woods (3,16)*

1. Department of Engineering Mathematics, University of Bristol, Bristol, United Kingdom
2. School of Cellular and Molecular Medicine, University of Bristol, Bristol, United Kingdom
3. Bristol Centre for Synthetic Biology (BrisSynBio), University of Bristol, Bristol, United Kingdom
4. Systems Biology, School of Biosciences and Medicine, Faculty of Health and Medical Sciences, University of Surrey, Guildford, United Kingdom
5. Centre for Mathematical and Computational Biology, CMCB, University of Surrey, Guildford, United Kingdom
6. Synthetic Systems Biology and Nuclear Organization, Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, Netherlands
7. Icahn Institute for Data Science and Genomic Technology, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
8. Department of Computer Science, University of Bristol, Bristol, United Kingdom
9. School of Biochemistry, University of Bristol, Bristol, United Kingdom
10. Brazilian Agricultural Research Corporation/National Institute of Science and Technology – Synthetic Biology, Brasília, Brazil
11. Department of Cell Biology, Institute of Biological Sciences, University of Brasília, Brasília, Brazil
12. School of Biological Sciences, University of Bristol, Bristol, United Kingdom
13. Manchester Institute of Biotechnology, The University of Manchester, Manchester, United Kingdom
14. Elizabeth Blackwell Institute for Health Research (EBI), University of Bristol, Bristol, United Kingdom
15. Department of Bioengineering, Imperial College London, London, United Kingdom
16. School of Chemistry, University of Bristol, Bristol, United Kingdom

Abstract

Computer-aided design (CAD) for synthetic biology promises to accelerate the rational and robust engineering of biological systems. It requires both detailed and quantitative mathematical and experimental models of the processes to (re)design biology, and software and tools for genetic engineering and DNA assembly. Ultimately, the increased precision in the design phase will have a dramatic impact on the production of designer cells and organisms with bespoke functions and increased modularity. CAD strategies require quantitative models of cells that can capture multiscale processes and link genotypes to phenotypes. Here, we present a perspective on how whole-cell, multiscale models could transform design-build-test-learn cycles in synthetic biology. We show how these models could significantly aid in the design and learn phases while reducing experimental testing by presenting case studies spanning from genome minimization to cell-free systems. We also discuss several challenges for the realization of our vision. The possibility to describe and build whole cells in silico offers an opportunity to develop increasingly automated, precise and accessible CAD tools and strategies.

Introduction

Whole-cell models (WCMs) are state-of-the-art Systems Biology formalisms: they aim to represent and integrate all cellular functions within a single computational framework, ultimately enabling a holistic and quantitative understanding of cell biology (Tomita, 2001; Karr et al., 2015a). Quantitative and high-throughput in silico experiments generated from WCMs promise to significantly shorten the distance between hypothesis/design formulation and testing (Carrera and Covert, 2015).

While simplified models for specific cellular functions were first developed over 30 years ago [e.g., gene expression regulation (McAdams and Arkin, 1997), signaling (Morton-Firth and Bray, 1998) and metabolic pathways (Cornish-Bowden and Hofmeyr, 1991), cell growth (Shu and Shuler, 1989) and the cell cycle (Goldbeter, 1991; Tyson, 1991; Novak and Tyson, 1993)], the first WCM, the E-Cell model, was only derived in the 1990s for Mycoplasma genitalium, which has the smallest genome among freely living organisms (Tomita et al., 1999). The so-called virtual self-surviving cell (SSC) model is partially stochastic; it includes only a subset of protein-coding genes and enables dynamic simulations which encompass various subcellular processes, including enzymatic reactions, complex formation and substance translocation. In parallel, the first genome-scale metabolic models (GSMMs) were developed by Palsson’s group (Varma and Palsson, 1994) using flux balance analysis (FBA) in the 1990s.

More recently, hundreds of GSMMs have been reconstructed for different organisms, with an increasing number of represented genes (McCloskey et al., 2013; Yilmaz and Walhout, 2017; Mendoza et al., 2019). GSMMs have been complemented with a mathematical description of other processes, such as transcription, translation, and signaling (Lee et al., 2008; Thiele et al., 2009). Less than a decade ago a more complete, hybrid WCM, representing all genes and molecular functions known for an organism, was reported by Covert’s group (Karr et al., 2012). In this pioneering work, Karr and colleagues integrated 28 sub-models to represent one cell cycle of M. genitalium; each sub-model is represented with a distinct formalism, including ordinary differential equations (ODEs), FBA, stochastic simulations and Boolean rules.
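The multi-algorithm idea described above can be illustrated with a deliberately tiny sketch. It is not the Karr et al. (2012) implementation: the three toy sub-models, their rate constants, the shared `state` dictionary and all function names are assumptions made purely for illustration. Each sub-model uses a different formalism (a capped-flux rule standing in for a constraint-based model, a stochastic event, and a Boolean rule) and all of them read and update one shared cell state at every time step.

```python
import random

def regulation_step(state, dt):
    """Toy Boolean rule: the gene is on only while nutrient remains."""
    state["gene_on"] = state["nutrient"] > 0.0

def metabolism_step(state, dt):
    """Toy stand-in for a constraint-based sub-model: capped nutrient uptake."""
    uptake = min(state["nutrient"], 5.0 * dt)  # assumed maximum uptake rate
    state["nutrient"] -= uptake
    state["atp"] += 2.0 * uptake               # assumed ATP yield per nutrient unit

def transcription_step(state, dt, rng):
    """Toy stochastic sub-model: may produce one mRNA per step, costing one ATP."""
    if state["gene_on"] and state["atp"] >= 1.0 and rng.random() < 0.5 * dt:
        state["mrna"] += 1
        state["atp"] -= 1.0

def simulate(n_steps=100, dt=1.0, seed=0):
    """Advance every sub-model once per time step on a shared state."""
    rng = random.Random(seed)
    state = {"nutrient": 100.0, "atp": 0.0, "mrna": 0, "gene_on": True}
    for _ in range(n_steps):
        regulation_step(state, dt)
        metabolism_step(state, dt)
        transcription_step(state, dt, rng)
    return state

if __name__ == "__main__":
    print(simulate())
```

The design choice worth noting is the shared state: sub-models never call each other directly, so a formalism can be swapped without touching the others, at the price of assuming that sub-models are approximately independent within one short time step.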

Substantial research and effort are still needed to improve WCMs’ descriptive power and to increase the complexity of organisms they can represent. Developing a WCM is a challenging task, which requires the collection of extensive experimental data, integration of sub-cellular models and in silico/in vivo model validation. A complete WCM should ideally integrate multiscale interactions at the cellular level (Karr et al., 2012; King et al., 2016) while accounting for the overall cellular structure (Betts and Russell, 2007), the dynamic structure of molecular interactions (Noske et al., 2008; McGuffee and Elcock, 2010; Yu et al., 2016), and the spatial compartmentalization of the subcellular components (Ander et al., 2004; Takahashi et al., 2005; Thul et al., 2017). Ensuring an accurate representation of all of the cellular processes across organisms of increasing complexity is highly challenging (Bouhaddou et al., 2018; Singla et al., 2018; Szigeti et al., 2018). It is therefore not surprising that, to date, only the M. genitalium and, very recently, the E. coli (Macklin et al., 2020) WCMs have been released, although several other WCMs are currently under development¹. We refer the reader to recent efforts which provide an overview of the state-of-the-art in the development of WCMs (Goldberg et al., 2018; Feig and Sugita, 2019).

Here, we focus on the enormous potential we believe WCMs have for design-build-test cycles integrating synthetic with systems biology (Figure 1). While the applications are diverse, they share a high degree of complexity which would require extensive trial and error experimental cycles in the absence of robust computational design algorithms based on predictive models. We conclude by considering relevant challenges that must be addressed by interdisciplinary communities to fully realize our vision, discussing future directions for integrating WCMs through synthetic and systems biology.

FIGURE 1

Whole-Cell Design Strategies in Synthetic Biology

Model Granularity of Gene Network (re)Design

Mathematical models can be instrumental for the (re)design of network circuits that recapitulate definite biological functions. Knowledge of regulatory mechanisms in biological pathways has been gained by considering living systems as a composition of functional modules, which are investigated through minimal computer models. Examples include controllable oscillators (Marucci et al., 2009; Purcell et al., 2010, 2013; Tomazou et al., 2018), circadian clocks (Gerard et al., 2009; Ananthasubramaniam et al., 2020), signaling networks (Prescott and Abel, 2017), the metabolism (Castellanos et al., 2004; Pandit et al., 2017), and transcriptional regulation (Carrera et al., 2009). Existing minimal and detailed computer models span a broad range of granularity in their biochemical details. However, one may expect that, if the core design of a minimal and a detailed model is similar, their general properties will match.

The understanding of a living organism at a system’s level may be reached through decomposing it into functional modules or modular circuits (Hartwell et al., 1999; Kitano, 2002; Ravasz et al., 2002). The capability to sustain viability through autonomously generated offspring is essential. It is therefore a feature that WCMs shall account for through modeling cell division, which is intimately integrated with various layers of cellular regulation (metabolism, signaling, gene regulation, transcription, etc.). A number of minimal models have been developed for the eukaryotic cell cycle by Barberis’, Tyson’s and Novák’s groups (Battogtokh and Tyson, 2004; Barberis et al., 2012; Gerard et al., 2013, 2015; Linke et al., 2017; Mondeel et al., 2020).

Currently, the majority of multiscale models (not WCMs) lack components able to bridge cellular networks or functions (cell cycle, metabolism, signaling, gene regulation, etc.). Identification of hubs, i.e., elements with high connectivity in the cellular environment that integrate cellular networks, is a critical feature of WCMs. Transcription factors have recently been identified as hubs that integrate multiscale networks, potentially connecting the cell cycle to metabolism (Mondeel et al., 2019), and can be among the parts of a system that influence its state as a whole. Multiscale frameworks coupling networks of differing granularity are being developed, by identifying the relevant regulations occurring among common network nodes and through the use of different mathematical formalisms (van der Zee and Barberis, 2019). These and other strategies are also being developed to integrate networks of cellular functional modules (Prescott et al., 2015). Together with the identification of networks underlying the cell’s autonomous oscillations, these strategies can rationalize the proper timing of offspring generation accounted for by WCMs.

Designing synthetic gene networks by modeling and integrating them within WCM formalisms [as in Purcell et al. (2013)] could be critical to investigate how gene expression correlates with codon usage, explore possible cell burden effects (Borkowski et al., 2016), and predict modularity of synthetic gene networks and tools to modulate gene expression across different chassis (Way et al., 2014; Pedone et al., 2019; Gomide et al., 2020).

Design and Engineering of Reduced Genomes

Minimal genomes can be defined as reduced genomes containing only the genetic material which is essential for a cell to reproduce (Glass et al., 2017). Studying and engineering minimal genomes can be instrumental both to understand the most essential tasks a cell must perform to sustain life, and to obtain optimal chassis for synthetic biology applications, with reduced cell burden and superior robustness (Moya et al., 2009; Hutchison et al., 2016; Ceroni and Ellis, 2018; Mol et al., 2018; Landon et al., 2019).

Exhaustive experimental characterization of a minimized genome is unfeasible: even for an organism as small as M. genitalium (0.58 Mb and 525 genes), there are thousands of possible combinations of gene knockouts to be performed. Of note, this figure is most probably an underestimate, given that the order in which gene deletions are performed can alter the resulting phenotypes (Gawand et al., 2015). Genome-scale computational models of cells could be instrumental to fully understand the dynamic and context-dependent nature of gene essentiality (Rancati et al., 2018), and to rationally design minimized genomes in silico. Computer-aided minimal genome engineering could significantly cut the time and cost of genome reduction compared to current approaches based on extensive experimental iterations (Posfai et al., 2006; Iwadate et al., 2011; Hirokawa et al., 2013; Hutchison et al., 2016; Zhou et al., 2016; Reuss et al., 2017; Breuer et al., 2019).
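To give a sense of scale for the combinatorial argument above, the short calculation below counts knockout sets for a 525-gene genome. It is only an upper-bound illustration: it assumes, unrealistically, that every gene can be considered for deletion independently and it ignores deletion order, which the text notes would enlarge the space further.

```python
from math import comb

N_GENES = 525  # M. genitalium gene count quoted in the text

single_knockouts = comb(N_GENES, 1)   # 525
double_knockouts = comb(N_GENES, 2)   # 137,550 unordered pairs
triple_knockouts = comb(N_GENES, 3)   # 23,979,550 unordered triples
all_knockout_sets = 2 ** N_GENES      # every possible subset of genes

if __name__ == "__main__":
    print("singles:", single_knockouts)
    print("pairs:  ", double_knockouts)
    print("triples:", triple_knockouts)
    print("digits in the count of all subsets:", len(str(all_knockout_sets)))
```

Even restricted to pairs, the number of strains is far beyond what can be built and assayed by hand, which is the practical motivation for screening candidate deletions in silico first.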

To the best of our knowledge, two top-down genome reduction approaches have been proposed so far based on genome-scale models. The MinGenome algorithm applies a mixed-integer linear programming (MILP) algorithm to a GSMM of Escherichia coli, using information pertaining to essential genes and synthetic lethal pairs within the optimization (Wang and Maranas, 2018). In contrast, Minesweeper and GAMA are top-down genome minimization algorithms based on the M. genitalium WCM. They exploit a divide-and-conquer approach and a biased genetic algorithm, respectively, to iteratively simulate reduced genomes (Rees-Garbutt et al., 2020); their in silico predictions have not been tested in the laboratory yet.
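The divide-and-conquer strategy mentioned above can be sketched as follows. This is not the Minesweeper algorithm itself: the function names, the 16-gene toy genome and, above all, the `toy_viable` oracle are assumptions for illustration. In a real setting the oracle would be a full whole-cell simulation reporting whether the in silico cell still divides.

```python
def minimise(genome, candidates, viable):
    """Return genes from `candidates` that can be deleted from a viable `genome`.

    A whole group is deleted at once when the result is viable; otherwise the
    group is split in half and each half is tried in turn, with the second half
    tested on the genome already reduced by the first.
    """
    if not candidates:
        return set()
    if viable(genome - set(candidates)):
        return set(candidates)
    if len(candidates) == 1:
        return set()  # a single gene that cannot be removed
    mid = len(candidates) // 2
    deleted_left = minimise(genome, candidates[:mid], viable)
    deleted_right = minimise(genome - deleted_left, candidates[mid:], viable)
    return deleted_left | deleted_right

# Hypothetical toy organism: three essential genes and one synthetic lethal pair.
ESSENTIAL = {0, 3, 7}
SYNTHETIC_LETHAL = (1, 2)  # the cell dies only if both genes are missing

def toy_viable(genome):
    """Stand-in for a WCM simulation: True if the toy cell can still divide."""
    has_backup = any(gene in genome for gene in SYNTHETIC_LETHAL)
    return ESSENTIAL <= genome and has_backup

full_genome = set(range(16))
deleted = minimise(full_genome, sorted(full_genome), toy_viable)
reduced_genome = full_genome - deleted

if __name__ == "__main__":
    print("reduced genome:", sorted(reduced_genome))
```

Because each accepted deletion is tested on the genome as already reduced, the final genome is guaranteed viable under the oracle, although the result depends on gene order and need not be the smallest possible genome.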

GSMM-based genome reduction algorithms such as MinGenome or analogous, adaptable metaheuristic techniques [e.g., (Burgard et al., 2003; Tang et al., 2015; Mutturi, 2017)] are currently more broadly applicable across organisms given the large availability of these formalisms. Still, as more WCMs become available, we expect WCM-based genome reduction algorithms to provide superior predictions of cellular processes and genetic interactions, thanks to their richness of multiscale cellular process representation.

Design and Prototyping of Cell-Free Systems

Cell-free transcription/translation systems, based on crude cellular extracts, are a valuable platform to address fundamental biological questions in a controllable and reproducible way. In recent years, the decrease of costs associated with this technology and significant improvements in synthesis yield capabilities (Calhoun and Swartz, 2005) have made cell-free systems increasingly popular in synthetic biology for the prototyping and testing of engineered biological parts (McCloskey et al., 2013; Reuss et al., 2017; Yilmaz and Walhout, 2017; Mendoza et al., 2019) and networks (Noireaux et al., 2003; Siegal-Gaskins et al., 2014; Takahashi et al., 2015). As the possible applications of cell-free systems grow [see (Silverman et al., 2020) for a recent review], mathematical models are being developed to quantitatively formalize how biological processes perform within cell-free platforms (Koch et al., 2018).

So far, deterministic models (ODEs, or constraint-based) have been proposed to describe specific processes within cell-free platforms such as transcription and translation (Karzbrun et al., 2011; Stogbauer et al., 2012; Siegal-Gaskins et al., 2014), resource competition (Underwood et al., 2005; Borkowski et al., 2018; Matsuura et al., 2018; Moore et al., 2018), and metabolism (Vilkhovoy et al., 2018). The integration of mathematical formalisms across scales for cell-free platforms, building toward WCMs, could be highly beneficial to both facilitate de novo design of circuits, and to quantitatively compare in vitro cell-free products with their in vivo counterparts.
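As a minimal example of the deterministic transcription and translation models referred to above, the sketch below integrates a two-variable ODE system with forward Euler. The model structure (constant transcription, first-order translation and decay) and every parameter value are assumptions chosen for illustration, not values from the cited studies; published cell-free models typically add resource depletion and saturation terms.

```python
def simulate_expression(k_tx=2.0, k_tl=5.0, d_m=0.2, d_p=0.05,
                        t_end=400.0, dt=0.01):
    """Integrate a toy gene-expression model with forward Euler.

    dm/dt = k_tx - d_m * m        (mRNA: synthesis minus decay)
    dp/dt = k_tl * m - d_p * p    (protein: translation minus decay)
    All rates are hypothetical and in arbitrary units.
    """
    m, p = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        dm = k_tx - d_m * m
        dp = k_tl * m - d_p * p
        m += dt * dm
        p += dt * dp
    return m, p

def steady_state(k_tx=2.0, k_tl=5.0, d_m=0.2, d_p=0.05):
    """Analytical steady state of the same toy model, used as a check."""
    m_ss = k_tx / d_m
    return m_ss, k_tl * m_ss / d_p

if __name__ == "__main__":
    print("simulated: ", simulate_expression())
    print("analytical:", steady_state())
```

Comparing the simulated end point with the analytical steady state is a cheap sanity check of the integrator before such a sub-model is coupled to others.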

Whole-Cell Biosensor Design and Testing

Biosensors are analytical devices which can convert a biochemical reaction into a measurable signal. The recognition unit in a biosensor can be composed of whole cells, nucleic acids, enzymes, proteins, antibodies or combinations thereof. Synthetic biology has significantly accelerated biosensor development; new generation whole-cell biosensors (i.e., sensors implemented throughout living cells) have been engineered, allowing, for example: arsenic detection (Diesel et al., 2009), detection of pollutants and antibiotics (van der Meer and Belkin, 2010), microbial detection in industrial settings (Lu et al., 2013) and in vivo diagnostic applications [e.g., detection of environmental signals in the gut (Kotula et al., 2014) and diagnosis of liver metastases (Danino et al., 2015); see (Slomovic et al., 2015) for an overview].

The application of WCMs to the design, prototyping and testing of whole-cell biosensors could suggest rational approaches to tune their sensitivity, stability, and dynamic range while facilitating the choice of the ideal chassis and, if needed, guide its re-engineering to optimize biosensor performance (Hicks et al., 2020). If WCMs become available for different chassis and entire organisms, they could also support the design of optimized targeted delivery of genetically encoded biosensors.

Industrial Implications of Whole-Cell Models

Although the intellectual merit of pursuing a computer-aided whole-cell design approach is unquestioned, it is clear that the success of this endeavor will ultimately be judged by its impact on science, medicine, and industry. The increasing drive of computer-aided designs (CADs) toward “green” chemistry approaches, allied to increases in gene synthesis speed and capability and associated cost reductions, is making biosynthesis an increasingly appealing route for the manufacture of high-value chemicals (El Karoui et al., 2019). This includes a plethora of opportunities across the pharmaceutical, agrochemical, commodity chemical, and materials sectors, amongst others.

A major challenge, however, remains the development of robust, scalable microbial chassis, whose metabolic processes can be predictably tuned for a desired outcome (Xu et al., 2020). Currently, chassis choice is largely restricted to a subset of genetically tractable microorganisms, whose physiology and performance during fermentation are well understood, and for which the molecular genetic tools required for their manipulation exist. Chassis optimization to date has relied exclusively on incremental, stepwise improvements in desired host strain characteristics, including growth rate, feedstock utilization, and product yield (Calero and Nikel, 2019). For these reasons, the process of chassis optimization remains prohibitively slow and expensive, accounting in part for the paucity of high-value small molecules that are currently manufactured using synthetic biology processes. Targeted manipulations often lead to unanticipated off-target effects, linked to the co-dependency of metabolic processes, which generally function in concert within interdependent cellular networks (Woolston et al., 2013): perturbations may compromise rather than enhance desirable characteristics, leading to undesired outcomes. Clearly, robust, predictable WCMs represent an attractive solution to the problem of chassis optimization, affording a catch-all tool that can be used to unpick dependencies and ensure that performance criteria can be met.

Additionally, the complexities associated with population heterogeneity during chassis fermentation must be resolved (Danchin, 2012). For fermentation-based industrial processes to be tractable, product yields must be sufficiently high to make biosynthesis financially viable. The emergence of “cheaters” or slow-growers within microbial populations should be tackled with tunable regulatory processes that operate throughout populations. The introduction of such characteristics is a major challenge to conventional chassis design approaches. WCM-driven approaches could more easily implement and test these processes.

Critical to the success of a computer-aided whole-cell design approach is the quality of the employed model (Fernandez-Castane et al., 2014). Microbial systems with small genomes represent a compelling entry point for study, with model development possibly being facilitated by ongoing studies focused on establishing the core constituents of a functional genome. These studies are in part driven by genome minimization experiments, which in turn can be used to further refine model performance. Importantly, fundamental gaps remain in our understanding of microbial metabolic processes, and this will unquestionably hinder progress (Price et al., 2018). However, the capacity of WCMs to predict previously unidentified metabolic dependencies should be viewed as an acid test of model validity. Indeed, GSMMs often fail due to their inability to account for metabolic dependencies, a feature which has led to skepticism within industrial circles, questioning the value of such models. Whole-cell approaches offer a mechanism to circumvent this issue. This is of particular significance when developing chassis for “non-natural” products whose chemistries sit outside those of metabolites found in nature (Calero and Nikel, 2019). Expanding the metabolic capacity of chassis organisms to deliver such novel products risks introducing additional complexities, including excessive depletion of core metabolite pools or the generation of toxic products or intermediates. Design approaches driven by WCMs are uniquely placed to identify such issues and provide a route to their circumvention.

The capacity to design in explicit control over cellular behavior is also critical for industrial adoption of model-derived chassis. It can be argued that the ability to regulate cellular processes is as important as defining the processes themselves. Tunable regulatory systems must afford a degree of both intrinsic and extrinsic control. Synthetic biology-based approaches for constructing genetic circuitry are now placing us on a path to broad-reaching cellular regulation, though issues still exist. These systems are often insufficiently orthogonal, with bespoke designs required for different chassis due to variations in core metabolic processes (Pandit et al., 2017). Again, whole-cell design approaches offer a solution to this issue, as such systems can be predefined and tested for functionality in silico prior to undertaking costly lab experimentation.

What’s Next? Going Beyond the Prototype

In recent years, advances in genomic measurement technologies for data generation, the establishment of data repositories, and the development of WCM simulation platforms have significantly facilitated the derivation of WCMs [see (Goldberg et al., 2018) for a review]. Nevertheless, the implementation of WCM-based design-build-test cycles for genome-scale engineering requires further challenges to be addressed (Bartley et al., 2020).

If a model is to be used for the design and prototyping of an engineered living system, the model needs to be reliable. Even for a simple organism, the number of kinetic parameters rises as the complexity and the level of detail of a mathematical model increase; constraining parameters thus becomes harder and requires extensive experimental data. Mathematical models can be used to produce predictions of missing data; however, they often abstract physical processes using simplifying assumptions which might hold only in specific conditions (Babtie and Stumpf, 2017). To set the 1,462 quantitative parameters of the M. genitalium WCM, values from related organisms were incorporated due to a lack of organism-specific data (Macklin et al., 2014); a combination of parameter values reported from previous experiments and numerical optimization on a reduced model was performed. While, ideally, we would like to measure all kinetic parameters directly from experiments, we still lack the ability to measure each state in individual cells over time, and across all possible environmental conditions. A combination of direct experimental estimation and parameter inference will likely be needed for genome-scale formalisms and WCMs.

Sensitivity analysis, usually performed by perturbing parameters to understand how uncertainties affect the model outputs (Erguler and Stumpf, 2011), can become extremely computationally expensive when applied to genome-scale models. Alternatively, statistical approaches such as those based on Bayesian methods (Vernon et al., 2018) or the Fisher information matrix (Rand, 2008) could be carefully carried out at least at the sub-model level, and possibly scaled up to WCMs. The Dialogue for Reverse Engineering Assessments and Methods (DREAM8) parameter estimation challenge (Karr et al., 2015b) was organized to develop new parameter estimation techniques specific for WCMs. It suggested possible interesting avenues for WCM parameterization (i.e., model reduction and a combination of differential evolution and random forests), and highlighted that the availability of comprehensive data is critical to ensure the model is practically identifiable (Ashyraliyev et al., 2009), and to calibrate WCMs.
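Perturbation-based sensitivity analysis, as described above, can be sketched on a toy model. The example below computes normalized local sensitivities by central finite differences; the four-parameter steady-state expression and the parameter values are assumptions for illustration only, standing in for a far more expensive whole-cell simulation.

```python
def steady_state_protein(params):
    """Hypothetical toy output: steady-state protein level of a two-step model."""
    return params["k_tx"] * params["k_tl"] / (params["d_m"] * params["d_p"])

def relative_sensitivities(model, params, rel_step=1e-4):
    """Normalized local sensitivities (dy/dtheta) * (theta / y).

    Each parameter is perturbed up and down by a small relative step
    (central finite difference), so the cost is two model runs per parameter.
    """
    base = model(params)
    result = {}
    for name, value in params.items():
        h = rel_step * value
        up = dict(params, **{name: value + h})
        down = dict(params, **{name: value - h})
        derivative = (model(up) - model(down)) / (2.0 * h)
        result[name] = derivative * value / base
    return result

toy_params = {"k_tx": 2.0, "k_tl": 5.0, "d_m": 0.2, "d_p": 0.05}
sensitivities = relative_sensitivities(steady_state_protein, toy_params)

if __name__ == "__main__":
    for name, s in sensitivities.items():
        print(f"{name}: {s:+.4f}")
```

With two model runs per parameter, the 1,462 parameters quoted above would already require 2,924 simulations for a single local scan at one operating point, which illustrates why the cost becomes a concern at whole-cell scale.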

Researchers have started to collect data needed for WCM development into public repositories [e.g., (Wittig et al., 2012; Kolesnikov et al., 2015; Sajed et al., 2016; UniProt Consortium, 2018; Caspi et al., 2020)]; still, the data needed to derive and fit WCMs are dispersed across many repositories and publications and often not annotated or normalized, ultimately requiring a massive manual effort. Federated archives of repositories, such as the PDB-Dev system to deposit Integrative/Hybrid models and corresponding data (Burley et al., 2017), also exist and might be well placed to archive and disseminate both data and models, while enabling different researchers to attempt alternative modeling/parameterization approaches. Covert’s group developed the WholeCellKB database (Karr et al., 2013) to organize the quantitative measurements (over 1,400) from which the M. genitalium WCM was derived; it would be ideal to enable automatic access and querying in such databases.

To enhance WCM reproducibility and collaboration, new standards and simulation software are also needed (Medley et al., 2016). Researchers should invest efforts to use and expand the capabilities of standard formats such as the Systems Biology Markup Language (SBML) (Hucka et al., 2003) and the Systems Biology Graphical Notation (SBGN) (Le Novere et al., 2009) to be suitable for WCMs. For example, several aspects of the M. genitalium WCM cannot be represented by SBML, such as the multi-algorithmic nature of the model (Waltemath et al., 2016). Further development of standard modeling formats is needed to enable reproducible WCM simulations, e.g., by including in the SBML Hierarchical Model Composition package ontologies which could represent the algorithm needed for specific sub-models (Courtot et al., 2011). In the context of synthetic biology applications, we believe it would be appropriate and beneficial to report and deposit data related to various iterations of WCM-generated in silico predictions, in vivo testing and possible model/design refinement; this would establish the predictive power of WCMs and illuminate steps to make design-build-test-learn cycles more effective.

It is also important to consider the structural uncertainties in the model, which depend on model assumptions. While, for certain sets of models (e.g., small ODE systems for signaling pathways), likelihood- and Bayesian-based approaches have been proposed for model selection (Wilkinson, 2007; Kirk et al., 2013) and semidefinite programming for model invalidation (Anderson and Papachristodoulou, 2009), no suitable techniques for WCMs have been proposed to date.

We foresee that automation will play a fundamental role in the derivation of WCMs for eukaryotic organisms and in their application to design complex processes. Ideally, we would like to introduce automation at different stages, such as data extraction from the literature, model derivation, and model/data integration both within the model fitting and validation steps, and when comparing in silico design prediction with in vivo tests (Bartley et al., 2020). This, in turn, will require the adoption of standards for both data and model repositories. Also, laboratory automation, coupled to WCM-based CAD, is expected to transform design-build-test cycles. As the use of robotics becomes increasingly common in both academia and industry, the throughput and reproducibility of experiments needed for both WCM derivation and validation can be significantly increased, and protocol sharing across research communities facilitated (Jessop-Fabre and Sonnenschein, 2019).

To assist the adoption of WCMs for synthetic biology applications, high-performance parallelized computer clusters are required to run the models with lengthy runtimes, coordinate the corresponding databases, parameterize and validate the models, and then to integrate WCMs in design cycles in combination with optimization algorithms (Macklin et al., 2014; Chalkley et al., 2019).

The implementation of standardized tools to share data and simulate WCMs would, in turn, facilitate model validation. This should involve the definition of proper metrics and formal model verification techniques such as those developed for SBML-encoded models (Kwiatkowska et al., 2011).

(re)Thinking System Approaches: A Collaborative Effort

In addressing the aforementioned challenges, we believe there is a tremendous opportunity to rethink approaches used so far to generate genome-scale models, including WCMs, and to integrate with broader communities including software engineers, computer scientists, structural biologists, bioinformaticians, and systems and synthetic biologists.

We do anticipate that, as diverse communities synergize on WCM-related research, different kinds of formalisms might be integrated within genome-scale models. Symbolic reasoning provides a range of expressive and intuitive logical frameworks that could potentially complement and help glue together sub-models at different scales. Such methods are routinely applied to complex systems in the electronics and software industries, and have been applied to biological systems for nearly a decade (Iyengar, 2011). Recent work showed the feasibility of applying logic programming methods to signaling pathways (Ray et al., 2011) and metabolic networks (Bragaglia and Ray, 2015), and of automating a mechanistic philosophy of scientific discovery in simulated organisms (Rozanski et al., 2015); it should be feasible to integrate such sub-models within a WCM framework.

We believe there is scope to further increase the descriptive and predictive ability of WCMs across spatial and temporal scales by integrating the structural biology and the molecular modeling communities to carefully consider not only the biochemical, but also the physical, molecular and structural components of cells. The development of the so-called “physical” WCMs [see (Feig and Sugita, 2019) and (Feig and Sugita, 2013) for comprehensive reviews] is an emerging field, with the first models describing minimal cellular environments in full atomistic detail (Feig et al., 2015; Yu et al., 2016). With the final aim to integrate biochemical and physical WCMs within a multiscale framework (Sali et al., 2015), we need approaches which can cope with the limitations of atomistic models of biomolecules (mainly in terms of computational resources), possibly exploiting coarse-grained (Ando and Skolnick, 2010; Hyeon and Thirumalai, 2011) or continuum (Solernou et al., 2018) approaches.

By collaborating with software engineers, we need to develop tools which can enable, and possibly automate, the integration of different data types across scales, model derivation, fitting and validation, and visualization and interpretation of results (Szigeti et al., 2018).

Moreover, rule-based models might become the new standard for representing both each molecular species, at the required level of granularity, and multi-algorithmic sub-models (e.g., FBA and stochastic dynamical models). Frameworks in which intuitive logic is coupled to rule-based models have recently started to be developed (van der Zee and Barberis, 2019).
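To make the notion of a stochastic dynamical sub-model concrete, the following minimal sketch simulates a single molecular species with Gillespie's stochastic simulation algorithm. It is an illustration only, not a component of any published WCM: the function name `gillespie_birth_death`, the two reactions (constant synthesis, first-order degradation) and all rate values are assumptions chosen for the example.

```python
import random

def gillespie_birth_death(k_syn, k_deg, n0, t_end, rng):
    """Simulate 0 -> X (propensity k_syn) and X -> 0 (propensity k_deg * n).

    All names and rates are hypothetical; this is a toy sub-model.
    """
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_syn = k_syn            # propensity of synthesis
        a_deg = k_deg * n        # propensity of degradation
        a_total = a_syn + a_deg
        if a_total == 0.0:
            break                # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a_total
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity
        if rng.random() * a_total < a_syn:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

# Illustrative run with assumed parameters (steady-state mean k_syn / k_deg = 100)
rng = random.Random(0)
times, counts = gillespie_birth_death(k_syn=10.0, k_deg=0.1, n0=0, t_end=200.0, rng=rng)
```

In a multi-algorithmic setting, a sub-model of this kind would be advanced over a short time step and would exchange species counts with other sub-models (for example, an FBA sub-model) between steps; that coupling is not shown here.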

As we produce ever-increasing amounts of experimental data and increasingly sophisticated computational tools to realize detailed and complex representations of actual cells, approaches instead focusing on deliberately abstract and parsimonious simulations of artificial cellular systems provide a valuable change of perspective. Such “toy models” might be a valuable tool to test different algorithms for model derivation and fitting, while offering an opportunity to engage with broader research communities and with the public (Castiglione et al., 2014).

Finally, we believe there is tremendous potential for applying machine learning techniques to both WCM derivation and WCM applications in synthetic biology. Two recent works (Lin et al., 2017; Ma et al., 2018) showed that deep neural networks are well placed to reconstruct the architecture of living systems [namely, the hierarchical organization of transcription factors in the nucleus (Lin et al., 2017) and of a basic eukaryotic cell (Ma et al., 2018)] and to predict cell states and phenotypes. In both cases, the configuration of network layers and thus the biological structure were formulated using extensive prior knowledge, ultimately enabling fully “visible” systems, where all the internal biological states can be interrogated mechanistically (Yu et al., 2018). Machine learning could be used to systematically process large in vivo and in silico whole-cell datasets, for example by applying Bayesian inference; to integrate data from diverse sources and supplement sparse data (Perdikaris and Karniadakis, 2016); and to help automatically classify WCM simulations and link phenotypes to genotypes (Alber et al., 2019). Ensemble methods, which combine multiple independent models into a single predictive model to increase the overall robustness of predictions, might also be adopted to develop subcellular formalisms and support their integration across chassis (Camacho et al., 2018). Additionally, machine learning might assist in WCM parameter identification, for example by applying Bayesian parameter estimation (Vyshemirsky and Girolami, 2008), regression models and reinforcement learning techniques (Alber et al., 2019). Optimal experimental design techniques might also offer a valuable methodology to select the best experimental datasets for both model identification and validation (Smucker et al., 2018).
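As a small illustration of Bayesian parameter estimation, the sketch below infers a single decay rate from noisy time-course data by evaluating the posterior on a grid. It is a toy example under stated assumptions and does not reproduce BioBayes or any cited method: the exponential-decay model, the known Gaussian noise level, the uniform prior over `k_grid`, the function name `grid_posterior` and the synthetic data are all hypothetical.

```python
import math
import random

def grid_posterior(times, observations, y0, sigma, k_grid):
    """Posterior over a decay rate k, assuming y(t) = y0 * exp(-k * t),
    Gaussian noise with known sigma, and a uniform prior on k_grid."""
    log_post = []
    for k in k_grid:
        # Gaussian log-likelihood of the data given k (up to a constant)
        ll = sum(-0.5 * ((y - y0 * math.exp(-k * t)) / sigma) ** 2
                 for t, y in zip(times, observations))
        log_post.append(ll)
    m = max(log_post)                      # subtract the max for numerical stability
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]    # normalized posterior probabilities

# Synthetic data generated from an assumed "true" rate, for illustration only
rng = random.Random(1)
true_k, y0, sigma = 0.3, 100.0, 2.0
times = [0.5 * i for i in range(1, 21)]
observations = [y0 * math.exp(-true_k * t) + rng.gauss(0.0, sigma) for t in times]

k_grid = [0.01 * i for i in range(1, 101)]
posterior = grid_posterior(times, observations, y0, sigma, k_grid)
k_mean = sum(k * p for k, p in zip(k_grid, posterior))   # posterior mean
k_map = k_grid[posterior.index(max(posterior))]          # most probable grid value
```

Grid evaluation is only practical for one or two parameters; for the many parameters of a WCM, sampling-based or approximate inference schemes would be needed instead.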

Discussion

We have shown that WCMs are likely to be instrumental in informing design-build-test cycles across synthetic biology applications. WCMs can accelerate the realization of “designer” cells and organisms tailored to specific functions, reducing experimental iterations and increasing the predictive power of the computational formalisms used so far.

In the (re)design of cellular network functionalities, it is therefore important to quantitatively analyze and predict, through dedicated modeling strategies, the dynamics of interactions between various layers of cellular regulation. Thus, WCMs should take into account how different cellular layers are integrated, and how regulatory feedback among these layers occurs in time. These challenges must be tackled through integrative computational and experimental collaborative efforts aimed, respectively, toward: (i) engineering in vivo network designs which, through predictive systems biology, may be able to autonomously oscillate, sustaining generation of offspring, and (ii) extraction, visualization and functional exploration of regulatory interactions among cellular layers through novel multiscale modeling frameworks.

As synthetic biology moves toward the (re)engineering of entire genomes and multicellular systems, interdisciplinary communities need to collaborate for the development of tools that are required to improve the predictive power of WCMs. Although challenges remain, it is clear that the adoption of model-based methods has the potential to transform both basic research and the current bioproduction development process, leading to marked improvements in host performance and product yield on an industrial scale.

Ultimately, as the development of human genome-scale kinetic models becomes more feasible (Bordbar et al., 2015; Szigeti et al., 2018), we anticipate that whole-cell formalisms will become an indispensable tool to study human variation, and design treatments and synthetic cellular screening systems.

Statements

Author contributions

LM, MB, JK, OR, and PR wrote the manuscript. MS prepared the figure. All other authors participated in discussions at the workshop, helped with editing, and/or provided feedback.

Funding

LM was funded by the Engineering and Physical Sciences Research Council (EPSRC, grants EP/R041695/1 and EP/S01876X/1) and Horizon 2020 (CosyBio, grant agreement 766840); MB was funded by the Systems Biology Grant of the University of Surrey; JK was funded by the National Institutes of Health (award R35GM119771); PR was funded by the EPSRC (EP/R020957/1) and the Biotechnology and Biological Sciences Research Council (BBSRC, BB/T001968/1); CG, LM, and PR were funded by BrisSynBio, a BBSRC/EPSRC Synthetic Biology Research Centre (BB/L01386X/); SL and JR-G were funded by EPSRC Future Opportunity Ph.D. scholarships; ER was funded by the INCT BioSyn (National Institute of Science and Technology in Synthetic Biology), CNPq (National Council for Scientific and Technological Development), CAPES (Coordination for the Improvement of Higher Education Personnel), Brazilian Ministry of Health, and FAPDF (Research Support Foundation of the Federal District), Brazil.

Acknowledgments

This work captures discussions between participants at the “Computer-Aided Whole-Cell Design and Engineering” Workshop held on 2-3 July 2019 at the University of Bristol, United Kingdom, and funded by the Engineering and Physical Sciences Research Council (EPSRC) within the remit of the Big Ideas initiative. We sincerely thank Dr. Kathleen Sedgley for her support with the workshop organization, and Dr. Thomas Gorochowski for participating in discussions.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Footnotes

1. https://www.wholecell.org/

References

1
AlberM.Buganza TepoleA.CannonW. R.DeS.Dura-BernalS.GarikipatiK.et al (2019). Integrating machine learning and multiscale modeling-perspectives, challenges, and opportunities in the biological, biomedical, and behavioral sciences.NPJ Digit. Med.2:115.
2
AnanthasubramaniamB.SchmalC.HerzelH. (2020). Amplitude effects allow short jet lags and large seasonal phase shifts in minimal clock models.J. Mol. Biol.4323722–3737. 10.1016/j.jmb.2020.01.014
3
AnderM.BeltraoP.Di VenturaB.Ferkinghoff-BorgJ.FoglieriniM.KaplanA.et al (2004). SmartCell, a framework to simulate cellular processes that combines stochastic approximation with diffusion and localisation: analysis of simple networks.Syst. Biol.1129–138. 10.1049/sb:20045017
4
AndersonJ.PapachristodoulouA. (2009). On validation and invalidation of biological models.BMC Bioinform.10:132. 10.1186/1471-2105-10-132
5
AndoT.SkolnickJ. (2010). Crowding and hydrodynamic interactions likely dominate in vivo macromolecular motion.Proc. Natl. Acad. Sci. U.S.A.10718457–18462. 10.1073/pnas.1011354107
6
AshyraliyevM.Fomekong-NanfackY.KaandorpJ. A.BlomJ. G. (2009). Systems biology: parameter estimation for biochemical models.FEBS J.276886–902. 10.1111/j.1742-4658.2008.06844.x
7
BabtieA. C.StumpfM. P. H. (2017). How to deal with parameters for whole-cell modelling.J. R. Soc. Interf.14:237.
8
BarberisM.LinkeC.AdroverM. A.Gonzalez-NovoA.LehrachH.KrobitschS.et al (2012). Sic1 plays a role in timing and oscillatory behaviour of B-type cyclins.Biotechnol. Adv.30108–130. 10.1016/j.biotechadv.2011.09.004
9
BartleyB. A.BealJ.KarrJ. R.StrychalskiE. A. (2020). Organizing genome engineering for the gigabase scale.Nat. Commun.11:689.
10
BattogtokhD.TysonJ. J. (2004). Bifurcation analysis of a model of the budding yeast cell cycle.Chaos14653–661. 10.1063/1.1780011
11
BettsM. J.RussellR. B. (2007). The hard cell: from proteomics to a whole cell model.FEBS Lett.5812870–2876. 10.1016/j.febslet.2007.05.062
12
BordbarA.McCloskeyD.ZielinskiD. C.SonnenscheinN.JamshidiN.PalssonB. O. (2015). Personalized whole-cell kinetic models of metabolism for discovery in genomics and pharmacodynamics.Cell Syst.1283–292. 10.1016/j.cels.2015.10.003
13
BorkowskiO.BricioC.MurgianoM.Rothschild-MancinelliB.StanG. B.EllisT. (2018). Cell-free prediction of protein expression costs for growing cells.Nat. Commun.9:1457.
14
BorkowskiO.CeroniF.StanG. B.EllisT. (2016). Overloaded and stressed: whole-cell considerations for bacterial synthetic biology.Curr. Opin. Microbiol.33123–130. 10.1016/j.mib.2016.07.009
15
BouhaddouM.BarretteA. M.SternA. D.KochR. J.DiStefanoM. S.RieselE. A.et al (2018). A mechanistic pan-cancer pathway model informed by multi-omics data interprets stochastic cell fate responses to drugs and mitogens.PLoS Comput. Biol.14:e1005985. 10.1371/journal.pcbi.1005985
16
BragagliaS.RayO. (2015). “Nonmonotonic learning in large biological networks,” in Inductive Logic Programming. Lecture Notes in Computer Science, Vol. 9046edsDavisJ.RamonJ. (Cham: Springer).
17
BreuerM.EarnestT. M.MerrymanC.WiseK. S.SunL.LynottM. R.et al (2019). Essential metabolism for a minimal cell.eLife8:e36842.
18
BurgardA. P.PharkyaP.MaranasC. D. (2003). Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization.Biotechnol. Bioeng.84647–657. 10.1002/bit.10803
19
BurleyS. K.KurisuG.MarkleyJ. L.NakamuraH.VelankarS.BermanH. M.et al (2017). PDB-Dev: a prototype system for depositing integrative/hybrid structural models.Structure251317–1318. 10.1016/j.str.2017.08.001
20
CaleroP.NikelP. I. (2019). Chasing bacterial chassis for metabolic engineering: a perspective review from classical to non-traditional microorganisms.Microb. Biotechnol.1298–124. 10.1111/1751-7915.13292
21
CalhounK. A.SwartzJ. R. (2005). Energizing cell-free protein synthesis with glucose metabolism.Biotechnol. Bioeng.90606–613. 10.1002/bit.20449
22
CamachoD. M.CollinsK. M.PowersR. K.CostelloJ. C.CollinsJ. J. (2018). Next-Generation machine learning for biological networks.Cell1731581–1592. 10.1016/j.cell.2018.05.015
23
CarreraJ.CovertM. W. (2015). Why build whole-cell models?Trends Cell Biol.25719–722. 10.1016/j.tcb.2015.09.004
24
CarreraJ.RodrigoG.JaramilloA. (2009). Model-based redesign of global transcription regulation.Nucleic Acids Res.37:e38. 10.1093/nar/gkp022
25
CaspiR.BillingtonR.KeselerI. M.KothariA.KrummenackerM.MidfordP. E.et al (2020). The MetaCyc database of metabolic pathways and enzymes - a 2019 update.Nucleic Acids Res.48D445–D453.
26
CastellanosM.WilsonD. B.ShulerM. L. (2004). A modular minimal cell model: purine and pyrimidine transport and metabolism.Proc. Natl. Acad. Sci. U.S.A.1016681–6686. 10.1073/pnas.0400962101
27
CastiglioneF.PappalardoF.BiancaC.RussoG.MottaS. (2014). Modeling biology spanning different scales: an open challenge.Biomed. Res. Int.2014:902545.
28
CeroniF.EllisT. (2018). The challenges facing synthetic biology in eukaryotes.Nat. Rev. Mol. Cell Biol.19481–482. 10.1038/s41580-018-0013-2
29
ChalkleyO.PurcellO.GriersonC.MarucciL. (2019). The genome design suite: enabling massive in-silico experiments to design genomes.bioRxiv [Preprint]. 10.1101/681270
30
Cornish-BowdenA.HofmeyrJ. H. (1991). MetaModel: a program for modelling and control analysis of metabolic pathways on the IBM PC and compatibles.Comput. Appl. Biosci.789–93. 10.1093/bioinformatics/7.1.89
31
CourtotM.JutyN.KnupferC.WaltemathD.ZhukovaA.DragerA.et al (2011). Controlled vocabularies and semantics in systems biology.Mol. Syst. Biol.7:543. 10.1038/msb.2011.77
32
DanchinA. (2012). Scaling up synthetic biology: do not forget the chassis.FEBS Lett.5862129–2137. 10.1016/j.febslet.2011.12.024
33
DaninoT.PrindleA.KwongG. A.SkalakM.LiH.AllenK.et al (2015). Programmable probiotics for detection of cancer in urine.Sci. Transl. Med.7:289ra84. 10.1126/scitranslmed.aaa3519
34
DieselE.SchreiberM.van der MeerJ. R. (2009). Development of bacteria-based bioassays for arsenic detection in natural waters.Anal. Bioanal. Chem.394687–693. 10.1007/s00216-009-2785-x
35
El KarouiM.Hoyos-FlightM.FletcherL. (2019). Future trends in synthetic biology-a report.Front. Bioeng. Biotechnol.7:175. 10.3389/fbioe.2019.00175
36
ErgulerK.StumpfM. P. (2011). Practical limits for reverse engineering of dynamical systems: a statistical analysis of sensitivity and parameter inferability in systems biology models.Mol. Biosyst.71593–1602.
37
FeigM.HaradaR.MoriT.YuI.TakahashiK.SugitaY. (2015). Complete atomistic model of a bacterial cytoplasm for integrating physics, biochemistry, and systems biology.J. Mol. Graph. Model.581–9. 10.1016/j.jmgm.2015.02.004
38
FeigM.SugitaY. (2013). Reaching new levels of realism in modeling biological macromolecules in cellular environments.J. Mol. Graph. Model.45144–156. 10.1016/j.jmgm.2013.08.017
39
FeigM.SugitaY. (2019). Whole-cell models and simulations in molecular detail.Annu. Rev. Cell Dev. Biol.35191–211. 10.1146/annurev-cellbio-100617-062542
40
Fernandez-CastaneA.FeherT.CarbonellP.PauthenierC.FaulonJ. L. (2014). Computer-aided design for metabolic engineering.J. Biotechnol.192(Pt B), 302–313.
41
GawandP.Said AbukarF.VenayakN.PartowS.MotterA. E.MahadevanR. (2015). Sub-optimal phenotypes of double-knockout mutants of Escherichia coli depend on the order of gene deletions.Integr. Biol.7930–939. 10.1039/c5ib00096c
42
GerardC.GonzeD.GoldbeterA. (2009). Dependence of the period on the rate of protein degradation in minimal models for circadian oscillations.Philos. Trans. A Math. Phys. Eng. Sci.3674665–4683. 10.1098/rsta.2009.0133
43
GerardC.TysonJ. J.CoudreuseD.NovakB. (2015). Cell cycle control by a minimal Cdk network.PLoS Comput. Biol.11:e1004056. 10.1371/journal.pcbi.1004056
44
GerardC.TysonJ. J.NovakB. (2013). Minimal models for cell-cycle control based on competitive inhibition and multisite phosphorylations of Cdk substrates.Biophys. J.1041367–1379. 10.1016/j.bpj.2013.02.012
45
GlassJ. I.MerrymanC.WiseK. S.HutchisonC. A.IIISmithH. O. (2017). Minimal Cells-Real and imagined.Cold Spring Harb. Perspect. Biol.9:a023861. 10.1101/cshperspect.a023861
46
GoldbergA. P.SzigetiB.ChewY. H.SekarJ. A.RothY. D.KarrJ. R. (2018). Emerging whole-cell modeling principles and methods.Curr. Opin. Biotechnol.5197–102. 10.1016/j.copbio.2017.12.013
47
GoldbeterA. (1991). A minimal cascade model for the mitotic oscillator involving cyclin and cdc2 kinase.Proc. Natl. Acad. Sci. U.S.A.889107–9111. 10.1073/pnas.88.20.9107
48
GomideM. S.SalesT. T.BarrosL. R. C.LimiaC. G.de OliveiraM. A.FlorentinoL. H.et al (2020). Genetic switches designed for eukaryotic cells and controlled by serine integrases.Commun. Biol.3:255.
49
HartwellL. H.HopfieldJ. J.LeiblerS.MurrayA. W. (1999). From molecular to modular cell biology.Nature402(Suppl.), C47–C52.
50
HicksM.BachmannT. T.WangB. (2020). Synthetic biology enables programmable cell-based biosensors.Chemphyschem21:131. 10.1002/cphc.201901191
51
HirokawaY.KawanoH.Tanaka-MasudaK.NakamuraN.NakagawaA.ItoM.et al (2013). Genetic manipulations restored the growth fitness of reduced-genome Escherichia coli.J. Biosci. Bioeng.11652–58. 10.1016/j.jbiosc.2013.01.010
52
HuckaM.FinneyA.SauroH. M.BolouriH.DoyleJ. C.KitanoH.et al (2003). The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models.Bioinformatics19524–531.
53
HutchisonC. A.IIIChuangR. Y.NoskovV. N.Assad-GarciaN.DeerinckT. J.EllismanM. H.et al (2016). Design and synthesis of a minimal bacterial genome.Science351:aad6253.
54
HyeonC.ThirumalaiD. (2011). Capturing the essence of folding and functions of biomolecules using coarse-grained models.Nat. Commun.2:487.
55
IwadateY.HondaH.SatoH.HashimotoM.KatoJ. (2011). Oxidative stress sensitivity of engineered Escherichia coli cells with a reduced genome.FEMS Microbiol. Lett.32225–33. 10.1111/j.1574-6968.2011.02331.x
56
IyengarS. (2011). Symbolic Systems Biology: Theory and Methods.Burlington, MA: Jones and Bartlett Learning.
57
Jessop-FabreM. M.SonnenscheinN. (2019). Improving reproducibility in synthetic biology.Front. Bioeng. Biotechnol.7:18. 10.3389/fbioe.2019.00018
58
KarrJ. R.SanghviJ. C.MacklinD. N.AroraA.CovertM. W. (2013). WholeCellKB: model organism databases for comprehensive whole-cell models.Nucleic Acids Res.41D787–D792.
59
KarrJ. R.SanghviJ. C.MacklinD. N.GutschowM. V.JacobsJ. M.BolivalB.Jr.et al (2012). A whole-cell computational model predicts phenotype from genotype.Cell150389–401. 10.1016/j.cell.2012.05.044
60
KarrJ. R.TakahashiK.FunahashiA. (2015a). The principles of whole-cell modeling.Curr. Opin. Microbiol.2718–24. 10.1016/j.mib.2015.06.004
61
KarrJ. R.WilliamsA. H.ZuckerJ. D.RaueA.SteiertB.TimmerJ.et al (2015b). Summary of the DREAM8 parameter estimation challenge: toward parameter identification for whole-cell models.PLoS Comput. Biol.11:e1004096. 10.1371/journal.pcbi.1004096
62
KarzbrunE.ShinJ.Bar-ZivR. H.NoireauxV. (2011). Coarse-grained dynamics of protein synthesis in a cell-free system.Phys. Rev. Lett.106:048104.
63
KingZ. A.LuJ.DragerA.MillerP.FederowiczS.LermanJ. A.et al (2016). BiGG Models: a platform for integrating, standardizing and sharing genome-scale models.Nucleic Acids Res.44D515–D522.
64
KirkP.ThorneT.StumpfM. P. (2013). Model selection in systems and synthetic biology.Curr. Opin. Biotechnol.24767–774. 10.1016/j.copbio.2013.03.012
65
KitanoH. (2002). Computational systems biology.Nature420206–210.
66
KochM.FaulonJ.-L.BorkowskiO. (2018). Models for cell-free synthetic biology: make prototyping easier, better, and faster.Front. Bioeng. Biotechnol.6:182. 10.3389/fbioe.2018.00182
67
KolesnikovN.HastingsE.KeaysM.MelnichukO.TangY. A.WilliamsE.et al (2015). Array Express update–simplifying data submissions.Nucleic Acids Res.43D1113–D1116.
68
KotulaJ. W.KernsS. J.ShaketL. A.SirajL.CollinsJ. J.WayJ. C.et al (2014). Programmable bacteria detect and record an environmental signal in the mammalian gut.Proc. Natl. Acad. Sci. U.S.A.1114838–4843. 10.1073/pnas.1321321111
69
KwiatkowskaM.NormanG.ParkerD. (eds) (2011). PRISM 4.0: Verification of Probabilistic Real-Time Systems 2011.Berlin: Springer.
70
LandonS.Rees-GarbuttJ.MarucciL.GriersonC. (2019). Genome-driven cell engineering review: in vivo and in silico metabolic and genome engineering.Essays Biochem.63267–284. 10.1042/ebc20180045
71
Le NovereN.HuckaM.MiH.MoodieS.SchreiberF.SorokinA.et al (2009). The systems biology graphical notation.Nat. Biotechnol.27735–741.
72
LeeJ. M.GianchandaniE. P.EddyJ. A.PapinJ. A. (2008). Dynamic analysis of integrated signaling, metabolic, and regulatory networks.PLoS Comput. Biol.4:e1000086. 10.1371/journal.pcbi.1000086
73
LinC.JainS.KimH.Bar-JosephZ. (2017). Using neural networks for reducing the dimensions of single-cell RNA-Seq data.Nucleic Acids Res.45:e156. 10.1093/nar/gkx681
74
LinkeC.ChasapiA.Gonzalez-NovoA.Al SawadI.TognettiS.KlippE.et al (2017). A Clb/Cdk1-mediated regulation of Fkh2 synchronizes CLB expression in the budding yeast cell cycle.NPJ Syst. Biol. Appl.3:7.
75
LuT. K.BowersJ.KoerisM. S. (2013). Advancing bacteriophage-based microbial diagnostics with synthetic biology.Trends Biotechnol.31325–327. 10.1016/j.tibtech.2013.03.009
76
MaJ.YuM. K.FongS.OnoK.SageE.DemchakB.et al (2018). Using deep learning to model the hierarchical structure and function of a cell.Nat. Methods15290–298. 10.1038/nmeth.4627
77
MacklinD. N.Ahn-HorstT. A.ChoiH.RuggeroN. A.CarreraJ.MasonJ. C.et al (2020). Simultaneous cross-evaluation of heterogeneous E. coli datasets via mechanistic simulation.Science369:eaav3751. 10.1126/science.aav3751
78
MacklinD. N.RuggeroN. A.CovertM. W. (2014). The future of whole-cell modeling.Curr. Opin. Biotechnol.28111–115. 10.1016/j.copbio.2014.01.012
79
MarucciL.BartonD. A.CantoneI.RicciM. A.CosmaM. P.SantiniS.et al (2009). How to turn a genetic circuit into a synthetic tunable oscillator, or a bistable switch.PLoS One4:e8083. 10.1371/journal.pone.0008083
80
MatsuuraT.HosodaK.ShimizuY. (2018). Robustness of a reconstituted Escherichia coli protein translation system analyzed by computational modeling.ACS Synth. Biol.71964–1972. 10.1021/acssynbio.8b00228
81
McAdamsH. H.ArkinA. (1997). Stochastic mechanisms in gene expression.Proc. Natl. Acad. Sci. U.S.A.94814–819.
82
McCloskeyD.PalssonB. O.FeistA. M. (2013). Basic and applied uses of genome-scale metabolic network reconstructions of Escherichia coli.Mol. Syst. Biol.9:661. 10.1038/msb.2013.18
83
McGuffeeS. R.ElcockA. H. (2010). Diffusion, crowding & protein stability in a dynamic molecular model of the bacterial cytoplasm.PLoS Comput. Biol.6:e1000694. 10.1371/journal.pcbi.1000694
84
MedleyJ. K.GoldbergA. P.KarrJ. R. (2016). Guidelines for reproducibly building and simulating systems biology models.IEEE Trans. Biomed. Eng.632015–2020. 10.1109/tbme.2016.2591960
85
MendozaS. N.OlivierB. G.MolenaarD.TeusinkB. (2019). A systematic assessment of current genome-scale metabolic reconstruction tools.Genome Biol.20:158.
86
MolM.KabraR.SinghS. (2018). Genome modularity and synthetic biology: engineering systems.Prog. Biophys. Mol. Biol.13243–51. 10.1016/j.pbiomolbio.2017.08.002
87
MondeelT.HollandP.NielsenJ.BarberisM. (2019). ChIP-exo analysis highlights Fkh1 and Fkh2 transcription factors as hubs that integrate multi-scale networks in budding yeast.Nucleic Acids Res.477825–7841. 10.1093/nar/gkz603
88
MondeelT.IvanovO.WesterhoffH. V.LiebermeisterW.BarberisM. (2020). Clb3-centered regulations are recurrent across distinct parameter regions in minimal autonomous cell cycle oscillator designs.NPJ Syst. Biol. Appl.6:8.
89
MooreS. J.MacDonaldJ. T.WieneckeS.IshwarbhaiA.TsipaA.AwR.et al (2018). Rapid acquisition and model-based analysis of cell-free transcription-translation reactions from nonmodel bacteria.Proc. Natl. Acad. Sci. U.S.A.115E4340–E4349.
90
Morton-FirthC. J.BrayD. (1998). Predicting temporal fluctuations in an intracellular signalling pathway.J. Theor. Biol.192117–128. 10.1006/jtbi.1997.0651
91
MoyaA.GilR.LatorreA.PeretoJ.Pilar Garcillan-BarciaM.de la CruzF. (2009). Toward minimal bacterial cells: evolution vs. design.FEMS Microbiol. Rev.33225–235. 10.1111/j.1574-6976.2008.00151.x
92
MutturiS. (2017). FOCuS: a metaheuristic algorithm for computing knockouts from genome-scale models for strain optimization.Mol. Biosyst.131355–1363. 10.1039/c7mb00204a
93
NoireauxV.Bar-ZivR.LibchaberA. (2003). Principles of cell-free genetic circuit assembly.Proc. Natl. Acad. Sci. U.S.A.10012672–12677. 10.1073/pnas.2135496100
94
NoskeA. B.CostinA. J.MorganG. P.MarshB. J. (2008). Expedited approaches to whole cell electron tomography and organelle mark-up in situ in high-pressure frozen pancreatic islets.J. Struct. Biol.161298–313. 10.1016/j.jsb.2007.09.015
95
NovakB.TysonJ. J. (1993). Numerical analysis of a comprehensive model of M-phase control in Xenopus oocyte extracts and intact embryos.J. Cell Sci.106(Pt 4), 1153–1168.
96
PanditA. V.SrinivasanS.MahadevanR. (2017). Redesigning metabolism based on orthogonality principles.Nat. Commun.8:15188.
97
PedoneE.PostiglioneL.AulicinoF.RoccaD. L.Montes-OlivasS.KhazimM.et al (2019). A tunable dual-input system for on-demand dynamic gene expression regulation.Nat. Commun.10:4481.
98
PerdikarisP.KarniadakisG. E. (2016). Model inversion via multi-fidelity Bayesian optimization: a new paradigm for parameter estimation in haemodynamics, and beyond.J. R. Soc. Interf.13:20151107. 10.1098/rsif.2015.1107
99
PosfaiG.PlunkettG.IIIFeherT.FrischD.KeilG. M.UmenhofferK.et al (2006). Emergent properties of reduced-genome Escherichia coli.Science3121044–1046. 10.1126/science.1126439
100
PrescottA. M.AbelS. M. (2017). Combining in silico evolution and nonlinear dimensionality reduction to redesign responses of signaling networks.Phys. Biol.13:066015. 10.1088/1478-3975/13/6/066015
101
PrescottT. P.LangM.PapachristodoulouA. (2015). Quantification of interactions between dynamic cellular network functionalities by cascaded layering.PLoS Comput. Biol.11:e1004235. 10.1371/journal.pcbi.1004235
102
PriceM. N.WetmoreK. M.WatersR. J.CallaghanM.RayJ.LiuH.et al (2018). Mutant phenotypes for thousands of bacterial genes of unknown function.Nature557503–509. 10.1038/s41586-018-0124-0
103
PurcellO.JainB.KarrJ. R.CovertM. W.LuT. K. (2013). Towards a whole-cell modeling approach for synthetic biology.Chaos23:025112. 10.1063/1.4811182
104
PurcellO.SaveryN. J.GriersonC. S.di BernardoM. (2010). A comparative analysis of synthetic genetic oscillators.J. R. Soc. Interf.71503–1524. 10.1098/rsif.2010.0183
105
RancatiG.MoffatJ.TypasA.PavelkaN. (2018). Emerging and evolving concepts in gene essentiality.Nat. Rev. Genet.1934–49. 10.1038/nrg.2017.74
106
RandD. A. (2008). Mapping global sensitivity of cellular network dynamics: sensitivity heat maps and a global summation law.J. R. Soc. Interf.5(Suppl. 1), S59–S69.
107
RavaszE.SomeraA. L.MongruD. A.OltvaiZ. N.BarabasiA. L. (2002). Hierarchical organization of modularity in metabolic networks.Science2971551–1555. 10.1126/science.1073374
108
RayO.SohT.InoueK. (2011). “Analysing pathways using ASP-based approaches,” in Proceedings of the 2010 Conference on Algebraic and Numeric Biology, Berlin.
109
Rees-GarbuttJ.ChalkleyO.LandonS.PurcellO.MarucciL.GriersonC. (2020). Designing minimal genomes using whole-cell models.Nat. Commun.11:836.
110
ReussD. R.AltenbuchnerJ.MaderU.RathH.IschebeckT.SappaP. K.et al (2017). Large-scale reduction of the Bacillus subtilis genome: consequences for the transcriptional network, resource allocation, and metabolism.Genome Res.27289–299. 10.1101/gr.215293.116
111
RozanskiR.RayO.KingR.BragagliaS. (2015). “Automating development of metabolic network models,” in Computational Methods in Systems Biology. CMSB 2015. Lecture Notes in Computer Science, Vol. 9308edsRouxO.BourdonJ. (Cham: Springer).
112
SajedT.MarcuA.RamirezM.PonA.GuoA. C.KnoxC.et al (2016). ECMDB 2.0: A richer resource for understanding the biochemistry of E. coli.Nucleic Acids Res.44D495–D501.
113
SaliA.BermanH. M.SchwedeT.TrewhellaJ.KleywegtG.BurleyS. K.et al (2015). Outcome of the first wwPDB hybrid/integrative methods task force workshop.Structure231156–1167. 10.1016/j.str.2015.05.013
114
ShuJ.ShulerM. L. (1989). A mathematical model for the growth of a single cell of E. coli on a glucose/glutamine/ammonium medium.Biotechnol. Bioeng.331117–1126. 10.1002/bit.260330907
115
Siegal-GaskinsD.TuzaZ. A.KimJ.NoireauxV.MurrayR. M. (2014). Gene circuit performance characterization and resource usage in a cell-free “breadboard”.ACS Synth. Biol.3416–425. 10.1021/sb400203p
116
SilvermanA. D.KarimA. S.JewettM. C. (2020). Cell-free gene expression: an expanded repertoire of applications.Nat. Rev. Genet.21151–170. 10.1038/s41576-019-0186-3
117
SinglaJ.McClaryK. M.WhiteK. L.AlberF.SaliA.StevensR. C. (2018). Opportunities and challenges in building a spatiotemporal multi-scale model of the human pancreatic beta cell.Cell17311–19. 10.1016/j.cell.2018.03.014
118
SlomovicS.PardeeK.CollinsJ. J. (2015). Synthetic biology devices for in vitro and in vivo diagnostics.Proc. Natl. Acad. Sci. U.S.A.11214429–14435. 10.1073/pnas.1508521112
119
SmuckerB.KrzywinskiM.AltmanN. (2018). Optimal experimental design.Nat. Methods15559–560.
120
SolernouA.HansonB. S.RichardsonR. A.WelchR.ReadD. J.HarlenO. G.et al (2018). Fluctuating finite element analysis (FFEA): a continuum mechanics software tool for mesoscale simulation of biomolecules.PLoS Comput. Biol.14:e1005897. 10.1371/journal.pcbi.1005897
121
StogbauerT.WindhagerL.ZimmerR.RadlerJ. O. (2012). Experiment and mathematical modeling of gene expression dynamics in a cell-free system.Integr. Biol.4494–501.
122
SzigetiB.RothY. D.SekarJ. A. P.GoldbergA. P.PochirajuS. C.KarrJ. R. (2018). A blueprint for human whole-cell modeling.Curr. Opin. Syst. Biol.78–15. 10.1016/j.coisb.2017.10.005
123
TakahashiK.ArjunanS. N.TomitaM. (2005). Space in systems biology of signaling pathways–towards intracellular molecular crowding in silico.FEBS Lett.5791783–1788. 10.1016/j.febslet.2005.01.072
124
TakahashiM. K.ChappellJ.HayesC. A.SunZ. Z.KimJ.SinghalV.et al (2015). Rapidly characterizing the fast dynamics of RNA genetic circuitry with cell-free transcription-translation (TX-TL) systems.ACS Synth. Biol.4503–515. 10.1021/sb400206c
125
TangP. W.ChuaP. S.ChongS. K.MohamadM. S.ChoonY. W.DerisS.et al (2015). A review of gene knockout strategies for microbial cells.Recent. Pat. Biotechnol.9176–197. 10.2174/1872208310666160517115047
126
ThieleI.JamshidiN.FlemingR. M.PalssonB. O. (2009). Genome-scale reconstruction of Escherichia coli’s transcriptional and translational machinery: a knowledge base, its mathematical formulation, and its functional characterization.PLoS Comput. Biol.5:e1000312. 10.1371/journal.pcbi.1000312
127
ThulP. J.AkessonL.WikingM.MahdessianD.GeladakiA.Ait BlalH.et al (2017). A subcellular map of the human proteome.Science356:6340.
128
TomazouM.BarahonaM.PolizziK. M.StanG. B. (2018). Computational Re-design of synthetic genetic oscillators for independent amplitude and frequency modulation.Cell Syst.6:50.
129
TomitaM. (2001). Whole-cell simulation: a grand challenge of the 21st century.Trends Biotechnol.19205–210. 10.1016/s0167-7799(01)01636-5
130
TomitaM.HashimotoK.TakahashiK.ShimizuT. S.MatsuzakiY.MiyoshiF.et al (1999). E-CELL: software environment for whole-cell simulation.Bioinformatics1572–84. 10.1093/bioinformatics/15.1.72
131
TysonJ. J. (1991). Modeling the cell division cycle: cdc2 and cyclin interactions.Proc. Natl. Acad. Sci. U.S.A.887328–7332. 10.1073/pnas.88.16.7328
132
UnderwoodK. A.SwartzJ. R.PuglisiJ. D. (2005). Quantitative polysome analysis identifies limitations in bacterial cell-free protein synthesis.Biotechnol. Bioeng.91425–435. 10.1002/bit.20529
133
UniProt ConsortiumT. (2018). UniProt: the universal protein knowledgebase.Nucleic Acids Res.462699. 10.1093/nar/gky092
134
van der MeerJ. R.BelkinS. (2010). Where microbiology meets microengineering: design and applications of reporter bacteria.Nat. Rev. Microbiol.8511–522. 10.1038/nrmicro2392
135
van der ZeeL.BarberisM. (2019). Advanced modeling of cellular proliferation: toward a multi-scale framework coupling cell cycle to metabolism by integrating logical and constraint-based models.Methods Mol. Biol.2049365–385. 10.1007/978-1-4939-9736-7_21
136
Varma, A., Palsson, B. O. (1994). Stoichiometric flux balance models quantitatively predict growth and metabolic by-product secretion in wild-type Escherichia coli W3110. Appl. Environ. Microbiol. 60, 3724–3731. doi: 10.1128/aem.60.10.3724-3731.1994
137
Vernon, I., Liu, J., Goldstein, M., Rowe, J., Topping, J., Lindsey, K. (2018). Bayesian uncertainty analysis for complex systems biology models: emulation, global parameter searches and evaluation of gene functions. BMC Syst. Biol. 12:1. doi: 10.1186/s12918-017-0484-3
138
Vilkhovoy, M., Horvath, N., Shih, C. H., Wayman, J. A., Calhoun, K., Swartz, J., et al. (2018). Sequence specific modeling of E. coli cell-free protein synthesis. ACS Synth. Biol. 7, 1844–1857. doi: 10.1021/acssynbio.7b00465
139
Vyshemirsky, V., Girolami, M. (2008). BioBayes: a software package for Bayesian inference in systems biology. Bioinformatics 24, 1933–1934. doi: 10.1093/bioinformatics/btn338
140
Waltemath, D., Karr, J. R., Bergmann, F. T., Chelliah, V., Hucka, M., Krantz, M., et al. (2016). Toward community standards and software for whole-cell modeling. IEEE Trans. Biomed. Eng. 63:14.
141
Wang, L., Maranas, C. D. (2018). MinGenome: an in silico top-down approach for the synthesis of minimized genomes. ACS Synth. Biol. 7, 462–473. doi: 10.1021/acssynbio.7b00296
142
Way, J. C., Collins, J. J., Keasling, J. D., Silver, P. A. (2014). Integrating biological redesign: where synthetic biology came from and where it needs to go. Cell 157, 151–161. doi: 10.1016/j.cell.2014.02.039
143
Wilkinson, D. J. (2007). Bayesian methods in bioinformatics and computational systems biology. Brief Bioinform. 8, 109–116. doi: 10.1093/bib/bbm007
144
Wittig, U., Kania, R., Golebiewski, M., Rey, M., Shi, L., Jong, L., et al. (2012). SABIO-RK–database for biochemical reaction kinetics. Nucleic Acids Res. 40, D790–D796.
145
Woolston, B. M., Edgar, S., Stephanopoulos, G. (2013). Metabolic engineering: past and future. Annu. Rev. Chem. Biomol. Eng. 4, 259–288. doi: 10.1146/annurev-chembioeng-061312-103312
146
Xu, X., Liu, Y., Du, G., Ledesma-Amaro, R., Liu, L. (2020). Microbial chassis development for natural product biosynthesis. Trends Biotechnol. 38, 779–796. doi: 10.1016/j.tibtech.2020.01.002
147
Yilmaz, L. S., Walhout, A. J. (2017). Metabolic network modeling with model organisms. Curr. Opin. Chem. Biol. 36, 32–39. doi: 10.1016/j.cbpa.2016.12.025
148
Yu, I., Mori, T., Ando, T., Harada, R., Jung, J., Sugita, Y., et al. (2016). Biomolecular interactions modulate macromolecular structure and dynamics in atomistic model of a bacterial cytoplasm. eLife 5:e19274.
149
Yu, M. K., Ma, J., Fisher, J., Kreisberg, J. F., Raphael, B. J., Ideker, T. (2018). Visible machine learning for biomedicine. Cell 173, 1562–1565. doi: 10.1016/j.cell.2018.05.056
150
Zhou, J., Wu, R., Xue, X., Qin, Z. (2016). CasHRA (Cas9-facilitated homologous recombination assembly) method of constructing megabase-sized DNA. Nucleic Acids Res. 44:e124. doi: 10.1093/nar/gkw475

Summary

Keywords

whole-cell models, synthetic biology, systems biology, multiscale models, bioengineering, biodesign

Citation

Marucci L, Barberis M, Karr J, Ray O, Race PR, de Souza Andrade M, Grierson C, Hoffmann SA, Landon S, Rech E, Rees-Garbutt J, Seabrook R, Shaw W and Woods C (2020) Computer-Aided Whole-Cell Design: Taking a Holistic Approach by Integrating Synthetic With Systems Biology. Front. Bioeng. Biotechnol. 8:942. doi: 10.3389/fbioe.2020.00942

Received

29 May 2020

Accepted

21 July 2020

Published

07 August 2020

Volume

8 - 2020

Edited by

Dong-Yup Lee, Sungkyunkwan University, South Korea

Reviewed by

Hyun Uk Kim, Korea Advanced Institute of Science and Technology, South Korea; Meiyappan Lakshmanan, Bioprocessing Technology Institute (A*STAR), Singapore

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Lucia Marucci, lucia.marucci@bristol.ac.uk; Matteo Barberis, m.barberis@surrey.ac.uk, matteo@barberislab.com; Jonathan Karr, karr@mssm.edu; Oliver Ray, csxor@bristol.ac.uk; Paul R. Race, Paul.Race@bristol.ac.uk; Claire Grierson, claire.grierson@bristol.ac.uk; Elibio Rech, elibio.rech@embrapa.br; Richard Seabrook, richard.seabrook@bristol.ac.uk; Christopher Woods, Christopher.Woods@bristol.ac.uk

†These authors have contributed equally to this work

This article was submitted to Synthetic Biology, a section of the journal Frontiers in Bioengineering and Biotechnology

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
