Systems approaches for synthetic biology: a pathway toward mammalian design

We review methods of understanding cellular interactions through computation in order to guide the synthetic design of mammalian cells for translational applications, such as regenerative medicine and cancer therapies. In doing so, we argue that the challenges of engineering mammalian cells provide a prime opportunity to leverage advances in computational systems biology. We support this claim systematically, by addressing each of the principal challenges to existing synthetic bioengineering approaches—stochasticity, complexity, and scale—with specific methods and paradigms in systems biology. Moreover, we characterize a key set of diverse computational techniques, including agent-based modeling, Bayesian network analysis, graph theory, and Gillespie simulations, with specific utility toward synthetic biology. Lastly, we examine the mammalian applications of synthetic biology for medicine and health, and how computational systems biology can aid in the continued development of these applications.


INTRODUCTION AND OVERVIEW
Over the past three decades, rapid advances in computational power, subcellular data resolution, and the sophistication of bioengineering design have led to cellular machinery being increasingly controlled for practical application (Buetow, 2005; Cheng, 2007; Vendruscolo and Dobson, 2011). The advent of this field of "synthetic biology" has been touted as a reservoir of novel solutions for many of society's most pressing problems, including challenges in computing, health, and regenerative medicine (Gersbach et al., 2007; Lu et al., 2009; Ruder et al., 2011). For instance, the creation of the first-ever genetic toggle switch and the repressilator by synthetic biologists at the turn of the century allowed for an unprecedented degree of cellular control and, in the case of the former, a digital state that could lay the groundwork for organic computing (Elowitz and Leibler, 2000; Gardner et al., 2000). In subsequent years, biologists constructed oscillators (capable of biological timekeeping), pulse generators (for transcellular signal transmission), and even signaling filters (for cellular signal processing) through carefully mapped gene circuits (Basu et al., 2004, 2005; Stricker et al., 2008; Khalil and Collins, 2010).
However, while each of these individual discoveries led to numerous applications of genetic engineering in biomedicine, we still lack tools with the robustness required for transformative applications. For instance, true "plug-and-play" cellular machines remain a work in progress, in part due to the heterogeneity and adaptability of biological networks (Kobayashi et al., 2004;Arkin, 2008). The routine engineering of mammalian cells, too, is still a distant possibility (Khalil and Collins, 2010). Because synthetic biology has largely been applied to microbes due to mammalian cell complexity, its impact on medicine has been limited.
Achieving these benchmarks is admittedly easier said than done. Whereas the promises and potential of the synthetic biology field lie in characterizing the cellular alphabet, the puzzle of words and sentences that define cell signaling and behavior currently presents a higher order of complexity (Endy, 2005). Moreover, the field of synthetic biology is still in its infancy, and has been compared to the equivalent of "the Wright brothers . . . putting pieces of wood and paper together" (Kwok, 2010). Some leading researchers have even suggested that "the complexity of synthetic biological systems over the past decade has reached a plateau" (Purnick and Weiss, 2009).
One way biologists have started to reinvigorate the field is through advances in combinatorial logic-based circuits (Lu et al., 2009; Wang et al., 2011; Michelotti et al., 2012; Wang and Buck, 2012). These formalisms possess the distinct advantages of providing a standardized framework that is adaptable across levels of abstraction as well as dynamical properties that can be estimated and combined by straightforward mathematical operations. Showing early progress, combinatorial logic-based circuits have been designed into sophisticated information processing tools in clonal mammalian cells like HeLa and MCF-7 (Nevozhay et al., 2013). However, noise, heterogeneity, complexity of structure, and time-dependent rewiring across biological scales limit the degree of control enabled by these experimental methods (Purnick and Weiss, 2009; Kwok, 2010). We propose that these challenges can be tackled by capitalizing on advances in computational systems biology that are uniquely valuable for synthetic cell design. We argue that a new perspective on the role of systems modeling in synthetic biology can promote the development of new therapies for human health by enabling the complex design capability required for mammalian cell engineering.

COMPUTATIONAL TECHNIQUES AND ADVANCES: SYSTEMS BIOLOGY APPLICATIONS
Computational methods are widely employed within synthetic biology as design tools, providing simulations of bioengineered systems in advance of their cellular assembly (Chandran et al., 2009;Ellis et al., 2009;Purnick and Weiss, 2009;Smolke and Silver, 2011) (Figure 1). Historically, these coupled computational-experimental approaches have contributed to many of the "milestone" discoveries in the field over the past two decades (Table 1). However, modeling used in synthetic biology until now has been generally limited to biocircuits and control systems, in part because the field emerged from genetic engineering where circuit representations are common (Mukherji and van Oudenaarden, 2009). The consequence of this limited paradigm is that significant advances in human health from this field remain out of reach, as gene circuit models prove to be increasingly insufficient for characterizing mammalian cell behavior (Purnick and Weiss, 2009).
In the aggregate, this "insufficiency" stems from a set of core properties of biological systems that current synthetic approaches do not fully capture: (1) scale, with the need to elicit controlled behavior across cell, tissue, and organ levels (Miller et al., 2012); (2) simultaneity, as defined by the highly networked nature of cell signaling (Jeong et al., 2000, 2001; Marcotte, 2001); (3) state adaptation dynamics, or the non-linear temporal fluctuations of such networks (Slusarczyk et al., 2012); (4) shape, due to the relevance of cell morphology in defining environmental interactions (Ben-Ze'ev et al., 1988; Singhvi et al., 1994); (5) stochasticity, with noise and randomness being significant determinants of cellular behavior (Thattai and van Oudenaarden, 2001; Pedraza and van Oudenaarden, 2005; Chopra and Kamma, 2006; Purnick and Weiss, 2009); and (6) spatial dependencies, both intracellular and extracellular in nature (Andrianantoandro et al., 2006). With the present focus on microbial engineering, many of these characteristics can be safely neglected; at mammalian levels of complexity, they render the behavior of synthetic systems difficult to predict a priori.
Although these challenges are manifold, they are not insurmountable. The answers may lie in systems biology. This computational discipline seeks to shift the basic molecular biology paradigm from isolation to coordination: from characterizing individual components of cell behavior to analyzing how these components function in tandem (Kitano, 2002a,b). Accordingly, systems bioengineers bring a diverse array of computational modeling techniques, drawing on mathematics, computer science, and engineering, to bear on questions of both mechanism and design at the cell and tissue levels (Kitano, 2001; Alon, 2007). In doing so, the field provides computational tools to characterize behavioral patterns at the cellular level that will be the building blocks of more sophisticated synthetic design. Systems biology approaches are particularly powerful in characterizing cell-cell interactions across scales, such as in capillary patterning and organ development, where the gene circuits approach in synthetic biology has proven limited in capturing adaptation, cellular heterogeneity and spatial hierarchy (Yingling et al., 2005; Qutub et al., 2009; Long et al., 2013). As such, many of the challenges to applying synthetic biology toward controlling mammalian tissue can be addressed in part by methods and techniques that are well-developed in systems bioengineering. Here, we discuss each of these roadblocks categorically, with the associated tools to address them.

SCALE: HIERARCHICAL, AGENT-BASED MODELING, AND RULE-BASED FORMALISMS
Characterizing population-level emergent behavior and cell-cell heterogeneity has long been recognized as a principal goal and challenge in synthetic biology (Canton et al., 2008; Neumann and Neumann-Staubitz, 2010; Young and Alper, 2010). Traditional synthetic designs have assumed identical expression patterns across a cell population, as the standard biocircuit framework does not permit the simulation of cell behavioral variability (Elowitz et al., 2002; Ozbudak et al., 2002; Blake et al., 2003; You et al., 2004). Moreover, the limited capacity for gene circuit models to characterize emergent behavior (defined formally as patterns that emerge from a myriad of relatively simple interactions) inhibits scale-dependent design (Benner and Sismour, 2005). As a mammalian example, if intricate cerebral function results from the coordinated function of millions of individual neurons, any synthetic design applied to the brain first requires accurate simulations of how neural cell-level changes manifest at the cerebral tissue level. Deriving such scale-driven causal links from observed principles is non-trivial at best. One systems biology method to address the research challenge of emergence in biology is agent-based modeling. The approach has a simple premise: such systems exhibit emergent behavior that arises from the interactions between individual actors (or agents) and, consequently, would be impossible to know a priori (Chandran et al., 2009). An agent is defined here as a discrete entity that has behavior, can adapt, carries "genetic codes," holds variables and data, is governed by individual rules, and is spatially defined. Fundamentally, this class of modeling method diverges from biocircuit models, which typically characterize fluctuations in state variables governed by differential relationships. Supplanting the latter's top-down, intracellular perspective with the former's bottom-up, multi-scale viewpoint permits the simulation of heterogeneity while eliminating the need to derive inter-scale relationships beforehand (Chandran et al., 2009).

Frontiers in Physiology | Computational Physiology and Medicine
October 2013 | Volume 4 | Article 285 | 2
Notably, agent-based modeling encompasses a broad range of variations in implementation, rather than any specific algorithm or rule-set. Existing libraries, such as MASON, Repast, and Swarm, allow for the construction of multi-scale agent-based models atop adaptable frameworks, facilitating their use by synthetic biologists with limited prior exposure to the technique. This methodology has been employed toward modeling brain capillary regeneration, immunological and inflammatory responses (Bailey et al., 2007; Chandran et al., 2009; Pothen et al., 2013), and cancer progression (Basanta et al., 2012; Wodarz et al., 2012; Walker and Southgate, 2013), among other topics. Within synthetic biology specifically, agent-based models have also simulated tissue development, tissue formation, and microbial chemotaxis (Endler et al., 2009).
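To make the agent definition above concrete, the following is a minimal sketch of an agent-based model written in plain Python rather than with MASON, Repast, or Swarm. Everything in it is an illustrative assumption and not drawn from the cited studies: the `Cell` class, the square lattice, the division rule, and the numerical values are hypothetical. Each agent holds its own state (position and division probability), follows an individual rule, is spatially defined, and passes a slightly varied trait to its daughter, so population growth and heterogeneity emerge from local interactions.

```python
import random


class Cell:
    """Hypothetical agent: a cell with a lattice position and its own division probability."""

    def __init__(self, x, y, p_divide):
        self.x, self.y, self.p_divide = x, y, p_divide


def step(cells, size, rng):
    """Advance one tick: each cell may divide into a free neighboring lattice site."""
    occupied = {(c.x, c.y) for c in cells}
    newborn = []
    for c in cells:
        if rng.random() >= c.p_divide:
            continue  # this cell does not attempt division on this tick
        neighbors = [(c.x + dx, c.y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        free = [(x, y) for x, y in neighbors
                if 0 <= x < size and 0 <= y < size and (x, y) not in occupied]
        if free:
            x, y = rng.choice(free)
            # The daughter inherits the trait with small variation (cell-cell heterogeneity).
            p = min(1.0, max(0.0, c.p_divide + rng.uniform(-0.05, 0.05)))
            newborn.append(Cell(x, y, p))
            occupied.add((x, y))
    return cells + newborn


def simulate(n_steps=20, size=15, seed=0):
    """Grow a colony from one founder cell; return the agents and the population history."""
    rng = random.Random(seed)
    cells = [Cell(size // 2, size // 2, 0.5)]
    history = [len(cells)]
    for _ in range(n_steps):
        cells = step(cells, size, rng)
        history.append(len(cells))
    return cells, history


cells, history = simulate()
print(history)
```

The population curve is never written down as an equation; it arises from the per-cell rule and the spatial constraint that a cell surrounded by neighbors cannot divide.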
Similarly, rule-based formalisms are being applied to coarse-grain patterns in chemical-kinetic models (Feret et al., 2009; Yang et al., 2010), providing scalable tools to describe complex interactions in cellular systems that begin at the molecular level.

SIMULTANEITY: GRAPH THEORY AND NETWORK ANALYSIS
Forecasting interactions and dynamics in protein and metabolic pathways is crucial for fine-tuned control of mammalian synthetic bioengineering. Whereas traditional kinetic- and gene circuit-based methods use simplified pathways to represent these signaling dynamics, in many cases the relationships between molecules are highly non-linear (Marcotte, 2001) and multiplex, i.e., multiple inputs combine to a single output. Signals that propagate from A->B->C at regular intervals are rare; more common are those for which such variations as A->C<->B and B->A->C->A dictate the targeted result, with time- and state-dependent transitions (Kestler and Kühl, 2008). Common, too, are linkages between parallel molecular pathways that each simultaneously affect the output of the other (Jeong et al., 2000, 2001). These oscillations render more complex cell pathways intractable for traditional biocircuit methods, which are generally based on a small set of ordinary differential equations (ODEs) (Kestler and Kühl, 2008).
Several types of network models allow for better predictive simulation of these multiplex interactions. Graph methods, for example, are a class of models that represent pathway components as networked nodes, and graph-based approaches have been used to model cellular machinery including genes, proteins and other subcellular compartments (Ma'ayan et al., 2005;Pe'er, 2005). The interactions between components are drawn as edge connections between the relevant nodes (Ma'ayan et al., 2005). Graph-based models vary in implementation to capture different kinds of molecular relationships (e.g., Boolean gene expression, stochastic transitions between molecular states), but are all particularly adept at identifying complex network modules, or certain structural features that "dominate" the behavior of the larger network. In mammalian cells, as an example, researchers have had early success in characterizing the dynamics of key feedforward modules and motifs, helping to enable the circuit design of adaptive gene expression (Bleris et al., 2011).
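As an illustration of the node-and-edge representation described above, the sketch below stores a small directed network as an adjacency map and searches it for feedforward motifs (A->B, B->C, and A->C). The network itself, the node names, and the helper functions `build_graph`, `feed_forward_loops`, and `in_degrees` are hypothetical and chosen only for demonstration; this is a brute-force search suited to small examples, not the method used in the cited studies.

```python
def build_graph(edges):
    """Return a successor map {node: set of downstream nodes} from (source, target) pairs."""
    succ = {}
    for src, dst in edges:
        succ.setdefault(src, set()).add(dst)
        succ.setdefault(dst, set())
    return succ


def feed_forward_loops(succ):
    """List all triples (a, b, c) with edges a->b, b->c and the shortcut a->c."""
    loops = []
    for a in sorted(succ):
        for b in sorted(succ[a]):
            for c in sorted(succ[b]):
                if len({a, b, c}) == 3 and c in succ[a]:
                    loops.append((a, b, c))
    return loops


def in_degrees(succ):
    """Count incoming edges per node; nodes with several inputs are 'multiplex' targets."""
    counts = {node: 0 for node in succ}
    for targets in succ.values():
        for t in targets:
            counts[t] += 1
    return counts


# Hypothetical five-edge regulatory network with one feedforward motif and one feedback edge.
edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W"), ("W", "X")]
graph = build_graph(edges)
motifs = feed_forward_loops(graph)
degrees = in_degrees(graph)
print(motifs, degrees)
```

In this toy network, node Z receives two inputs (from X and Y), which is the kind of multiple-input convergence that a linear A->B->C description would miss.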
One common type of acyclic graph method, known as Bayesian Network Analysis, is a form of directed statistical modeling designed to capture conditional dependencies between probabilistic events (Pe'er, 2005). In a Bayesian network model, probabilities define the relationship between the current node and its predecessor or parent in a graph (Alterovitz et al., 2007). Markov models are another network-based technique that can provide a framework to describe molecular or cellular states and the weighted probability of transitioning between them. The power of these methods lies in their ability to facilitate the reverse engineering of multiplex networks based on molecular expression, molecular activity and/or cell behavior data, serving as a precursor to synthetic modifications of existing molecular pathways (Barnes et al., 2011). However, for gene or protein pathways with more complex topology, such as those examples offered above, cyclic graph models might be necessary, for which a variety of analytical tools and approaches are described by computational biologists in the literature (particularly from research on neural networks) (Bianchini et al., 2006; Scarselli et al., 2009; Bowsher, 2010; Bonnet et al., 2013).
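The parent-child probabilities of a Bayesian network can be illustrated with a three-node chain, signal -> kinase -> gene. The sketch below is a toy under stated assumptions: the node names and all conditional probabilities are invented for illustration, and inference is done by exhaustive enumeration, which is only practical for very small networks.

```python
from itertools import product

# Hypothetical conditional probability tables (all values are illustrative assumptions).
P_SIGNAL = 0.3                        # P(signal = on)
P_KINASE = {True: 0.9, False: 0.2}    # P(kinase = on | signal)
P_GENE = {True: 0.8, False: 0.1}      # P(gene = on | kinase)


def bernoulli(p_on, value):
    """Probability of a binary variable taking `value` when P(on) = p_on."""
    return p_on if value else 1.0 - p_on


def joint(signal, kinase, gene):
    """Joint probability factorized along the chain signal -> kinase -> gene."""
    return (bernoulli(P_SIGNAL, signal)
            * bernoulli(P_KINASE[signal], kinase)
            * bernoulli(P_GENE[kinase], gene))


def probability(query, evidence=None):
    """P(query | evidence), computed by summing the joint over all assignments."""
    evidence = evidence or {}
    numerator = denominator = 0.0
    for s, k, g in product([True, False], repeat=3):
        world = {"signal": s, "kinase": k, "gene": g}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # assignment is inconsistent with the observed evidence
        p = joint(s, k, g)
        denominator += p
        if all(world[name] == value for name, value in query.items()):
            numerator += p
    return numerator / denominator


p_gene_on = probability({"gene": True})
p_signal_given_gene = probability({"signal": True}, {"gene": True})
print(round(p_gene_on, 3), round(p_signal_given_gene, 3))
```

The second query runs "backwards" along the graph, inferring the likely upstream state from an observed downstream readout, which is the sense in which such models support reverse engineering from expression or activity data.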

STATE ADAPTATION DYNAMICS: EVOLUTIONARY MODELS, OPTIMIZATION ALGORITHMS
In parallel with the above techniques, another suite of computational methods not only permits the analysis of cellular pathways, but also directly facilitates their synthetic design. Known as evolutionary algorithms, these methods can predict state changes in the behavior of signaling pathways over time, through adaptation or random mutation, by modeling this rewiring directly (Hallinan et al., 2010; Chen et al., 2011; Mobashir et al., 2012). In the same vein, these methods allow for the de novo construction and optimization of genetic networks by way of simulation (Bloom and Arnold, 2009), "evolving" a set of viable pathway designs that meet the specified constraints (Hallinan et al., 2010). Though these algorithms vary in construction, a subset of methods known as genetic algorithms, in which populations of potential networks "compete" against each other, are of particular utility to synthetic biologists due to their ease of implementation (Mitchell, 1998). Many alternative optimization techniques exist, e.g., simulated annealing, hill climbing, and gradient descent, which can be applied to optimize synthetic network architectures and the design of synthetic constructs (Zomaya and Kazman, 2010). In addition to these, combinatorial "tuning" strategies have been successfully applied toward model-guided, programmable control of gene expression in mammalian cells via RNAi (Beisel et al., 2008). A unique advantage of evolutionary and optimization algorithms is their ability to (A) be applied broadly to many forms of models, including ODEs and rule-based simulations and (B) generate a diverse array of functional network topologies.
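The "competition" between candidate networks can be sketched with a small genetic algorithm. In the example below, a design is a string of eight bits, each marking whether a hypothetical regulatory edge is present. The `TARGET` wiring and the `fitness` function are stand-in assumptions: in practice the score would come from simulating each candidate network (for example with an ODE or rule-based model) against the design constraints. Tournament selection, one-point crossover, and elitism are one common set of choices among many.

```python
import random

# Hypothetical "ideal" wiring, used here only as a stand-in for a simulation-based score.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0]


def fitness(design):
    """Toy score: number of edge choices that agree with the target wiring."""
    return sum(d == t for d, t in zip(design, TARGET))


def evolve(pop_size=20, generations=30, p_mut=0.05, seed=1):
    """Evolve bit-string designs; return the best design and the best score per generation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in TARGET] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        best_history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # elitism: the current best design survives unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: the fitter of three random designs becomes a parent.
            parent_a = max(rng.sample(pop, 3), key=fitness)
            parent_b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, len(TARGET))  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Mutation: flip each bit independently with probability p_mut.
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness), best_history


best_design, best_history = evolve()
print(best_design, best_history[-1])
```

Because the scoring function is the only model-specific part, the same loop could wrap other model types, which reflects point (A) in the paragraph above.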

SHAPE: MORPHOLOGICAL MODELING AND COMPUTATIONAL CELL PHENOTYPING
Thus far, synthetic biology research has largely omitted studies on cell shape. The few exceptions in the literature focus on morphological properties as reporters for specific signaling cascades or to control specific spatial features (Yeh et al., 2007; Tanaka and Yi, 2009). For instance, one recent work described controlled shape changes of synthetic yeast cells (Tanaka and Yi, 2009). Rather than modeling how a gene circuit would induce specific cell morphology a priori, the study's authors varied α-factor pathway inputs to observe shape changes until the desired shape was achieved: in this case, one that upregulated the formation of mating projections (Tanaka and Yi, 2009). Another study scored filopodial and lamellipodial phenotypes as indicators for successful synthetic rewiring of Rho GTPase signaling (Yeh et al., 2007).
Despite the few studies in this area, cell morphology is often a characteristic of central importance to synthetic biology experiments. For instance, synthetic systems seeking to modulate cell-cell interactions must necessarily account for morphological and spatial-dependent interactions between cells (Ben-Ze'ev et al., 1988; Singhvi et al., 1994). Membrane adjacency and receptor localization are drivers of pathways like Delta-Notch signaling, in which a signaling cascade is triggered by the binding of two transmembrane proteins on adjacent cells (Appel et al., 2001). Moreover, cell behavior (and, at a higher scale, tissue functionality) is often predicated on geometry (Haeuptle et al., 1983). For example, optimizing a synthetic cell for metabolic filtration necessitates that its membrane surface area be maximized for nutrient exchange, such as through inward folds (Gahan, 2005). Doing so requires leveraging computational modeling to predict three-dimensional shape response to changes in genetic circuit design.
Examples of methods for geometrically rendered modeling of cells include tensegrity models, Voronoi-based simulations, and molecular dynamics models. The concept of "tensegrity" stems from geodesic design, in which an object's shape is maintained through the joint effect of structural members in continuous tension and those in discontinuous compression (Huang et al., 2006). Though abstract in concept, computational models of tensegrity have been demonstrated to approximate cell shape and mechanics, providing a representation for simulating cell morphology in vitro (Huang et al., 2006). Tensegrity principles have been used to represent cytoskeletal elements, allowing for changes in these proteins induced by regulatory networks (e.g., focal adhesion kinases) to be assessed for their effects on cell shape (Kardas et al., 2013). An alternative geometrical model is the Voronoi diagram, a mathematical concept of dividing space into distinct regions based on proximity to initial seed points. Voronoi diagrams provide a useful means of constraining complex cell shapes into adjacent spatial tessellations, a technique particularly useful to study patterning at the cell population or tissue level (Schaller and Meyer-Hermann, 2005; Luengo-Oroz et al., 2008). Lastly, molecular dynamics simulations of cell shape represent cells as collections of individual molecules in Newtonian motion, either abstractly (as particles) or concretely (as cytoskeletal elements), to model an agglomerated cellular structure at high resolution, albeit at greater computational cost (Rapaport, 2004; Pfaendtner et al., 2010).
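The Voronoi idea of dividing space by proximity to seed points can be shown with a discrete approximation on a pixel grid. In the sketch below, the seed coordinates (standing in for cell centers), the grid size, and the function names are hypothetical; each pixel is simply labeled with the index of its nearest seed, and the pixel count per label gives a rough "cell area". Dedicated computational geometry libraries compute exact Voronoi diagrams; this brute-force version is only meant to expose the principle.

```python
def voronoi_labels(seeds, width, height):
    """Label every grid point with the index of its nearest seed (squared Euclidean distance)."""
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            dist2 = [(x - sx) ** 2 + (y - sy) ** 2 for sx, sy in seeds]
            row.append(dist2.index(min(dist2)))  # ties go to the lowest seed index
        labels.append(row)
    return labels


def region_areas(labels, n_seeds):
    """Count grid points per region as a crude measure of each cell's area."""
    areas = [0] * n_seeds
    for row in labels:
        for label in row:
            areas[label] += 1
    return areas


# Hypothetical cell centers on a 10 x 10 grid.
seeds = [(2, 2), (7, 2), (4, 7)]
labels = voronoi_labels(seeds, 10, 10)
areas = region_areas(labels, len(seeds))

for row in labels:
    print("".join(str(label) for label in row))
print(areas)
```

Regions that share a border in the printed grid correspond to cells in contact, which is the adjacency information relevant to contact-dependent signaling.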
Linking geometric-based models to gene network simulations offers the opportunity to guide synthetic biocircuit design in silico such that specific cell morphologies can be engineered. Previously, this method has led to a complete representation of osteocyte cytoskeleton dynamics (Kardas et al., 2013). In conjunction, computational cell phenotyping enables changes in morphology to be quantitatively measured and tracked, such that the desired design can be achieved. Phenotyping techniques couple high-fidelity cell imaging with processing metrics to parse shape information (Chung et al., 2008;Sozzani and Benfey, 2011;Ryan et al., 2013). These shape metrics can facilitate the computer-aided design of synthetic networks.

STOCHASTICITY: GILLESPIE ALGORITHM AND MONTE CARLO METHODS
Perhaps the most significant research challenge in synthetic bioengineering is enabling the design of cellular systems that are robust to biological stochasticity (Chopra and Kamma, 2006;Purnick and Weiss, 2009). Existing gene circuit models are largely deterministic, behaving in highly reproducible ways. These models, as alluded to previously, present regulatory networks as homogeneous concentrations of molecules modulated by parameterized rate constants through coupled differential equations.
Yet there exists increasing evidence that biological networks and intracellular behavior are innately stochastic (Thattai and van Oudenaarden, 2001). Whereas noise effects are often assumed to be negligible at the population level, noise can play a significant role at the single-cell level, e.g., where a small number of molecular interactions may trigger a cascade of downstream protein signaling (Thattai and van Oudenaarden, 2001; Pedraza and van Oudenaarden, 2005). Furthermore, research indicates the phenomenon of noise propagation, in which cell-level stochasticity can accrue at the population level to create emergent behavior that deviates substantially from the desired target, a phenomenon recently documented in E. coli, leading to a loss of synchrony between cells (Hooshangi et al., 2005; Hornung and Barkai, 2008). Such studies suggest that complex synthetic systems cannot be engineered without first accounting for stochasticity in the circuit design.
Fortunately, there exists a wide variety of computational techniques to capture and predict this biological stochasticity at the systems level. One specific approach, known as the Gillespie algorithm, rejects the deterministic ODE approach of modeling chemical kinetics in favor of stochastic representations of molecular interactions (Gillespie, 2007). This algorithm explicitly simulates each "reaction" (or interaction event) along a network, with the timing and identity of each "reaction" dependent on both the rate properties and random sampling (Gillespie, 2007). For synthetic biology applications, these reactions can be defined as discrete regulatory steps along a specific gene circuit, allowing the effects of noise along the circuit to be well-characterized.
The Gillespie algorithm belongs to a larger class of stochastic modeling techniques known as Monte Carlo methods, which can be adapted to suit the needs of a specific biocircuit design (Athale, 2001). Monte Carlo methods, while varied in implementation, share the property of employing random simulations over many iterations to quantify properties of biological systems.
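A minimal sketch of Gillespie's direct method is given below for the simplest possible circuit element: a single protein species that is produced at a constant rate and degraded in proportion to its copy number. The rate constants, the function names, and the choice of this birth-death system are assumptions made for illustration. At each step, the waiting time to the next event is drawn from an exponential distribution set by the total propensity, and the event that fires is chosen with probability proportional to its own propensity.

```python
import random


def gillespie_birth_death(k_prod=10.0, k_deg=1.0, n0=0, t_end=50.0, seed=0):
    """Simulate production (rate k_prod) and degradation (rate k_deg * n) of one species."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod       # propensity of the production reaction
        a_deg = k_deg * n     # propensity of the degradation reaction
        a_total = a_prod + a_deg
        t += rng.expovariate(a_total)  # exponential waiting time to the next event
        if t > t_end:
            break
        # Choose which reaction fires, weighted by its share of the total propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


def time_average(times, counts, t_end):
    """Average copy number over [0, t_end], weighting each state by how long it lasted."""
    total = 0.0
    for i, n in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += n * (t_next - times[i])
    return total / t_end


times, counts = gillespie_birth_death()
mean_copies = time_average(times, counts, 50.0)
print(len(times), mean_copies)
```

The corresponding deterministic ODE would settle at exactly k_prod / k_deg = 10 molecules, whereas each stochastic trajectory fluctuates around that value; repeating the run with different seeds gives the Monte Carlo estimate of the noise.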

SPATIAL DEPENDENCIES: PARTICLE- AND LATTICE-BASED METHODS
Traditional synthetic biology designs are based on assumptions of biochemically homogeneous cell interiors, but for gene circuit designs of higher complexity, this set of assumptions is unlikely to hold (Agapakis et al., 2012). Often, the spatial information associated with a protein or pathway inside the cell can influence the end-behavior of a molecular network (Agapakis et al., 2012). In addition to variations in metabolic conditions (e.g., pH levels), spatial cues can also present as receptor or organelle localization, intracellular polarity, and even topological sequestration (Harold, 1991; Roze et al., 2011; Lee et al., 2012). Characterizing intracellular spatial dependencies and molecular dynamics becomes particularly important in mammalian cells, for which fine spatial organization of regulatory pathways is commonplace.
To this end, particle- and lattice-based computational techniques can be employed to model spatial systems within a synthetic cell (Spicher et al., 2011; Klann and Koeppl, 2012). Rather than simulate bulk flow, particle-based models track molecules separately and in discrete quantities (Takahashi et al., 2005), as alluded to above in the description of molecular dynamic models (see the Shape section). In systems biology, such methods have already been applied toward characterizing single-cell gradient sensing in the presence of multiple competitive ligands (Liou and Chen, 2012). Particle models could be similarly applicable to synthetic biology in engineering mammalian cells to function as fine-tuned hypoxic or nitric oxide sensors, in an effort to minimize effects of ischemic stroke, to name just one instance.
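The distinction between tracking molecules individually and simulating bulk flow can be seen in a short Brownian dynamics sketch. Here each molecule is a separate particle diffusing in a one-dimensional compartment with reflecting walls. The compartment length, diffusion coefficient, time step, and release point are hypothetical values, and the function name is illustrative; this does not reproduce any of the simulators cited in this section.

```python
import random


def diffuse_particles(n=200, length=10.0, diff_coeff=0.5, dt=0.01, steps=500, seed=0):
    """Track n molecules individually as they diffuse in a 1D box of the given length."""
    rng = random.Random(seed)
    sigma = (2.0 * diff_coeff * dt) ** 0.5   # standard deviation of one Brownian step
    positions = [0.0] * n                    # all molecules released at the left wall
    for _ in range(steps):
        for i in range(n):
            x = positions[i] + rng.gauss(0.0, sigma)
            # Reflecting boundaries keep each molecule inside the compartment.
            if x < 0.0:
                x = -x
            if x > length:
                x = 2.0 * length - x
            positions[i] = min(max(x, 0.0), length)  # guard against extreme steps
    return positions


positions = diffuse_particles()
near_source = sum(1 for x in positions if x < 2.0)
print(near_source, len(positions) - near_source)
```

Because every molecule has its own coordinate, local copy numbers (such as the count near the release point) are discrete and fluctuate between runs, which is information a homogeneous-concentration model cannot provide.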
The complexity of particle models is mitigated by the availability of open source simulators, including E-Cell and ChemCell (Klann and Koeppl, 2012). Many of these implementations also allow particle simulations to be combined with models of other classes. As an example, a spatial derivative of the Gillespie algorithm can integrate stochastic modeling with space-dependent computation (Takahashi et al., 2005).
Spatial modeling can also be performed using PDE models; examples include gene circuits defining chemical diffusion-mediated interactions between localized cell populations (Song et al., 2009). In other applications to synthetic biology, these spatial techniques have been combined with mechanistic models, such as kinetic RNA folding simulations, to provide fine-tuned control of gene expression along a specific component of a regulatory pathway (Carothers et al., 2011). PDE formalisms offer relative simplicity of construction compared to other spatiotemporal methods, with the caveat of not being well-suited to highly heterogeneous spatial environments.
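For comparison with particle tracking, a PDE description treats the signal as a continuous concentration field. The sketch below integrates the one-dimensional diffusion equation with an explicit finite-difference scheme and no-flux boundaries. The grid, the initial pulse (standing in for a signal secreted by a localized cell population), and the parameter values are illustrative assumptions; the explicit scheme is used for brevity and is only stable when D*dt/dx^2 is at most 0.5.

```python
def diffuse(conc, diff_coeff=1.0, dx=1.0, dt=0.2, steps=100):
    """Integrate du/dt = D * d2u/dx2 on a 1D grid with no-flux (reflecting) boundaries."""
    r = diff_coeff * dt / dx ** 2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for D*dt/dx^2 > 0.5")
    u = list(conc)
    for _ in range(steps):
        # Mirror the edge values so that no material leaves the domain.
        left = [u[0]] + u[:-1]
        right = u[1:] + [u[-1]]
        u = [ui + r * (lo - 2.0 * ui + hi) for ui, lo, hi in zip(u, left, right)]
    return u


# Hypothetical initial condition: a pulse of signal at the center of 21 grid points.
initial = [0.0] * 10 + [100.0] + [0.0] * 10
final = diffuse(initial)
print([round(v, 1) for v in final])
```

The update rule is only a few lines long, which reflects the relative simplicity of construction noted above, but a single smooth field cannot represent a compartment containing only a handful of molecules.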

APPLICATIONS FOR MAMMALIAN CELLS AND HUMAN HEALTH
Until now, the overwhelming focus of research and progress in synthetic biology has been on prokaryotic cells: mostly bacteria (commonly E. coli) and assorted microbes. This is a natural consequence of the knowledge gap described previously; prokaryotic cells are orders of magnitude simpler than eukaryotic ones, not to mention easier to manipulate. They could be said to represent the "crawling" stage of synthetic biology. However, if the ultimate goal of the discipline is to uncover novel therapeutic targets and treatments in biomedicine, such strict characterization of nonmammalian systems will restrain our ability to advance human health. In the end, we must learn to walk. To do so means confronting the complexity that the in vivo mammalian system brings. Methods already employed in systems biology to characterize this complexity can open up the boundaries of modern medicine. As an example, it is not difficult to imagine a future where computational models enable the design of synthetic neural progenitor cells programmed to promote recovery post-ischemic stroke. To foster an era of personalized medicine, this potential could revolutionize the manner in which we approach tissue engineering: cells grown en masse, and then programmed to meet the specific needs of the patient. Moreover, such customizable cells would permit targeted regeneration to a degree that simple stem cell treatments cannot achieve. Such innovations, while distant, are attainable, but they necessitate the coupling of systems approaches with synthetic biology.

CONCLUDING COMMENTS
Bringing the sister disciplines of synthetic and systems biology closer together could recast the gene circuit paradigm, and enhance our ability to engineer and program cells for applications across energy, computing and biomedicine. Leveraging a computational toolkit refined by systems biologists for the last half-century offers a unique catalyst to help pave the future of synthetic biology.