Agent-based models in cellular systems

This mini-review discusses agent-based models as modeling techniques for studying pattern formation of multi-cellular systems in biology. We introduce and compare different agent-based model frameworks with respect to spatial representation, microenvironment, intracellular and extracellular reactions, cellular properties, implementation, and practical use. The guiding criteria for the considered selection of agent-based model frameworks are that they are actively maintained, well documented, and provide a model development workflow.

from data, and how to explore their behavior, e.g., by sensitivity analysis or bifurcation analysis; see, e.g., [11][12][13][14][15][16]. PDEs are powerful tools; however, in a PDE model cells have no spatial extent, so the underlying cellular spatial structure is ignored. Although it is possible in a PDE to distinguish between the inside of the cells and their microenvironment, there is no unique or canonical way to handle, e.g., cell proliferation, differentiation, internal cell structure and other properties of the cells, which may be relevant for the questions at hand. This problem can be addressed by cellular automata (CA) [17], which consist of a regular grid with a finite number of states at each grid point and rules that determine how these states are updated. ABMs [18][19][20] were introduced as a further development of CA; here, cells are handled as individual agents with rules, potentially with an internal structure and/or the ability to move in space. By using coarse-graining or homogenization techniques one could derive a system of coupled PDEs from an ABM; there is, however, no unique way to go from a PDE to an ABM [21].

Agent-based models
An ABM is a collection of autonomous agents with a predefined set of rules, which depend on the existing state of the agent and external factors [22][23][24][25]. The rules can be discrete, following logical if-else statements; continuous, e.g., ordinary differential equations (ODEs) for intracellular reactions; or a combination of both. Graphs, neural networks and other intricate algorithms can also be implemented [26]. Nevertheless, one usually strives to employ the simplest set of rules sufficient to accurately describe the complexity of the desired system. Compared to macroscopic PDE models, ABMs are considered microscopic models, since they deal with agents directly, and are thus more common in a bottom-up approach [27]. ABMs should not be seen as a technologically distinct toolset but rather as a mindset in which researchers model complex systems from the perspective of their individual constituents.
Historically, the precursors to ABMs were cellular automata, which were introduced in [17]. They reached widespread recognition even among the general public with the introduction of Conway's "Game of Life" [28,29]. Not long after, the first ABMs were envisioned to study a biological system [18]. Up until the turn of the century, ABMs were used in many fields of research, such as modeling human crowd stampedes [30], bird flocks [19] or the prediction of financial markets [31]. With the rapidly growing accessibility and power of modern computer hardware, the popularity of ABMs continued to increase, and tools such as NanoHUB [32] or the Systems Biology Markup Language (SBML) [33] further helped researchers share computational models. In order to study complex phenomena such as pattern formation, ABMs must be able to capture cell-cell communication and cellular response mechanisms [10][34][35][36][37]. In the next section we compare the available ABM frameworks and discuss how they cover different cellular properties.
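As a minimal sketch of how discrete and continuous rules can coexist in one agent, the following Python example combines an explicit Euler step for a single intracellular concentration with an if-else division rule. The class `Cell`, the function `step`, the one-dimensional position and all parameter names and values are illustrative assumptions, not taken from any of the frameworks discussed here.

```python
from dataclasses import dataclass
import random


@dataclass
class Cell:
    x: float       # position along a 1D domain (hypothetical units)
    volume: float  # current cell volume
    conc: float    # intracellular concentration of one species


def step(cells, external, dt, growth=0.1, decay=0.5, uptake=1.0,
         division_volume=2.0, rng=random):
    """Advance all agents by one time step and return the new population."""
    new_cells = []
    for c in cells:
        # Continuous rule: explicit Euler step of
        # d(conc)/dt = uptake * external(x) - decay * conc
        c.conc += dt * (uptake * external(c.x) - decay * c.conc)
        # Continuous rule: volume grows in proportion to the concentration
        c.volume += dt * growth * c.conc
        # Discrete rule: divide once the volume threshold is reached
        if c.volume >= division_volume:
            half = c.volume / 2.0
            c.volume = half
            # The daughter inherits the concentration and is placed nearby
            new_cells.append(Cell(c.x + rng.uniform(-0.1, 0.1), half, c.conc))
        new_cells.append(c)
    return new_cells


# Usage: one cell in a uniform external field, simulated for 200 steps
population = [Cell(x=0.0, volume=1.0, conc=0.0)]
for _ in range(200):
    population = step(population, external=lambda x: 1.0, dt=0.1)
```

Keeping the rules inside a single `step` function mirrors the idea that the behavior of the whole system follows from what each agent does locally; richer rule sets would replace the two update lines and the division test.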

Comparison of ABMs
The effort of writing efficient solving algorithms and data structures in a usable fashion is considerable. Therefore, agent-based model frameworks (ABMFs) have emerged that define a certain workflow and implement a set of features, so that users of the frameworks can focus on their research question instead of having to spend a significant amount of time on design and implementation.
The majority of cell-agent-based model frameworks (CABMFs) evolved as generalizations of solutions to specific problems. BSim [38] was specifically designed to model bacterial populations and has been used to study gene regulatory control [39] and bacterial biofilms [40]. Chaste was designed as a Cancer, Heart and Soft Tissue Environment [41] and has been used in studying growth of epithelial monolayers [42]. CompuCell3D [43] originated from CompuCell [44], which was one of the first frameworks created and was originally used to model only simple reaction-diffusion (RD) systems, but has since been extended considerably to cover a wider range of topics such as angiogenesis [45], cancer [46] and tissue engineering [47]. EPISIM was used to understand how varying proportions of T cells emerge in different vertebrate taxa [48]. Morpheus [49] was applied to self-organization in neural stem cell divisions in adult zebrafish [50] and polarization of the multiciliated planarian epidermis [51]. MultiCellSim [52] resulted from the in-depth analysis of cell-cell communication and has since been applied to immuno-oncology [53]. PhysiCell [54] is mainly used for modeling cancer and tumor dynamics [55,56]. TiSim/CellSys [57] was applied to liver regeneration processes [58]. VirtualLeaf [59] was specifically designed for modeling plants and emphasizes intercellular connections and details of the mechanical properties of the cell wall. Table 1 displays general characteristics of these modeling frameworks.

Spatial representation
A key distinction between ABMs lies in the spatial representation of cells and chemicals. ABMs can be separated into lattice-based and lattice-free models, the former meaning that cells can only migrate between predefined lattice nodes, while the latter permits free movement of cells in a given domain. Frameworks such as Chaste, PhysiCell, TiSim/CellSys and VirtualLeaf utilize off-lattice motion. Chaste also supports lattice-based methods for cell migration, as do CompuCell3D and Morpheus. In lattice-based methods no particular cellular shape is modeled explicitly; rather, cells follow rules (often potentials) that determine their respective quantity on lattice points. The lattice-based approach is limited in spatial resolution but can in turn yield considerable performance improvements. Off-lattice models often take a cell-centre approach [60], meaning that a cell is defined by a single location vector and a shape (such as a sphere, ellipsoid or cylinder) that governs interactions. BSim additionally has the ability to represent microbes as meshed objects, thus offering a much higher resolution at the micro-scale, although at increased computational cost. Another, less common modeling choice is to use a vertex model [61, 62] that represents each cell by a polygon, determined by a number of vertices, which can be subject to external forces, pressure, friction, adhesion, chemotaxis and other external and internal contributing factors. Lattice-bound models can utilize different discretizations such as regular Cartesian, hexagonal or triangulated meshes. Most of the frameworks presented in Table 1 can be used to simulate two-dimensional (2D) as well as three-dimensional (3D) scenarios. The Cellular Potts Model, also known as the Glazier-Graner-Hogeweg (GGH) model [63,64], is a common choice for many frameworks.
Typically, in a Cellular Potts Model (CPM) a Hamiltonian is formulated which describes the phenomenological "energy" of a given configuration of the system on a Euclidean lattice. Subsequently, the system is evolved by minimizing the energy. LBIBCell modifies the classical CPM approach by representing cells as evolving polygons with the immersed boundary method and thus obtains off-lattice cellular representations [65, 66].
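For orientation, a commonly quoted textbook form of such a Hamiltonian, together with the Metropolis-type acceptance rule used to lower the energy, is sketched below. The symbols are assumptions introduced for this illustration: \sigma_i is the index of the cell occupying lattice site i, \tau(\sigma) its cell type, J a type-dependent contact energy, \lambda the strength of the volume constraint, V(\sigma) and V_T(\sigma) the current and target volumes, and T an effective temperature. Individual frameworks may add or replace terms.

```latex
% Contact (adhesion) energy between neighboring sites of different cells
% plus a quadratic volume constraint per cell
H = \sum_{\langle i,j \rangle} J\bigl(\tau(\sigma_i), \tau(\sigma_j)\bigr)
      \bigl(1 - \delta_{\sigma_i,\sigma_j}\bigr)
  + \lambda \sum_{\sigma} \bigl(V(\sigma) - V_T(\sigma)\bigr)^2

% Acceptance probability for a proposed change of a lattice site,
% where \Delta H is the resulting change in energy
P(\text{accept}) =
  \begin{cases}
    1, & \Delta H \le 0, \\
    \exp(-\Delta H / T), & \Delta H > 0.
  \end{cases}
```

The first sum runs over pairs of neighboring lattice sites, so only cell boundaries contribute to the contact energy; the stochastic acceptance of energy-increasing moves allows the configuration to fluctuate rather than settle in the nearest local minimum.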

External microenvironment
Transport processes of chemicals typically involve numerically solving (convection-)diffusion equations [67,68] on an (often Euclidean) mesh, with cell-to-extracellular-matrix interaction nodes at the positions of the cellular agents. One exception is VirtualLeaf, where intracellular compartments are connected via membranes to adjacent cells and transport is modeled through membrane potentials [59]. Many ABMs utilize PDEs to model intracellular or extracellular transport processes such as convection and diffusion and allow for custom forms of reactions, either via well-defined user interfaces as in Morpheus [49] or by direct implementation in the source code.
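To illustrate the coupling between agents and a diffusing chemical, the sketch below performs one explicit finite-difference step of a diffusion equation with decay and source terms on a regular 2D grid, with cells depositing their secretion into the grid node that contains them. The functions `diffuse_step` and `secretion_field`, the zero-flux boundary treatment and all parameters are assumptions made for this example; production frameworks generally use more elaborate solvers.

```python
def diffuse_step(u, D, dx, dt, sources=None, decay=0.0):
    """One forward-Euler step of du/dt = D*laplace(u) - decay*u + sources
    on a 2D grid (list of rows) with zero-flux boundaries."""
    if D * dt / dx**2 > 0.25:
        raise ValueError("explicit scheme needs D*dt/dx^2 <= 0.25 in 2D")
    ny, nx = len(u), len(u[0])
    new = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            # Clamping the index at the border reuses the node's own value,
            # which imposes a zero-flux (Neumann) boundary
            up = u[max(j - 1, 0)][i]
            down = u[min(j + 1, ny - 1)][i]
            left = u[j][max(i - 1, 0)]
            right = u[j][min(i + 1, nx - 1)]
            lap = (up + down + left + right - 4.0 * u[j][i]) / dx**2
            src = sources[j][i] if sources is not None else 0.0
            new[j][i] = u[j][i] + dt * (D * lap - decay * u[j][i] + src)
    return new


def secretion_field(cell_positions, nx, ny, dx, rate):
    """Map agents at continuous (x, y) positions to source terms on the grid."""
    src = [[0.0] * nx for _ in range(ny)]
    for x, y in cell_positions:
        i = min(max(int(x / dx), 0), nx - 1)
        j = min(max(int(y / dx), 0), ny - 1)
        # Secretion rate per unit time, spread over the area of one grid node
        src[j][i] += rate / dx**2
    return src


# Usage: two secreting cells on a 20 x 20 grid
field = [[0.0] * 20 for _ in range(20)]
sources = secretion_field([(3.2, 4.7), (15.0, 10.5)], 20, 20, dx=1.0, rate=1.0)
for _ in range(100):
    field = diffuse_step(field, D=1.0, dx=1.0, dt=0.2, sources=sources, decay=0.1)
```

The stability check reflects the well-known time-step restriction of the explicit scheme, which is one reason why diffusion is often advanced with a smaller time step than cell-level processes.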

Cellular processes
In an agent-based approach the processes occurring inside a cell can naturally be described by giving the agents the required set of functions. Each framework mentioned in Table 1 implements proliferation and cell-death mechanisms as key components. However, predefined and detailed cell-cycle routines such as those utilized in PhysiCell [54] are less common, but are important to consider if, e.g., external factors such as growth hormones affect the cell cycle [69]. In addition, internal chemicals may be released upon cell death. In order to model developmental processes such as embryogenesis, the framework needs to support cell differentiation with dynamic modifications of the phenotype. Cell polarity can play an important role in many phenomena, such as in ciliary rootlets in the planarian epidermis [51]. Many frameworks, such as CompuCell3D, Chaste, Morpheus and VirtualLeaf, support this feature. The geometry of the cell includes its spatial representation together with mechanical features such as adhesion and repulsion. PhysiCell utilizes spheroid/ellipsoid cellular geometries, meaning each cell is represented by a sphere or ellipsoid and a corresponding potential. Further, adhesion plays an important role in cell-cell interactions and communication. Lattice-free frameworks often model it by choosing a particular form of interaction potential. One sophisticated example is the experimental Johnson-Kendall-Roberts (JKR) potential [70], which was derived from the Hertz contact model [71]. It also models cell separation and is implemented by CellSys. Frameworks that implement a CPM treat adhesion via interaction terms in their Hamiltonian formulation [72]. In the context of vertex models, force potentials can also be utilized, although the implementation is often more complex.
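As a simple illustration of a contact-based interaction between spherical cells, the sketch below evaluates the classical Hertz force between two elastic spheres. It covers only the repulsive Hertz part; the adhesive contribution that distinguishes the JKR model is deliberately omitted. The function name `hertz_force` and the parameter names are assumptions for this example and do not reproduce the implementation of any framework.

```python
import math


def hertz_force(distance, r1, r2, e1, e2, nu1, nu2):
    """Repulsive Hertz contact force between two elastic spheres.

    distance : centre-to-centre distance
    r1, r2   : sphere radii
    e1, e2   : Young's moduli
    nu1, nu2 : Poisson ratios
    """
    overlap = r1 + r2 - distance
    if overlap <= 0.0:
        return 0.0  # spheres are not in contact
    # Effective modulus and effective radius of the contact
    e_eff = 1.0 / ((1.0 - nu1**2) / e1 + (1.0 - nu2**2) / e2)
    r_eff = r1 * r2 / (r1 + r2)
    # Hertz law: F = (4/3) * E_eff * sqrt(R_eff) * overlap^(3/2)
    return (4.0 / 3.0) * e_eff * math.sqrt(r_eff) * overlap**1.5


# Usage: two identical cells at decreasing centre-to-centre distances
for d in (2.0, 1.8, 1.5):
    print(d, hertz_force(d, r1=1.0, r2=1.0, e1=1.0, e2=1.0, nu1=0.4, nu2=0.4))
```

In an off-lattice model, such a pairwise force would typically be summed over all neighbors in contact and then used to update the positions of the cell centres.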

Implementational details

Development, standards and features
Development and design of efficient algorithms and their implementation require knowledge in software engineering and in writing maintainable code, as these frameworks are usually developed by teams rather than by individuals and consist of many thousands of lines of code. The Chaste framework was one of the first projects to follow agile coding principles and other best-practice workflows such as rigorous unit-testing [76]. All presented CABMFs are written in C++, which together with C and Fortran has historically served as the de facto language for high-performance software development. In addition to CABMFs, researchers have over the last two decades developed internationally recognized formats to seamlessly share model details (e.g., SBML). This is utilized in Chaste, CompuCell3D, Morpheus and PhysiCell and allows for rapid model development, implementation and comparison to classical ODE and PDE solvers. CompuCell3D is also able to model physiologically based pharmacokinetics (PBPK). Additionally, many frameworks come with dedicated tools, sometimes with graphical user interfaces (GUIs), for configuration, analysis, batch-processing, visualization and other workflow-aiding features, which are valuable additions. In this regard, EPISIM is special as it utilizes the popular COPASI [77] and Mason [78] software and plugins for the Eclipse code editor [79] to build the application.
Studying pattern formation with agent-based models

Applications
Pattern formation in cellular biological systems can occur via self-assembly or self-organization, and ABMs have been applied to investigate both aspects. Chaste was used to study cell migration in the crypt [80]. Furthermore, CompuCell3D provided examples for self-organization in work on polarization [81] and studies of physical forces [82] in migrating cells. Morpheus was used to describe pattern formation in the telencephalon of adult zebrafish [83] and was also used to study growth of the Drosophila wing via cell recruitment [84]. PhysiCell recently provided insights into the formation of patterns in tumour spheroids [56]. Pattern formation in dicot leaves was modeled using VirtualLeaf [85]. ABMs allow researchers to examine complicated models which would otherwise be hard to study and interpret with classical PDEs. Figure 1 shows results of a multi-scale model using PhysiCell [54]. We can observe that the pattern changes as the number of patterning cells (type I) increases. This simple example shows how models can be readily formulated and explored in an ABM mindset, in this case by increasing the cell number. Constructing a corresponding PDE model is much harder and not uniquely defined.

Techniques and challenges
CABMFs allow researchers to investigate biological systems on the cellular level with the option to implement many details, with the downside of substantial computational cost. To combat this issue, all presented frameworks are of multi-scale nature.
The relevant time and length scales are identified and the corresponding sub-processes are modeled and updated according to their scales. This can greatly improve performance since, for example, diffusion-driven processes tend to be much faster than cell migration or phenotypic processes [86]. Other techniques to improve performance are efficient O(N_cells) implementations of algorithms [87] to calculate direct cell-cell interaction partners, spreading the computational load over multiple processor cores (for example via OpenMP [88]), or offloading work to specialized devices, such as solving PDEs on a graphics processing unit (GPU) [89]. Due to the stochastic nature of ABM simulations, appropriate statistical methods need to be applied, which is often complicated by the fact that the transient developmental processes under study do not necessarily reach a stationary state. Analysis of the simulation and comparison with experimental data requires the definition of precise features which are extracted from the simulation results. It is important to define clear goals and questions upfront, as this will guide the process of feature extraction and dimensional reduction. To this end machine learning techniques are becoming more and more popular for analysis of ABM results

FIGURE 1
We implemented an RD system (see also Supplementary Equations SB1-SB4) in an ABM to showcase results. The simulation contains two distinct cell types, which are both motile and initially randomly distributed. Cell type I (blue-shaded, white border) obeys reaction equations given by a substrate-depletion system [36] and is colored by its internal concentration of the activator. Cell type II (orange) is smaller than cell type I and is chemotactically attracted by the activator, which is secreted by cell type I. The background displays the density profile of the secreted activator molecule (yellow: high density, blue: low density). The number of type I cells is increased from panel (A) to (D) (256, 484, 1,024, 2,025), while the number of type II cells remains fixed at 3,000. Cell death reduces the overall number of agents. The pictures show the final state of the simulation after reaching a steady state (up to statistical fluctuations). The variations in cell number alone lead to different emerging patterns. While these results may be obtainable by a modified purely PDE-based approach, they are much easier to interpret and develop in an ABM. The simulations were carried out using PhysiCell [54].
Frontiers in Physics frontiersin.org [90,91]. Given current advancements in machine learning, researchers are hopeful that image classification of patterns and self-organizing systems can get more automated in the future [90,92]. The authors of TiSim/CellSys have explicitly suggested an image-to-model workflow [93]. Neural networks showed promise in partly replacing analysis procedures [94].
Other machine learning techniques can also be used to determine rules for agents and calibrate the model [95]. Autoencoders [96] may provide a way to obtain a dimensionally reduced representation of complex ABM simulation results. Due to the mechanistic and "law-driven" nature of ABMs, often multiple unknown parameters need to be determined or estimated from data. Parameters can be estimated by comparing features extracted from experimental data and from simulation results, which is already a substantial effort. However, this process will usually yield uncertainties, which need to be quantified, as it is not sufficient to evaluate the model locally in parameter space using a sometimes arbitrarily chosen parameter set. In order to focus on the relevant parameters, sensitivity analysis is an important tool, which can also be used for model reduction [15]. Due to the highly integrated nature of ABMs, sensitivity analysis is demanding and incurs substantial computational costs [97]. Consequently, it is often only possible to arrive at qualitative statements for complex ABM simulations.
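One of the simplest forms of sensitivity analysis can be sketched as follows: each parameter is perturbed in turn, the stochastic simulation is repeated several times, and a normalized central-difference sensitivity of a scalar feature is reported. This is a local, one-at-a-time approach and only an illustration; the functions `mean_feature` and `local_sensitivities`, the stand-in simulation `toy_model` and all numerical settings are assumptions for this example, and global methods are usually preferable when parameters interact strongly.

```python
import random
import statistics


def mean_feature(simulate, params, n_rep, seed):
    """Average a scalar feature over n_rep stochastic replicates."""
    values = [simulate(params, random.Random(seed + k)) for k in range(n_rep)]
    return statistics.fmean(values)


def local_sensitivities(simulate, params, rel_step=0.05, n_rep=20, seed=0):
    """Normalized one-at-a-time sensitivities (dF/dp) * (p / F)."""
    base = mean_feature(simulate, params, n_rep, seed)
    if base == 0.0:
        raise ValueError("feature mean is zero; normalization undefined")
    result = {}
    for name, value in params.items():
        if value == 0.0:
            raise ValueError(f"parameter {name!r} is zero; use an absolute step")
        h = rel_step * value
        up = dict(params, **{name: value + h})
        down = dict(params, **{name: value - h})
        # Reusing the same seeds for both runs (common random numbers)
        # reduces the noise in the difference
        diff = (mean_feature(simulate, up, n_rep, seed)
                - mean_feature(simulate, down, n_rep, seed))
        result[name] = diff / (2.0 * h) * value / base
    return result


def toy_model(params, rng):
    """Hypothetical stand-in for an ABM run returning one noisy feature."""
    return params["a"] ** 2 * params["b"] + rng.gauss(0.0, 0.01)


# Usage: the feature scales as a^2 * b, so the normalized sensitivities
# are expected to be close to 2 for "a" and close to 1 for "b"
print(local_sensitivities(toy_model, {"a": 2.0, "b": 3.0}))
```

Each parameter requires two additional batches of replicates, which makes the cost grow linearly with the number of parameters and illustrates why sensitivity analysis of a full ABM is computationally demanding.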

Discussion
This review introduced the concepts of agent-based models in cellular systems. We compared different frameworks with respect to their conceptual and implementational differences. To date, a large number of different agent-based model frameworks with different strengths and weaknesses exist and are openly available. The multitude of options is a clear indication of the overall interest in the subject. ABMs provide a unique tool to integrate combinations of processes and study their respective dynamics. ABMs can be used even for the exploration of systems that lack sufficient data, as they can be developed initially with rather simplified rule sets, by means of which researchers can generate hypotheses, which can in turn guide the design of laboratory experiments. Through this cycle of experimental and computational methods, both the model and the experiments can be improved, ultimately increasing the conceptual knowledge about the system. For this reason, it is important to understand the challenges of ABMs and their limitations. ABMs can be seen as a mapping of specific rules to spatial configurations. This mapping is non-unique, which raises the question of how the results of the ABM depend on the set of rules and the parameters used. How are the values (or distributions) of the parameters estimated? How does the uncertainty in the system parameters affect the predictions of the simulations? In particular, when analyzing the (often stochastic) results of a simulation, one needs to quantify the influence of the parameter uncertainty, which is a considerable challenge. Despite these questions and challenges, ABMs can be expected to quickly become mainstream tools in biology.

Author contributions
CF and JP devised the study. JP wrote the first draft of the manuscript. JP and CF revised, read, and approved the submitted version.

Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.