AUTHOR=Sivakumar Nikita, Mura Cameron, Peirce Shayn M. TITLE=Innovations in integrating machine learning and agent-based modeling of biomedical systems JOURNAL=Frontiers in Systems Biology VOLUME=Volume 2 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/systems-biology/articles/10.3389/fsysb.2022.959665 DOI=10.3389/fsysb.2022.959665 ISSN=2674-0702 ABSTRACT=Agent-based modeling (ABM) is a well-established computational paradigm for simulating complex systems in terms of the interactions between the individual entities that make up the system’s population. Machine learning (ML) refers to computational approaches in which algorithms use statistical methods to ‘learn’ from data on their own, i.e., without imposing any a priori model or theory onto a system and its behavior. Biological systems (ranging from molecules, to cells, to entire organisms, to whole populations and even ecosystems) consist of vast numbers of discrete entities, typically governed by complex webs of interactions that span various spatiotemporal scales and exhibit nonlinearity, stochasticity, and variable degrees of coupling between entities. For these reasons, the macroscopic properties and collective dynamics of biological systems are generally difficult to model or predict accurately via continuum approaches (differential equations) and mean-field formalisms. ABM takes a ‘bottom-up’ approach that avoids common difficulties of other modeling approaches: it makes it relatively easy to create (or at least propose, for testing) well-defined ‘rules’ that are applied to the individual entities (agents) in a system. Evaluating these rules and propagating the system’s state over a series of discrete time-steps effectively simulates the system, allowing observables to be computed and the system’s properties to be analyzed.
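To make the 'rules applied to agents over discrete time-steps' idea concrete, the following is a minimal sketch of an agent-based simulation, not a model taken from the review. It assumes a toy susceptible/infected/recovered (SIR) population; the function name `run_sir_abm` and every parameter (`beta`, `gamma`, `contacts_per_step`, and so on) are hypothetical choices made for illustration only.

```python
import random


def run_sir_abm(n_agents=200, n_initial_infected=5, beta=0.3, gamma=0.1,
                contacts_per_step=3, n_steps=50, seed=0):
    """Toy SIR agent-based model (illustrative assumption, not from the source).

    Each agent is in state "S", "I", or "R". Returns a list with one
    dict of state counts (the observables) per time-step.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    states = ["I"] * n_initial_infected + ["S"] * (n_agents - n_initial_infected)
    history = []
    for _step in range(n_steps):
        # Synchronous update: rules read the old state, write the new one.
        new_states = list(states)
        for i, state in enumerate(states):
            if state != "I":
                continue
            # Rule 1: an infected agent meets random agents and may
            # infect each susceptible one with probability beta.
            for _contact in range(contacts_per_step):
                j = rng.randrange(n_agents)
                if states[j] == "S" and rng.random() < beta:
                    new_states[j] = "I"
            # Rule 2: an infected agent recovers with probability gamma.
            if rng.random() < gamma:
                new_states[i] = "R"
        states = new_states
        # Observable: how many agents are in each state at this step.
        history.append({k: states.count(k) for k in "SIR"})
    return history


history = run_sir_abm()
print(history[-1])  # state counts after the final time-step
```

The macroscopic epidemic curve is never written down explicitly here; it emerges from the two per-agent rules, which is the sense in which ABM is 'bottom-up'.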
Because the rules that govern an ABM can be difficult to abstract and formulate from experimental data, at least in an unbiased way, there is a uniquely synergistic opportunity to employ ML to help infer optimal, system-specific ABM rules. Once rule-sets are devised, running ABM calculations can generate a wealth of data, and ML can be applied in that context too, for example to generate statistical measures that accurately and meaningfully describe the stochastic outputs of a system and its properties. As an example of synergy in the other direction (from ABM to ML), one can imagine using ABM simulations to generate plausible (realistic) datasets for training ML algorithms (e.g., for regularization, to mitigate overfitting). In these ways, one can envision a variety of synergistic ABM⇄ML loops. This review summarizes how ABM and ML have been integrated in contexts that span spatial scales, from multicellular or tissue-level biology to human population-level epidemiology.
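One such ABM⇄ML loop can be sketched in a few lines, under heavy simplifying assumptions. Here a toy cell-division ABM generates labeled synthetic data (ABM to ML), and a k-nearest-neighbour regressor, used as a deliberately simple stand-in for an ML method, then estimates the rule parameter behind an 'observed' outcome (ML to ABM). The model, the function names (`simulate_growth`, `make_training_set`, `knn_estimate`), and all parameter values are hypothetical and are not drawn from the review.

```python
import math
import random


def simulate_growth(p_divide, n_initial=10, n_steps=10, rng=None):
    """Toy ABM: each cell agent divides with probability p_divide per step.

    Agents are stored as generation numbers. Returns the final cell count.
    """
    if rng is None:
        rng = random.Random(0)
    cells = [0] * n_initial
    for _step in range(n_steps):
        # Rule: every existing cell may produce one daughter this step.
        daughters = [gen + 1 for gen in cells if rng.random() < p_divide]
        cells.extend(daughters)
    return len(cells)


def make_training_set(p_values, replicates, rng):
    """ABM -> ML: simulate many runs to get (log final count, p) pairs."""
    data = []
    for p in p_values:
        for _rep in range(replicates):
            data.append((math.log(simulate_growth(p, rng=rng)), p))
    return data


def knn_estimate(training, observed_count, k=5):
    """ML -> ABM: estimate the rule parameter p from one observed count
    by averaging the labels of the k nearest simulated runs."""
    x = math.log(observed_count)
    nearest = sorted(training, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(p for _, p in nearest) / len(nearest)


rng = random.Random(42)
p_grid = [0.05 * i for i in range(1, 11)]  # candidate values 0.05 ... 0.50
training = make_training_set(p_grid, replicates=20, rng=rng)

observed = simulate_growth(0.3, rng=rng)  # stands in for experimental data
p_hat = knn_estimate(training, observed)
print(f"observed count = {observed}, estimated p_divide = {p_hat:.2f}")
```

In practice the summary statistic, the learner, and the rule-set would all be far richer; the point of the sketch is only the direction of data flow in each half of the loop.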