Grand Challenges in Computational Physiology and Medicine

in terms of statistical inference. Models must be able to reconstruct the data on which they are based, and this should be demonstrated clearly when publishing models. Second, models must make predictions (hypotheses) that can be tested experimentally. Model predictions may or may not be supported by results from the experimental studies they motivate. In either case, by using models to generate testable hypotheses about system function, we can better appreciate what is and is not known about the mechanisms underlying function. All models will fail at some point. Such failures can be highly informative, and modelers must be willing to communicate both the successes and failures of their models to the community. By following this approach, a closed loop can be established in which model-based hypothesis generation motivates experimental testing and model refinement. Third, a goal of modeling should be to fit into the paradigm of upward and downward pathways of causality. That is, if function does not solely reside at any one level of biological organization, then the process of developing and refining models based on data obtained at a specific hierarchical level should build toward the goal of identifying and accurately capturing the behavior of those variables which are the connection points between levels. This is the only way in which integrative (often referred to as multi-scale) models that provide deep insights into physiological function will ever be developed. Finally, models of disease, which will likely take the form of normal physiological systems models operating in a different parameter regime (Belair et al., 1995; Huang et al., 2009), must not only be reconstructive and predictive, they must guide the way to improved therapies.

Grand Challenges
Achieving the expectations regarding mechanistically based computational models of physiological function in health and disease outlined above is itself a grand challenge. Because of the inherent complexity of real biological systems, attempts to intuit their behavior using "mental models" often fail. Instead, the development and analysis of computational models based directly on experimental data is proving to be a powerful approach. We refer to this as Computational Physiology: modeling directed at achieving a quantitative understanding of the normal functional processes of an organism, organ, or system. Such models provide the foundation for understanding perturbed physiological function in disease, an approach we refer to as Computational Medicine.
Computational modeling is now a core component of many scientific disciplines including physics, chemistry, economics, psychology, public health, and most recently biology. One power of computational modeling is that the scale of models that can be formulated and analyzed far exceeds anything that can be done using more traditional paper-and-pencil mathematical approaches. This is particularly key in the extraordinarily complex realm of physiology and medicine. There is no question that understanding how function emerges as a consequence of connectivity both within and across the organizational levels of molecules, pathways, cells, tissues, and organs is the next major frontier of biomedical science, and computation will play a key role in achieving this understanding. The impact of computational medicine on society will be particularly profound. Our understanding of disease has only scratched the surface, and we will never have the ability to develop effective, personalized therapies unless we can truly understand the link between molecules and phenotypes at a quantitative, mechanistic level.
What should we expect of computational models in physiology and medicine? First, models should have a firm basis in experimental data. To gain deep insights into physiological function, models should be mechanistic rather than formulated solely

Overview
The nature of basic biological research has been transformed during the past decade. This transformation has been driven in part by the development of new technologies for high-throughput data acquisition that now make it possible to sequence genomes (Metzker, 2009), and to measure RNA (Wang et al., 2009) and protein (Maerkl, 2011) expression levels with ever increasing accuracy and lower cost. These extraordinary achievements have contributed to advances in genetic medicine (Amberger et al., 2009) and the discovery of gene and protein signatures of disease (Wood et al., 2007; Hanash and Taguchi, 2010; Addona et al., 2011; Cancer Genome Atlas Research Network, 2011; Heidecker et al., 2011; Majewski and Bernards, 2011). There is, however, growing recognition that the tabulation of molecular building blocks from which biological systems are composed is not sufficient for understanding the functional properties of these systems. Indeed, function does not exclusively arise from vertical integration beginning at the level of either gene, RNA, or protein. Rather, it is becoming clear that the emergent, integrative behaviors of biological systems result from complex interactions both within and across different hierarchical levels of biological organization. As but one example, RNA (Morris, 2011), proteins (Licatalosi and Darnell, 2010), electrical and mechanical function at the level of cells and tissue (Dolmetsch, 2003; Barlow et al., 2006; Gundersen, 2011), and external cues including environmental factors (Jaenisch and Bird, 2003; Laird, 2010) can all feed back to regulate gene expression. This coupling between different levels of biological organization has been referred to by Noble (2006, 2008, 2009) as upward and downward pathways of causation. A consequence is that function cannot be considered to reside at any one level. Rather, function arises from the integrated behavior of the overall system, a concept that can be viewed as a new "central principle of biology."
It is easy to say that the failure of a model to predict experimental outcomes should drive model refinement and further experimental testing. However, in many instances discrepancies between model predictions and experimental outcomes may suggest that additional biological processes must be incorporated into the model, or that some existing model component needs more precise characterization. Doing so may require the development of new experimental approaches, which can often be a daunting task. An example from the field of cardiac electrophysiology was the realization that accurate characterization and modeling of voltage-dependent membrane currents underlying electrical excitability of cardiac myocytes was not going to be possible using intracellular recordings obtained from small tissue samples (circa 1960s-1970s) due, among other things, to an inability to space-clamp cells and to prevent accumulation of potassium in the extracellular space during repeated stimulation. Rather, this characterization required that recordings be performed in isolated myocytes, a technique that was not perfected until the early 1980s. It is also easy to say that experimental results must drive the refinement of models. However, this can often be extremely challenging when models incorporate many different interacting biological processes. It is not enough that the revised model explain the new experimental data; it must also be shown that as models are refined, they continue to reconstruct the legacy data on which they were based. Techniques such as sensitivity analysis (Marino et al., 2008) or use of minimization algorithms to estimate model parameters that yield a best fit to large constraining data sets (Lillacci and Khammash, 2010) can sometimes help. However, there is no guarantee that these methods will converge to a global minimum when dealing with high-dimensional models.
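The convergence problem noted above can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the one-parameter model sin(k t), the synthetic "data" generated with k = 2, and the simple derivative-free local search are hypothetical stand-ins, not the methods of the cited studies. The point is only that a local minimizer started far from the true parameter can stall in a local minimum of the fitting cost, which is why several starting points are often tried.

```python
import math

# Hypothetical toy model and synthetic data (assumptions for illustration only).
TIMES = [0.25 * i for i in range(81)]          # sample times from 0 to 20
K_TRUE = 2.0                                   # parameter used to generate the data
DATA = [math.sin(K_TRUE * t) for t in TIMES]

def cost(k):
    """Sum of squared errors between the toy model sin(k t) and the data."""
    return sum((math.sin(k * t) - y) ** 2 for t, y in zip(TIMES, DATA))

def local_search(k0, step=0.1, tol=1e-8):
    """Derivative-free descent: accept a neighbor if it lowers the cost,
    otherwise halve the step. It can only find a nearby (local) minimum."""
    k, c = k0, cost(k0)
    while step > tol:
        for candidate in (k - step, k + step):
            candidate_cost = cost(candidate)
            if candidate_cost < c:
                k, c = candidate, candidate_cost
                break
        else:
            step /= 2
    return k, c

def multi_start(starts):
    """Run the local search from each start and keep the lowest-cost result."""
    results = [local_search(k0) for k0 in starts]
    return min(results, key=lambda r: r[1]), results

if __name__ == "__main__":
    best, all_results = multi_start([1.0, 1.5, 2.1, 3.0])
    for k, c in all_results:
        print(f"converged to k = {k:.4f} with cost {c:.4f}")
    print(f"best estimate: k = {best[0]:.4f}")
```

In this sketch only the start near the true value recovers it; the start at k = 3 remains trapped far from k = 2. Multi-start search reduces this risk but does not remove it, and the number of starts needed grows quickly with the number of parameters.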
These general aspects of physiological modeling are certainly grand challenges. However, there are also emerging research areas that deserve special mention. Noting these areas is not intended in any way to limit the scope of articles that are appropriate for this journal. Rather, these are the author's views regarding grand challenges that cut across many different areas of computational physiology and medicine.

3D Systems Biology
There is compelling evidence that spatial co-location of proteins that interact with one another over nanometer length scales to perform biological functions is a common biological design motif. These regions of co-location are often referred to as "nano-domains." One example is the process of calcium-induced calcium-release (CICR) in the cardiac myocyte (Winslow and Greenstein, 2011). The structural basis of this process is that voltage-gated L-type calcium (Ca) channels (LCCs) in the cell membrane are located in direct opposition to Ca-binding Ca release channels (ryanodine receptors, RyRs) in the closely opposed junctional sarcoplasmic reticulum (JSR) membrane. These regions of opposition are known as dyads, and have depth (the bounding distance between LCCs and RyRs) and diameter of ∼10 and 50-100 nm, respectively. Opening of LCCs and influx of Ca triggers opening of RyRs and release of Ca into the small volume of the dyad. As Ca concentration rises, Ca induces inactivation of the LCCs, establishing a coupled feed-forward activation and feedback inactivation process. Given the spatial co-location of LCCs and RyRs within this nano-domain, they function as a coupled unit, with their interaction being mediated by tens of Ca ions, making it a noisy process. They are an example of a "protein machine" (Alberts, 1998), and their functional behavior at the nanoscale level ultimately leads to ensemble behavior at the macroscopic level that drives contraction of the heart. Even subtle disruptions of the spatial arrangement of RyRs and LCCs can have significant impact on CICR at the cellular level (Tanskanen et al., 2007). Such changes occur in heart failure, likely contributing to altered function of this protein machinery, and understanding of this altered function will be a key step forward in developing new therapeutic approaches to treating heart failure (Anderson and Mohler, 2009).
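Why signaling by tens of ions is noisy can be seen from a rough counting argument, sketched below. The sketch assumes, purely for illustration, that each of N ions is independently bound to a receptor site with the same probability p, so the bound count is binomial; this is not a model of the dyad, and the function names and the value p = 0.5 are hypothetical. Under that assumption the relative fluctuation (coefficient of variation) of the bound count is sqrt((1 - p) / (N p)), which shrinks only as one over the square root of N.

```python
import math
import random

def analytic_cv(n_ions, p_bound):
    """Coefficient of variation (std / mean) of a Binomial(n_ions, p_bound) count."""
    return math.sqrt((1.0 - p_bound) / (n_ions * p_bound))

def simulated_cv(n_ions, p_bound, trials=400, seed=0):
    """Estimate the same quantity by direct sampling with a fixed seed."""
    rng = random.Random(seed)
    counts = [
        sum(rng.random() < p_bound for _ in range(n_ions))
        for _ in range(trials)
    ]
    mean = sum(counts) / trials
    variance = sum((x - mean) ** 2 for x in counts) / (trials - 1)
    return math.sqrt(variance) / mean

if __name__ == "__main__":
    for n in (20, 2000):
        print(n, round(analytic_cv(n, 0.5), 4), round(simulated_cv(n, 0.5), 4))
```

With p = 0.5 the analytic relative fluctuation is about 22% for 20 ions and about 2% for 2000 ions, which is the sense in which a process mediated by tens of ions cannot be treated as deterministic.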
Other examples of spatially dependent protein interactions within nano-domains abound. Lipid raft Ras nanoclusters with diameter of 6-12 nm, consisting of active Ras, the MAPK module and scaffolding proteins form transiently (∼400 ms) to produce a digital burst of active ERK (Harding and Hancock, 2008). Ca signaling within nano-domains localized to dendritic spines is known to activate downstream effectors such as CaMKII and cAMP-dependent pathways (Higley, 2008). To date, systems biology has largely explored "flat," graph-based patterns of component interconnections. However, the above examples show that to understand integrated systems behavior, we must consider how interacting networks of genes, proteins, membranes, and filaments are arranged in 3D space, how this determines their interaction dynamics, and how these spatial relations are altered in disease. A grand challenge is to develop methods for modeling the function of these protein machines at the nanoscale level, and mathematical and computational approaches for simplifying these models so that they may be incorporated within a multi-scale modeling framework, as described below.

Multi-Scale Modeling
Multi-scale modeling refers to the process of modeling physiological function across multiple scales of biological organization. The power of multi-scale modeling is that it can provide insights into how system properties at the molecular level map to function at a more macroscopic level. Thus, it has the potential to form the long-sought bridge between genotype and phenotype. Application areas include modeling the electromechanical function of the heart from ion channels to cells and organ (Trayanova and Rice, 2011), angiogenesis (Qutub et al., 2009), tumor growth (Hatzikirou et al., 2011), and bone remodeling (Webster and Müller, 2011), to name a few. Without question, the most ambitious effort in this area is the Virtual Physiological Human project (Viceconti et al., 2008), the goal of which is to establish a methodological and technological framework for studying the human body as a single complex system.
Multi-scale modeling poses numerous mathematical and computational challenges. Typically, they arise from the different spatio-temporal scales used to model at one level versus another, coupled with the need to combine these scales. In such cases, an appropriate succession of mathematical approximations and computational approaches must be developed in order to model across levels. An example is the work of Tanskanen and Winslow (2006) in developing multi-scale models of CICR. Since signaling in nano-domains is often mediated by small numbers of molecules, well outside the regime of the law of mass action, the starting point of this work was development of a model describing the motion of individual Ca ions in the cardiac dyad. To do this, the Fokker-Planck equation (FPE) specifying the time-evolution of the probability that r Ca ions were at given dyadic positions at time t was solved by discretizing this equation, yielding a large Markov chain modeling ion movement driven by Brownian motion in a potential field. This model was further simplified by assuming that Ca ions were independent and that their binding to buffers, receptors, and membranes was in equilibrium. Conditioning on a configuration of Ca sources (RyRs and LCCs) then reduced the FPE for individual ion locations and numbers to a reaction-diffusion equation. Since Ca diffusion on nm length scales was several orders of magnitude faster than channel gating, Ca ion density in the dyad equilibrated rapidly in each LCC and RyR gating state, reducing the reaction-diffusion equation to a relatively low-dimensional system of ordinary differential equations. Integration of these equations into cell models allowed realistic description of CICR in larger scale tissue and whole heart models. Therefore, the key to this multi-scale modeling approach was a series of careful approximations regarding independence of signaling molecules and successive mathematical approximations based on time-scale separation.
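The time-scale separation step can be illustrated with a minimal sketch. It is not the Tanskanen and Winslow (2006) model: the two-variable system, its parameter values, and the function names below are hypothetical and dimensionless, chosen only to show the idea. A slow gating variable o (open fraction) is activated by a local Ca level c, and c relaxes toward a state-dependent level c_rest + flux * o with a time constant tau that is much shorter than the gating time scale. Replacing c by its equilibrium value removes the fast equation and leaves a single ordinary differential equation for o.

```python
def simulate_full(t_end=10.0, dt=2e-4, tau=1e-3,
                  k_plus=1.0, k_minus=0.5, c_rest=0.1, flux=1.0):
    """Forward-Euler integration of the full fast-slow toy system."""
    o, c = 0.0, c_rest
    trace = [o]
    for _ in range(int(round(t_end / dt))):
        do = k_plus * c * (1.0 - o) - k_minus * o      # slow gating
        dc = (c_rest + flux * o - c) / tau             # fast Ca relaxation
        o += dt * do
        c += dt * dc
        trace.append(o)
    return trace

def simulate_reduced(t_end=10.0, dt=2e-4,
                     k_plus=1.0, k_minus=0.5, c_rest=0.1, flux=1.0):
    """Same system with c replaced by its rapid-equilibrium value."""
    o = 0.0
    trace = [o]
    for _ in range(int(round(t_end / dt))):
        c_eq = c_rest + flux * o                       # Ca slaved to the gate
        o += dt * (k_plus * c_eq * (1.0 - o) - k_minus * o)
        trace.append(o)
    return trace

def max_difference(a, b):
    """Largest pointwise gap between two equally sampled traces."""
    return max(abs(x - y) for x, y in zip(a, b))

if __name__ == "__main__":
    full = simulate_full()
    reduced = simulate_reduced()
    print("final open fraction (full):   ", round(full[-1], 4))
    print("final open fraction (reduced):", round(reduced[-1], 4))
    print("max difference:", max_difference(full, reduced))
```

With these assumed parameters the two traces stay close throughout, because the error introduced by the reduction scales with the ratio of the fast to the slow time constant. When that ratio is not small, the reduction is no longer justified and the full system must be retained.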
Mathematical approaches to multi-scale model simplification are likely to vary from problem to problem, and with the specific modeling approaches employed at each level. For example, the concepts of tunable resolution and fragmentation are used with rules-based modeling, and help to eliminate the combinatorial complexity of network models (Harmer, 2009; Harmer et al., 2010). In addition, multi-scale modeling often requires development of new, fast numerical methods (for example, fast stochastic simulation methods; Xu and Cai, 2008), development of specialized high-performance computing approaches utilizing large-scale parallelization (Markram, 2006; Reumann et al., 2008), and/or specialized processor designs (Christley et al., 2010). Thus, significant challenges remain to be overcome before the use of multi-scale models becomes more widespread. Case studies in new application areas that test new theories, algorithms, and computational methods will be key to building a general approach to multi-scale modeling of physiological system function in health and disease.

Patient-Specific Modeling
The goal of patient-specific modeling (Neal and Kerckhoffs, 2010) is to develop computational models of pathophysiology using data from an individual patient. It has the potential to guide the delivery of treatment tailored to individual needs. Patient-specific modeling is being applied to predicting risk of vertebral osteoporotic fracture (Travert et al., 2011), 3D finite-element modeling of bone grafts (Diederichs et al., 2008), computational modeling of intra-aneurism hemodynamics (Castro et al., 2006), electromechanical modeling of the failing human heart (Aguado-Sierra et al., 2011), and tumor growth (May et al., 2011), among other applications. Most of this modeling is driven by the ever expanding ability to collect high spatio-temporal resolution patient image data that can in turn be used to generate models of the relevant structures. While the field holds great promise, it is important to select applications in which the biological properties that are critical to the predictive value of the modeling can be measured. For example, it is possible to image the geometry and motion of the heart in a patient with non-ischemic cardiomyopathy and then fit a generic finite-element mechanics model to these data. However, it is also known that there is extensive spatial remodeling of the activity of key proteins controlling electrical excitability, Ca dynamics, signaling pathways, and metabolic processes of cardiac myocytes in the failing heart. These features may be important for modeling cardiac electromechanical function in a patient-specific manner. There is as yet no method for measuring this remodeling in the individual patient. For these reasons, testing approaches for patient-specific modeling in carefully selected animal models is likely to be important for assessing the validity of these approaches.

Informatics in Computational Physiology and Medicine
Almost from its inception, the high-throughput genomics community has had a culture of sharing data. Unfortunately, this culture does not yet exist within the broader physiological community. Despite the fact that the National Institutes of Health requires that data obtained in grants with more than $500,000 in direct costs per year be shared, this is almost never done. The barrier is both cultural and technical. Physiological data are diverse and complex, and the software tools for managing these data and the ontologies for describing physiological experiments and data have not yet been developed. Doing so is critically important, as these data, collected at the cost of hundreds of billions of taxpayer dollars, are quite literally being lost. Developing the tools for managing, annotating, searching, and sharing physiological data is essential for the advancement of the disciplines of physiology and medicine, and is necessary for the development of quantitative models of physiological function in health and disease. One first step has been the recent proposal of a minimum information reporting standard for a cardiac electrophysiological experiment (Quinn et al., 2011) and a neuroscience investigation (Gibson et al., 2009).
There have been recent advances in creating software tools for disseminating computational models. Model sharing has been difficult to achieve. It is not sufficient that model equations be published in the peer-reviewed literature because in the majority of instances, models are simply too complex to avoid errors in either equations or parameters. CellML (Beard et al., 2009), the Systems Biology Markup Language (SBML; Hucka et al., 2003), and NeuroML (Gleeson et al., 2010) have been developed as a way of addressing this problem. These languages, which are based on the eXtensible Markup Language (XML), support the description of model equations and parameters in machine-readable form. Of course, CellML, SBML, and NeuroML model description documents must be validated on creation, but once they have been, they can be disseminated in an error-free way and input to a number of different simulation tools for execution (Alves and Antunes, 2006; Keating et al., 2006). A number of different model repositories such as CellML.org and the BioModels Database (Li et al., 2010) have been developed that allow users to search for and download XML model descriptions. Currently, these model description languages are able to capture biological systems models that may be cast in the form of reaction networks. Developing methods by which these model