A New Approximation Approach for Transient Differential Equation Models

Ordinary differential equation (ODE) models are frequently applied to describe the dynamics of signaling in living cells. In systems biology, ODE models are typically defined by translating relevant biochemical interactions into rate equations. The advantage of such mechanistic models is that each dynamic variable and model parameter has its counterpart in the biological process which potentiates interpretations and enables biologically relevant conclusions. A disadvantage for such mechanistic dynamic models is, however, that they become very large with respect to the number of dynamic variables and parameters if entire cellular pathways are described. Moreover, analytical solutions of the ODEs are not available and the dynamics is nonlinear which proves to be challenging for numerical approaches as well as for statistically valid reasoning. Here, a complementary modeling approach based on curve fitting of a tailored retarded transient function (RTF) is introduced which exhibits amazing capabilities in approximating ODE solutions in case of transient dynamics as it is typically observed for cellular signaling pathways. A benefit of the suggested RTF is the feasibility of self-explanatory interpretations of the parameters as response time, as amplitudes, and time constants of a transient and a sustained part of the response. In order to demonstrate the performance of this approach in realistic systems biology settings, nine benchmark problems for cellular signaling have been analyzed. The presented approach can serve as an alternative modeling approach of individual time courses for large systems in the case of few observables. Moreover, it not only facilitates the interpretation of the model response of traditional ODE models, but also offers a data-driven strategy for predicting the approximate dynamic responses by an explicit function that, in addition, facilitates subsequent analytical calculations. Thus, it constitutes a promising complementary mathematical modeling strategy for situations where classical ODE modeling is cumbersome or even infeasible.


INTRODUCTION
Mathematical modeling is applied in many scientific fields in order to characterize and understand the behavior of dynamical systems. In systems biology, a major aim is the establishment of such models for describing and predicting the behavior of certain processes at the cellular level in living systems. An important area of research is the investigation of cellular signaling pathways because they control the response of cells to external signals and regulate cell division, migration and cell death. Consequently, a dysfunction of these cellular signaling pathways is a major reason for many diseases.
The traditional approach for deriving mathematical models is based on translation of known biochemical interactions by applying rate laws like the law of mass action in its dynamic form [1]. This leads to ordinary differential equation (ODE) models which are termed mechanistic since each dynamic variable and parameter has its counterpart in the described process. Typically, dynamic variables represent concentrations of biochemical compounds, parameters are used for unknown initial conditions and for rate constants.
Mechanistic models facilitate explicit interpretations, e.g., unknown regulators or interactions can be postulated. Another advantage of mechanistic models is the ability to comprehensively translate existing knowledge about molecular interactions. On the downside, all relevant processes have to be included in order to obtain a realistic and unbiased model which is an elaborate task. Since each cell type features a distinct protein expression pattern which might be regulated in a growthand environment-specific manner, the set of relevant molecular interactions is typically unknown. Therefore, proper model structures have to be deduced from experimental data and a large number of parameters have to be estimated. To this end, in order to establish an ODE model a rather large and comprehensive amount of experimental data is a critical prerequisite, even if the major focus of a study is on a few compounds or merely an input-output relationship.
Another disadvantage of ODE models is the fact that they cannot be integrated analytically and that their solutions depend non-linearly on parameters. As this impedes analytical mathematical calculations, sophisticated numerical tools become necessary for their analyses. Up to now, the typical model size in the case of data-based modeling is in the range of 10-40 dynamic compounds and 10-50 dynamic parameters [2]. The currently available methodology might be able to handle models with up to around 10 2 -10 3 parameters or model states. However, the availability of single-cell data as well as the attempt to establish whole cell, tissue, organ, or even whole body models demands for complementary modeling techniques that can be applied for large systems and are capable of integrating different temporal and spatial scales.
In the scientific literature, several approaches for function estimation can be found. On the one hand, there are nonparameteric regression methods that do not rely on the specification of a model structure, for instance smoothing approaches [3] like smoothing splines [4], Gaussian processes regression [5], kernel regression [6] or locally weighted scatterplot smoothing [7]. On the other hand, there are parametric regression approaches that require the specification of a model. Prominent examples are polynomial regression [8], all non-linear regression techniques [9], and regression based on fractional polynomials [10].
In systems biology, some approaches have been suggested that can be applied instead of ODE models in order to describe and investigate the dynamics of biochemical interactions or as approximations of ODE models. Gaussian processes have been proposed, e.g., for learning unknown differential functions [11] or for inference of latent biochemical species [12]. In addition, Boolean models and representations of cellular signaling networks have been introduced [13,14]. Such Boolean models can be transformed into continuous ODEs and thereby serve as approximations [15]. In Liu et al. [16], an approximation of ODEs by dynamic Bayesian networks (DBNs) has been suggested that enables parallelized simulations on graphical processing units (GPUs). Moreover, fuzzy logic models were suggested for modeling of the dynamic response of cellular signaling pathways [17].
In this manuscript, a novel modeling approach is presented that is based on curve fitting of a non-linear explicit function in order to describe the dynamics of biochemical networks. For transient responses as typically observed for cellular signaling pathways, it will be demonstrated that the approximation error, i.e., the difference between the suggested function and an ODEbased model, is much smaller than common measurement errors. The described function can be readily applied to integrate the approximate behavior of ODE systems into multi-scale models. It also offers a possibility to fit input output behavior of a system of interest in a less elaborate manner by a phenomenological, non-mechanistic mathematical description.

METHODOLOGY
In section 2.1 of this chapter, the traditional ODE-based modeling approach is briefly summarized. Then, a transient function (TF) is introduced in section 2.2 which can be applied to describe immediate responses. In order to describe delayed responses as commonly observed for signal transduction models, the retarded transient function (RTF) is introduced in section 2.3 For simplicity, hereinafter, I will not distinguish between mathematical symbols representing scalars or vectors.

Traditional ODE Modeling
The dynamicsẋ(t) of concentrations x ∈ R n x of molecular compounds can be modeled using ordinary differential equationṡ (1) which can be derived from known or hypothesized biochemical interactions based on the rate equation approach [1]. Unknown initial values θ x : = x(0) ∈ R n x and rate constants θ d ∈ R n θ d are termed as dynamic parameters. u denote externally controlled inputs [18] such as the stimulation with a ligand or knockdowns and might either be represented by a scalar or a time dependent function on the right hand side of (1). For all realistic systems biology applications, the ODEs (1) cannot be integrated analytically. Therefore, there exists only a numerical solution that can be calculated by generic solvers for initial value problems, for instance from the SUite of Nonlinear and DIfferential/Algebraic equation Solvers (SUNDIALS) [19]. θ D ∈ R n x +n θ d aggregates all dynamic parameters, i.e., unknown rate constants θ d and initial values θ x . Since the concentrations x(t) of molecular compounds usually cannot be measured directly, observation functions O i (x(t i ), θ O ) containing additional observation parameters θ O ∈ R n θ O such as offsets or scaling factors have to be introduced in order to link data points to dynamic states. In this notation, i = 1, . . . , N data is the index of the data point y i and each data point has its respective observation function O i , measurement time t i , input functions u i and measurement error ε i . If the system is evaluated for several distinct inputs or initial conditions, the ODEs have to be integrated for each combination individually. Hereinafter, these combinations are referred to as conditions.

The Immediate Response Function
After stimulation, typical non-oscillating response curves consist of two major components: a sustained, i.e., saturating permanent contribution, and a temporary fraction. Such immediate responses can be mathematically described by the sum of a sustained term f sus (t) with a time constant τ 1 and a transient term f trans (t) with two time constants τ ′ 1 and τ 2 . Moreover, an offset parameter p 0 is used. The transient function f TF described by (4) has six independent parameters The parameters τ 1 , τ ′ 1 have common meanings in terms of time constants, e.g., if the system is stimulated at t = 0, the system has reached half of the maximal sustained response at t = τ 1 .
The second term f trans (t) accounts for transient, i.e., decaying up-or down-regulation with amplitude A trans . The transient response relaxes with a second time scale τ 2 toward zero. The last term, i.e., parameter p 0 represents a constant offset. While both amplitudes A sus , A trans and the offset p 0 can have a negative sign, the times scales are strictly positive, i.e., τ 1 , τ ′ 1 , τ 2 ∈ R >0 . Figure 1A illustrates (4) as well as the decomposition into sustained and transient parts. Function f TF describes immediate responses at time-point zero and can be applied if an immediate response after stimulation at t = 0 is supposed.

The Retarded Transient Dynamics
Depending on the application setting, the transient dynamics of the modeled compounds do not immediately set in at t = 0 but with some delay. I therefore suggest to account for this by applying a non-linear transformation t(t real ) = log 10 (10 t real + 10 T shift ) − log 10 (1 + 10 T shift ) (6) of the real measurement time-axis t real with a single parameter T shift . This transformation is of the following qualitative form: where b denotes the basis of the logarithm. In addition, the two constants c 1 : = b T shift and c 2 : Rewriting the transformation (6) as (7)- (9) shows that the suggested transformation basically corresponds to a linear transformation of the time axis at the log-scale. As logarithm base, b = 10 in (6) is chosen. The transformation (6) is illustrated in Figure 1B and shows that the time shift T shift in the suggested parametrization determines the location of the kink. This position of the kink is the time point for the real time axis t real , where the response time becomes linear, i.e., where the immediate response (4) kicks in. Thus, it can be interpreted as the response time.
The curvature of the kink in the transformation (6) is depend on t real and therefore, the transformation is not invariant with respect to unit transformation t real → const. × t real (10) of the time axis. Thus, (6) is modified by rescaling with the observed time interval t range = max(t real ) − min(t real ) to the time interval [0,10], as depicted in Figure 1B. The transformation of the time axis, then finally reads t(t real ) = log 10 (10 t real ×10/t range +10 T shift )−log 10 (1+10 T shift ). (11) Plugging the non-linear time transformation (11) into the immediate response function (4) yields the retarded transient function with θ RTF : = {A sus , A trans , τ 1 , τ ′ 1 , τ 2 , p 0 , T shift } which will be applied in the following to approximate solutions of ODEs which in the systems biology setting often have transient temporal shapes. Figure 1C illustrates how the shape of f (t real , θ RTF ) depends on the response time parameter T shift : the larger the response time parameter T shift , the later the response. The colors coincide with Figure 1B. Finally, Figure 1D illustrates the transient and sustained parts of the response for T shift = 2.

Fitting to ODE Solutions
In systems biology, ODE modeling is currently the state-ofthe-art approach for the mathematical modeling of regulatory networks. The RTF approach (11), (12) can complement the ODE approach in two ways: first, by applying the RTF approach to already existing ODE models in order to provide an approximating mathematical representation which is easy to handle computationally, for instance for integration into a multiscale model. Second, the RTF can be used instead of an ODEbased model. For both purposes, it is important to assess the abilities of the RTF for approximating ODE solutions.
The abilities of the RTF (11), (12) for approximating the dynamics of ODE models are evaluated by fitting the function to a component x j of the ODE solutions. For this task, leastsquares estimation (13) was performed. Moreover, the root-mean-square error is calculated. Finally, the scaled RMSE is interpreted as an approximation error.

Fitting to Data
For experimental data, the parameters of the transient are estimated by maximum likelihood estimation. For data with independently distributed Gaussian errors with variances σ 2 i , the likelihood is given by Frontiers in Physics | www.frontiersin.org Then, the negative log-likelihood is (18) As a result, maximum likelihood estimation coincides with least-squareŝ where the least-squares objective function (21) is minimized for parameter estimation.

Reducing Overfitting
If a small number of data points is available, it is essential to prevent overfitting. Overfitting is excessive adaptation of the model and its parameters to measurement errors [20]. In the following, three strategies for diminishing the overfitting problem are suggested.

Parameter Constraints
To prevent overfitting which is recommended if a limited amount of data is available, the same time constant τ 1 ≡ τ ′ 1 for induction of the transient response and for the sustained response can be assumed. We used the resulting five-parameter transient function TF in Lucarelli et al. [21] previously for analyzing the transcription of TGFβ target genes. In that application, the analyses were performed for pairs of time-courses for treated and untreated cells and the resulting time dependencies were used for clustering of genes.
Moreover, it is possible to specify bounds for the parameters as a restriction to reasonable ranges. These bounds should be independent of physical units, i.e., they should be invariant under scaling of the data or sampling times. Technically, bounds are also required for making reasonable initial guesses for the numerical optimization of equation (21). Moreover, setting parameter bounds strongly improves the convergence behavior in the course of numerical non-linear least squares optimization [22].
Assuming that the sampling times t real 1 , . . . , t real N of the data points used for fitting were reasonably chosen, the minimal sampling interval min i (t real i+1 − t real i ) can be utilized to define a lower bound for the time scales. In addition, the amplitudes can be restricted based on the range (max(y) − min(y)) of the data points y = {y 1 , . . . , y N }. Likewise, the data offset can be constrained by the minimal and maximal data point. If overfitting is a minor issue, these bounds can be relaxed, e.g., if existing ODE models are approximated and these bounds turn out to be too stringent.

Model Reduction
The second strategy for diminishing overfitting is reduction of the number of fitted parameters in an automatic data-driven manner. This means that the complexity of the model is reduced because the available data can be sufficiently described with a simplified model. Such a data-driven simplification of the model equations has been termed model reduction [23].
For model reduction of the RTF and, in turn, decreasing overfitting issues, the likelihood ratio test is utilized. The likelihood ratio test is a well-established statistical test for assessing whether additional parameters yield a significant improvement [24]. The likelihood ratio test is interpreted in a reverse manner as it is commonly done in backward elimination procedures [25]. As test statistic, the log-likelihood ratio of two models is evaluated which in my notation corresponds to where χ 2 (1) denotes the least-squares objective function of the reduced model and χ 2 (2) denotes the objective function of the full model.
Here, the following backward elimination procedure is suggested: 1. Testing whether there is time retardation, i.e., if T shift parameter is significantly different from the lower bound. If not significant, T shift is set to the lower bound which is T shift = −2 by default. 2. Testing whether the model is in agreement with a constant. If not significant, we set T shift = τ 1 = τ 2 = A sus = A trans = 0. 3. Testing whether the offset p 0 is significantly different from zero. If not significant, we set p 0 = 0.
Application-specific prior knowledge can be utilized on top of this. As an example, if the data is pre-processed and the minimal value is subtracted, the offset parameter p 0 might be omitted in advance. If the responses are, furthermore, known to be strictly positive, the lower bounds of the amplitudes A trans and A sus can then be set to zero.

Enhancing and Attenuating Responses
The RTF according to (12) allows amplitudes A sus and A trans with positive or negative signs. In many application settings, however, it might be a reasonable assumption, that the sustained and transient part of the response have the same direction, i.e., are both either enhancing, or both attenuating. This presumption also reduces the amount of overfitting and often yields much more realistic outcomes. It can be achieved by restricting both amplitudes to positive numbers and introducing a sign-constant. If such a model is fitted, two optimization runs are required for both signs and the better fit is selected.
Since the sustained and the transient part of the response are usually triggered together, τ 1 ≡ τ ′ 1 is assumed by default. The times t ′ = 10 t real /trange denote the rescaled experimental measurement times to the interval [0, 10].

Implementation
The RTF approach can be implemented in any modeling or curve fitting software based on equations (11) and (12). An implementation of the presented RTF approach is provided within the Data2Dynamics (D2D) modeling environment [26] which is freely available. In order to guide custom implementations, some important technical aspects of my implementation are discussed in the following.
Since the transient function is non-linear, −2 log L(θ ) is not convex and local optima can occur. Therefore, multi-start optimization [27] is suggested and has been performed to find the global optimum, i.e., the best possible fit. For fitting the six parameters of the RTF, ten random initial guesses were drawn which turned out as sufficient for finding the global optimum based on visual inspection of the outcomes.
In order to fit responses with positive and negative signs, two subsequent fits are required. Moreover, model reduction as described in section 2.6.2 might be desired. For multi-start optimization, both tasks have to be performed for each initial guess. In D2D, this functionality has been implemented in arFitTransient.m. If D2D's standard multi-start optimization function is intended to be utilized, the standard function in Advanced/arFits.m has to be replaced by TransientFunction_library/arFits.m by modifying Matlab's search path properly.
Further implementation details and examples for applying the RTF approach are provided at the website of the D2D github repository https://github.com/Data2Dynamics/d2d/wiki/RTF.

RESULTS
In this chapter, in section 3.1 it is first demonstrated that the retarded transient function (RTF) can accurately describe the dynamics of ODE models as they commonly occur in systems biology. Specifically, two aspects will be addressed. First the abilities for accurate approximations are investigated and proven which is a prerequisite for applying the RTF as mathematical modeling approach directly to data. Second, it will be demonstrated that established ODE models can be approximated if required, e.g., as components of a multi-scale model, which might require explicit functions for technical reasons.
In section 3.2, the RTF approach is then evaluated and compared with ODE models for predicting the dynamics from measurements. For this purpose, simulated data is generated and the accuracy of the estimated time courses is assessed by the comparison to the known dynamics ("ground truth").

Approximation of ODE Models
The dynamics of biochemical interactions can be translated into ODE models in order to describe the time-dependency of the concentrations of molecular compounds. In this publication, the accuracy of the RTF approach for modeling of such time courses is assessed based on a subset of nine benchmark models [2] that do not comprise discontinuities, i.e., so-called events [28]. The ODEs published in the original papers are application specific special cases of (1). dynamic parameters and describes eleven dynamic variables in seven experimental conditions. 9. Finally, the Swameye model [37] has been published to demonstrate shuttling of STAT5 from the nucleus back to the cytoplasm. This model has ten dynamic parameters and contains twelve dynamic variables in a single experimental condition. It describes the activation and dimerization of STAT5 upon EPO treatment as well as the translocation of dimers between cytosol and nucleus.
The retarded transient function (RTF) was fitted to the dynamics of the ODE solutions of all these models resulting in 797 fits in total. In addition to a single fit using the initial guess specified in Table 1, multi-start optimization with ten random initial guesses was performed for approximation errors larger than 5%. Figure 2 shows representative examples in panel (A) as well as worst case scenarios (B) and the overall distribution of the approximation error. In 83.8% of all evaluated cases, the approximation error, i.e., the normalized RMSE (15) between RTF and x(t) is smaller than 5%. In 58.3% of cases the approximation error is even smaller than 2% and the difference between the RTF approximation and the ODE solution is hardly visible.
In order to assess the quality of the RTF approximation, it should be kept in mind, that accuracy and reproducibility of experimental data is a major limitation and characteristic of cell biology. This is due to the fact that living cells with individual and unique histories and attributes are investigated. Depending on the cell system, the experimental conditions and the measurement technique, relative errors of 10-20% are common [38]. Moreover, in many applications a minimum concentration change by a factor of 1.5 is requested as a minimum effect size with biological relevance. In light of these aspects, the RTF approach exhibits a convincing performance for describing the major shape characteristics.
All 797 fits are available as Supplementary File 1 to demonstrate the abilities of the RTF approach for the approximation of the typical dynamics of cellular signaling models. Since a few cases where optimization of the RTF parameter did not converged to the global optimum are suspected, the agreement could even be improved by increasing the number of random initial guesses. Moreover, the bounds specified in Table 1 could be relaxed in order to further improve the agreement between ODEs and RTF. However, since this analysis is used as a foundation for the analysis of experimental data, the same bounds as in the next section were chosen at this point.
In some few cases, the RTF approach has limited performance, e.g., for time courses with multiple response peaks. Panel (B) shows five worst case scenarios resulting in approximation errors between 9% and 11%. For the second plot, "Bachmann EpoRpJAK2 Conditon1, " the ODE solution has two distinct response peaks, but the fitted RTF can only describes the first peak. The ODE model shown in the third plot of panel (B), "Crauste EarlyEffector Condition1, " is initially zero and then shows a rather quick and strong peak around t = 10 min. This peak is again only roughly approximated by the RTF with an overall approximation error of around 10%.
The forth and fifth plots in panel (B), "Swameye npSTAT Condition1" and "Swameye pSTAT Condition1, " exhibit a second delayed response which cannot be captured by the RTF. For pSTAT activation "Swameye pSTAT5 Condition1, " the RTF approach fails to describe an initial fast and short activation. This result could be improved by using additional prior knowledge, i.e., by utilizing the fact that prior to stimulation, pSTAT is known to be zero and only positive response amplitudes A sus and A trans are biologically reasonable. All trajectories of the Swameye model contain a second, more or less pronounced response which originates from the shuttling of STAT5 through the nucleus and cannot be closely approximated by the RTF approach. In the original publication [37], this shuttling process has been described by a delay differential equation (DDE), i.e., a model which is not within the primary scope of this paper. Later, the delay was approximated in Schelker et al. [18] by an ODE model utilizing the so-called linear chain trick [39]. This model version was published in Hass et al. [2] as a benchmark model and therefore analyzed in this manuscript. Taking these circumstances into account, it is not surprising that the RTF exhibits limited capabilities. On the other hand, this model nicely exposes some limitations of the RTF approach.

Fitting of Data
In the previous section, it has been demonstrated that in most cases the presented RTF approach can accurately approximate the dynamics of pathway models based on ODEs. In the current section, the capabilities for the fitting of data is evaluated. Compared to traditional data-based ODE modeling, this approach is a complementary and convenient modeling strategy especially for cases where either too little data is available for building a comprehensive network model, or in situations where only a single or a few outputs are of interest. FIGURE 2 | The dynamics of 797 numerical ODE solutions x(t) from the nine different benchmark models were fitted by the RTF in order to assess and illustrate the capacity of the RTF (11), (12) for approximating ODE solutions (black lines). The underlying equations of the ODE models originate from the original publications and were taken from the benchmark collection [2]. The fitting has been performed based on (13), (18)- (21). The histogram in the center shows the distribution of the resulting approximation errors (15). These errors are also plotted as gray bands and their magnitudes are indicated in the figure legends. In more than half of all fits, the difference between both approaches is hardly visible. For the other cases, the RTF most of the time describes the dynamics qualitatively. (A) Shows typical scenarios for approximating ODE solutions by the RTF. Limitations of the approach are pointed out in (B). Here, five representative fits with large approximation errors are plotted.
In systems biology, the availability of experimental data is usually strongly limited because analysis of living cells is elaborate and expensive. Time course data, as an example, rarely comprise more than ten time points. However, the dynamics is usually evaluated for several "conditions, " e.g., for multiple stimulation strengths and/or different genetic backgrounds. These experimental conditions correspond to distinct initial conditions or parameters in the ODE model. The Bachmann model, as an example, contains 23 experimental conditions. These conditions represent different combinations of 14 doses of ligand treatment, with/without Actinomycin co-treatment of wild type cells or cells where CIS or SOCS3 is over-expressed. In order to evaluate the abilities of the RTF approach on a wide variety of possible dynamic shapes, all combinations of these conditions were evaluated, i.e., all 14 EPO doses for all eight wild-type/over-expression settings.
For each of the resulting 112 conditions, a single time course data set was simulated with 10 equidistant different time points between 0 and 100 min for one out of all 25 dynamic variables x. In total, this results in 2,800 simulated data sets which are individually fitted using the ODE model and the RTF. As measurement error, a typical relative error of 10% was assumed. If 10% was below 1e-5, the standard deviation was set to 1e-5.
For fitting the ODE model, the true parameter values that were used for simulation were also taken as initial guess. Moreover, a local multi-start optimization with ten different starting points around the true parameters was applied. For this purpose, each parameter vector was randomly perturbed by adding uniform numbers from the range [−1, 1] at the log 10 -scale. Figure 3 shows typical outcomes for the ODE model and the RTF approach in panel (A). RMSEs (14) between the fitted time course and the time course for the true underlying ODE model were calculated for both approaches for the fine time grid with 300 time points which was also used for plotting the solid lines. In order to obtain comparable numbers for all fits, the RMSEs were divided by the mean measurement error over ten simulated data points. Histograms and boxplots of the scaled RMSEs are displayed in panel (B) and (C). The ODE model, which has been used to generate the data, shows a slightly superior performance, i.e., smaller RMSEs. The median of the scaled RMSE is 0.465 for the ODE model and 0.608 for the RTF. If constant time courses with RMSEs close to zero are omitted, the median is 0.662 for FIGURE 3 | (A) Shows six typical outcomes for fitted RTF models (blue lines) (11), (12) and ODE models (brown lines) in order to compare the outcomes of both modeling approaches for fitting of data. Fitting has been performed based on (13), (18)- (21). The black lines represent the solutions of the ODE equation models which are taken from Hass et al. [2] for published parameter values and presumed as true dynamics for the simulation of the data. In most cases, both approaches provide estimated curves which are close to the true dynamics and the performances coincide. Moreover, there are some examples, where either one or both approaches provide biased outcomes. (B) Shows the resulting RMSEs (14) as histograms. All values above 10 are plotted in the rightmost bin. On average, the ODE approach results in smaller RMSEs but the RTF is only slightly inferior. This qualitative outcome is confirmed by the boxplot shown in (C). It shows that the differences between the individual fits are much larger than the average performance loss of the RTF.
the ODE model and 0.926 for the RTF approach. In both cases, the median performance benefit of the ODE model is by a factor of 20-40 smaller than the differences between individual data sets/time courses because the standard deviation of the RMSEs over all fits is 5.26 for the ODE model and 5.44 for the RTF.
For the same analysis, Figure 4 shows the scaled RMSE as a scatterplot. The RMSEs of both approaches are correlated. All points above the diagonal have larger RMSEs for the RTF approach. For all points below the diagonal, the RTF approach outperforms the ODE model. Figures 4A-J illustrate extreme cases, i.e., scenarios where both approaches have small or large RMSEs, or fits where one approach strongly outperforms the other. This illustration, again, confirms that the average difference in performance is much smaller than the difference between individual fits.
Supplementary File 2 provides the results for all 2,800 fits of the RTF without the true dynamics and without the fitted DE model. This file can be inspected in order to obtain an unbiased impression about the fitted dynamics of the RTF approach. According to the subjective appraisal of the author, the results are very close to solutions which would be guessed by an experienced user. In Supplementary File 3, the same fits are plotted together with the true and fitted ODE solutions which illustrates the underlying dynamics as well as the results of both modeling approaches.
Nevertheless, on average the ODE model proves to be advantageous due to the knowledge and utilization of the model structure which has been used to generate the data. This benefit, however, is a factor of 20-40 smaller than the difference between individual fits and might therefore be acceptable in practice. Moreover, in real application settings the advantage of the ODE model is even smaller because in this case the underlying regulatory network is unknown and is also an imperfect approximation of the real biological process. Based on these findings, the RTF approach seems to be a very promising complementary mathematical modeling strategy for settings where comprehensive data, which is required for establishing a comprehensive pathway model, is not or not yet available.

DISCUSSION AND SUMMARY
Mechanistic ordinary differential equation (ODE) models are the state-of-the-art approach for modeling cellular processes because models can be defined by translating interactions between cellular compounds. The Biomodels database [40,41], FIGURE 4 | Comparison of both, the fitted RTFs (11), (12) and the fitted ODEs from Bachmann et al. [29] to simulated data with published parameter values. Again, fitting has been performed based on (13), (18)- (21). The scatterplot in the middle shows RMSEs (14) between fitted and true dynamics (black lines) for both modeling approaches. For all points above the diagonal, the ODE model outperforms the RTF. However, there are also several examples where the RTF result in smaller RMSEs. Ten fits for extreme scenarios are plotted in (A-J).  for instance, currently contains 762 ODE models, i.e., more than 83% of the models which are assigned to a modeling approach are classified as an ODE model. Since for such models each model component has its counterpart in the real biological process, this approach provides strong abilities for interpretations and also for predictions of the system's behavior.
In the presented study, the ODE model has been assumed as the truth, e.g., data has been generated based on published ordinary differential equation models. Hence, ODE modeling has been considered as gold-standard and the fitted outcomes of the ODE models have to outperform alternative modeling approaches since all available knowledge, i.e., model structure and data is entirely exploited.
One drawback of ODE-based models is the need for rather large and comprehensive amounts of experimental data for calibration of the unknown model parameters. This is true even if only a few, specific components are of interest, for example the input-output behavior of a signaling pathway. Moreover, elaborate model discrimination procedures are required in order to find a proper model structure. In addition, the models typically increase in size because all relevant states, compounds and interactions have to be considered. Model reduction [23,42,43] is a non-trivial task and since usually all dynamic states are coupled, reduced representations describe the dynamic behavior only in an approximate manner.
Some alternative modeling approaches have been suggested as a replacement for ODE models, e.g., if the system is too large, if only a specific input-output behavior is of interest, or if the amount of available data is too small. However, all these approaches have serious limitations. Logical models, as an example, describe the input-output behaviors only in a discrete manner. Constraint-based models neither provide unique predictions nor statistical confidence intervals. Traditional functional estimation methods, like smoothing splines or polynomials are not tailored to the shape of typical temporal responses. Such approaches require densely sampled time-courses in order to provide realistic curves. Moreover, tuning parameters that control the amount of smoothing are difficult to choose. Therefore, these approaches are not yet available as applicable tools for the modeling of the dynamics of cellular compounds.
In contrast, the retarded transient function (RTF) approach suggested in this publication is tailored to typical dynamic response curves observed in cellular signaling pathways. Some basic differences between the RTF approach and ODE-based models are summarized in Table 2. The parameters of the RTF can be nicely interpreted as response time and amplitudes and time constants of a transient and a sustained response fraction. Based on nine benchmark models, it was possible to show that the suggested function can accurately describe the dynamics of solutions of ODE models. Moreover, for fitting simulated data, it could be shown that in terms of root-mean-squared-errors the fitted RTF is almost as good as the true underlying ODE model. Figure 5, the presented implementation allows for fast and simple evaluations of the systems dynamics. First, the RTF is fitted to either an existing ODE model or to experimental data. In a second step, the fitted functions are translated to executable Matlab functions. Those functions, then, can be conveniently called for evaluating the time dependency of the concentration of a dynamic compound. This facility is very valuable for integrating mathematical models of cellular processes into multi-scale models, i.e., computational models describing the behavior at the tissue-, organ-or even whole body level based on the processes at the level of individual cells which is currently a major methodological challenge in systems biology [44,45].

As summarized in
Of course, the presented approach also has its limitations. On the one hand, the RTF approach does not provide mechanistic insights. On the other hand, the scope of dynamic shapes is restricted. By construction, the RTF cannot describe several peaks or oscillations and the RTF approach should, therefore, only be applied if such features do not occur or do not play a prominent role. If the RTF is applied for the approximation of existing ODE models, the performance can be evaluated and tested, e.g., in terms of the RMSE. The RTF might then turn out to be inappropriate for ODE models which exhibit an oscillating behavior, delays, or discontinuous dynamics subsequent to multiple perturbations. If the RTF approach is applied for the fitting of data, the measurements should neither be too sparse nor should they be sampled at improper time scales in order to reduce the risk of biased estimation or that important dynamic features are not captured by the model.
The problem of biased estimation and the risk for wrong predictions, however, is a general issue of all modeling approaches, especially in the case of small amounts of experimental data. In my analysis, the median performance loss was 20-40 times smaller than the standard deviation between different fits. The presented RTF approach tends to produce smooth outcomes because only a single maximum/minimum can be captured. This seems to be a reasonable solution for the general trade-off between bias and variance, since the predicted dynamics tend to be plain instead of hypothesizing intricate temporal behavior.
Some tasks remain and are intended to be tackled in the future. First, a theoretical justification for the amazing abilities of the suggested RTF approach is desirable. For this purpose, the class of ODE systems which are exactly solved by the RTF or where the RTF occurs as approximate solution, e.g., for calculations of perturbations, has to be derived and linked to classical rate laws.
Another task consists of linking multiple experimental conditions which originate from different initial conditions, e.g., doses of a ligand. In the current formulation, each initial condition is treated independently and fitted RTFs has distinct and unrelated parameters. Instead, one could describe the dose dependency of individual parameters of the RTF by other explicit functions, e.g., by hill equations which are frequently applied for describing dose dependencies in systems biology [46]. This would then allow for predictions of the time and dose dependency using explicit functions. As a final task, the potential of the RTF approach for multi-scale modeling has to be investigated in order to evaluate its applicability and suitability.

CONCLUSION
A major goal of systems biology is the establishment of mathematical models that describe the behavior of living cells at the level of molecular compounds. Here, the retarded transient function (RTF) has been introduced as a curve fitting approach which is tailored to dynamic modeling of cellular compounds. It has successfully been shown that the approach both provides accurate estimates of the underlying dynamics and facilitates valuable interpretations and, thus, serve as a valuable complement of the traditional ODEbased modeling.

DATA AVAILABILITY STATEMENT
The analyzed benchmark models are available at https://github.com/Benchmarking-Initiative/Benchmark-Models including the analyzed data. The analyses were performed using the Data2Dynamics (D2D) modeling environment which is available at https://github.com/Data2Dynamics/d2d. In D2D, the RTF approach is available including toy model examples in the download folder Examples/ToyModels/TransientFunction. Basic functions including the analyses made for the paper are described in the online wiki of the D2D framework https://github.com/Data2Dynamics/d2d/wiki/RTF.

AUTHOR CONTRIBUTIONS
CK designed the approach, performed the analyses, and wrote the manuscript.