A Robust and Efficient Numerical Method for RNA-Mediated Viral Dynamics

The multiscale model of hepatitis C virus (HCV) dynamics, which includes intracellular viral RNA (vRNA) replication, has been formulated in recent years in order to provide a new conceptual framework for understanding the mechanism of action of a variety of agents for the treatment of HCV. We present a robust and efficient numerical method that belongs to the family of adaptive stepsize methods and is implicit, a Rosenbrock type method that is highly suited to solve this problem. We provide a Graphical User Interface that applies this method and is useful for simulating viral dynamics during treatment with anti-HCV agents that act against HCV on the molecular level.


INTRODUCTION
Approximately 71 million people worldwide are affected by chronic hepatitis C viral (HCV) infection, which is the primary cause of liver cirrhosis, liver cancer and liver transplant [1]. Approximately 400,000 people die each year from HCV, mostly from cirrhosis and hepatocellular carcinoma [2]. There is no vaccine for HCV and for more than a decade the standard-of-care of pegylated interferon-alpha (IFN) and ribavirin was suboptima [3].
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
However the recent advent of direct-acting antivirals (DAAs) allows for interferon-free, alloral treatment yielding cure rates exceeding 90% with pangenotypic activity and shorter durations of therapy (8)(9)(10)(11)(12)(13)(14)(15)(16)(17)(18)(19)(20)(21)(22)(23)(24) weeks) compared to IFN-based therapy (24-48 weeks [4]). While these highly effective DAAs are considered one of the greatest achievements in medicine, significant challenges remain for eliminating HCV infection such as finding an optimal approach to current DAA failures, preventing re-infection, identifying all those infected and the high cost of the new DAAs which represents a major barrier to treating the populations that are most affected by HCV [3]. Thus, there exists a continuing need for more affordable therapies, as well as an effective vaccine [5].
Mathematical models are valuable tools for understanding the in vivo serum dynamics of viruses that trigger both persistent infection (e.g., HIV-1 [6][7][8][9], hepatitis B virus [10][11][12], hepatitis D virus [13][14][15], Theiler murine encephalomyelitis virus [16], herpes simplex virus [17] and HCV [18][19][20]) and acute infection (e.g., influenza A [21][22][23] and ebola [24]). Mathematical modeling is also improving our understanding of intracellular viral genome dynamics [25][26][27][28] and the quantitative events that underlie the immune response to pathogens [6,9]. The standard model for HCV kinetics during treatment provided many insights into the effectiveness and mechanism of action of IFN and ribavirin (reviewed in [29,30]). This model has been able to retrospectively predict the duration of treatment needed for HCV eradication (cure) [31][32][33][34][35] and more recently was used in real-time (on treatment) to predict the duration of therapy needed to achieve cure with an IFN-free regimen of silibinin +ribarivin [36]. In the age of DAAs, new models are being developed to meet the challenge associated with these new agents [37][38][39]. Notably, the first age-based multiscale mathematical model for HCV kinetics has been developed [25,40,41] providing a more comprehensive understanding of viral treatment response kinetics observed in patients treated with IFN, HCV protease inhibitors (telaprevir and danoprevir), or the HCV NS5A inhibitor daclatasvir as well as modes of action of these drugs.
The aforementioned model is an extension to the classical Neumann et al. biphasic model [20] that was introduced in 1998 and treated the infected cell as a "black box," producing virions but without any consideration of the intracellular viral RNA replication and degradation within the infected cell [26,27,42]. The biphasic model is a set of three ordinary differential equations (ODEs) with three variables: uninfected target cells (T), productively infected cells (I), and extracellular virus in blood (V). The multiscale model considers the intracellular viral RNA as an additional equation for the variable (R), with the introduction of age-dependency and time-dependency, making it a partial differential equation (PDE)model. When this multi scale model is used to study the dynamics of HCV infection under therapy with DAAs it includes both intracellular viral RNA replication/ degradation and extracellular viral RNA (i.e., virus particles) with age-dependency and timedependency. As such, it is considerably more difficult to solve compared to the standard biphasic model. Previously short-term and long-term analytical approximations were derived [25,41,43]. In the short-term approximation, it was assumed that after therapy is initiated the infected cells maintain their steady-state levels of HCV, whereas in the long-term approximation all new infections after the onset of therapy are neglected. While the shortterm approximation has been shown to be precise only in the first half-day of treatment, the long-term approximation is in agreement after several days post-treatment initiation with a simple numerical solution that utilized a canned solver (an ODE solver used in higher level languages such as Matlab and Mathematica, or Python) [41].
In this paper, we provide a robust and efficient numerical method for solving the multiscale model. The goal is to considerably improve the numerical solution presented in Rong et al. [41] that used a canned solver (an ODE solver used in higher level languages such as Matlab and Mathematica, or Python), making the numerical solution a flexible and robust entity alongside the analytical approximations. As it turns out, because of the properties of this multiscale model and the fact that the differential equations are stiff, some advanced numerical methods that involve adaptive stepsize are needed. To begin with, the use of a canned solver should be replaced with a full-fledged solver because of the additional integral introduced in the multiscale model for the variable V that needs to be computed at each time step. Unlike the construction of numerical schemes in other applications, for example in the non-linear diffusion of digital images [44][45][46] where accuracy can be limited, herein it is adviseable to construct a stable and efficient scheme that belongs to the Runge-Kutta family with at least a fourth order of accuracy. However, due to the nature of the differential equations that are stiff and the additional integral that needs to be evaluated at each time step, implicit solvers with adaptive stepsize are considerably more stable and efficient than the standard Runge-Kutta fourth order method. We implement implicit schemes with adaptive stepsize [47] that are highly efficient and stable for use in the multiscale model with age of hepatitis C virus dynamics.
The main contribution of this manuscript is in the use of the Rosenbrock method to solve the pre-existing multiscale model of hepatitis C virus (a system of PDEs), and to provide an open-access graphical user interference (GUI) to solve and simulate the model numerically.
To the best of our knowledge, this is the first numerical simulator with a graphical user interface developed for the multiscale model. In addition, a detailed presentation of the Rosenbrock method is provided, as well as updated results. The paper is organized as follows. Section 2 provides the mathematical background for the model. In section 3, we provide the details of the Rosenbrock method including a step-by-step derivation starting from the standard Runge-Kutta fourth order, describing how we applied it to solve our problem. Section 4 presents and discusses the results achieved with our implementation of the Rosenbrock method, culminating with a description of the multiscale model simulator that we developed and is provided for general use. We conclude the paper with a summary of our results and future work.

The Standard HCV Model
The standard model that has been used and modified for studying hepatitis C viral dynamics is the Neumann et al. model [20]. The three variables this model keeps track of are the target cells T, in Equation (1a), the infected cells I in Equation (1b) and the extracellular virus V in Equation (1c). The target cells T are produced at constant rate s, and decreased by the number of cells infected by virus in blood V at constant rate β and their death rate d. The infected cells, I, increase with the new infections at rate βV(t)T(t) and die at constant rate δ. The virus V is produced at rate p by each infected cell and is cleared at constant rate c. The ε term denotes the effectiveness of the anti-viral treatment that decreases viral production from p to (1 − ε)p. The previously established ensemble of ODEs for this model is: From the mathematical perspective, the model is simple and can be solved analytically by appropriate assumptions.

The HCV Age-Based Multiscale Model
The multiscale model of HCV dynamics has been formulated in recent years [25,41,43] in order to study HCV dynamics in patients and decide among various treatment options. Figure 1 depicts a systematic overview of the model, which is described by the following partial differential equations (PDEs): subject to the boundary conditions I(0, t) = βV(t)T(t), I(a, 0) = Ī(a), R(0, t) = 1, and R(a, 0) = R̄ (a).
The variable I(a, t) for infected cells, that existed in the standard model as simply I(t), and the newly introduced variable R(a, t) for intracellular viral RNA (vRNA) both depend on the age of infection a and the time duration from therapy initiation t. Hence, a and t are two different times, with the use of partial derivatives in Equations (2b)  An important consideration in this model is that the treatment starts after the infection has reached its steady state. The steady states of the different functions are R̄ (a, t), Ī (a, t), V̄ and T. Given N, the total number of virions produced by a cell in its life-span, it was shown in Unlike the standard model, three different antiviral effects of therapy can be simulated in the multiscale model. The decrease in viral RNA synthesis is represented by ε α , the reduction in secretion by ε s and the increase in viral degradation by κ ≥ 1.
Through the method of characteristics, as was derived in Rong et al. [41], an analytical solution was found for the variable R(a, t). The same method was applied to derive a solution for I(a, t). The ensemble of Equations (2) represents the full model. The analytical solutions for R(a, t) and I(a, t) were described as follows: From the system of Equations (2) it can be noticed that computing V(t) necessitates an integral. If a < t, in other words the cell age is younger than the time of treatment, i.e., infection occurs after initiation of treatment, the term I(a, t) of the integral in Equation (2) depends itself on V(t) and T(t) by consideration of Equation (5). As was shown in Guedj et al., Rong et al., Rong and Perelson [25,41,43], this makes the analytical solution for V(t) approximative.

MATERIALS AND METHODS
The mathematical difficulties in deriving the long-term approximation, itself imprecise, hinders the generalization to more complex models. Numerical solutions are time consuming unless an efficient method with an adaptive stepsize is implemented, herein the Rosenbrock method [47]. We present how it derives from the Runge-Kutta family methods and its implementation for solving the model shown in Equation 2.
Runge-Kutta methods rely on computing the weighted average of a small increment from the starting position [48]. Those weights, b i , are predetermined. An advantage of these methods is that because of the use of Taylor series there exist a different set of weights b i * , also known, that allow to compute directly the approximation error. The number of terms needed to compute the next value is called the order S.

A General Introduction to the Rosenbrock Method
In this section we wish to approximate the function y(t) for solving the differential equation Finite-difference methods of higher order are used.
Given a known value of y at time step n, y n , compute the value at the next time step y n+1 by means of f (t n , y n ).

Explicit Runge-Kutta-
The explicit Runge-Kutta family is a generalization of the Euler method to higher order: where and a ij , b i , c i are pre-determined constants. They are usually displayed in a Butcher tableau as in Table 1. It is important to note the limits of the summation in Equation (8), from 1 to i − 1. In this manner, it is straight forward to compute k i at every step. A drawback of this method is the lack of stability for stiff problems [49]. The implicit Runge-Kutta methods are offering a solution to the instability problem.

Implicit Runge-Kutta-
The implicit Runge-Kutta methods are themselves a more stable version of the explicit Runge-Kutta family [48], which is recommended to be used in stiff problems. The equations extend to: and as before, a ij , b i , c i are pre-determined constants. The main difference lies in Equation (11) where the summation is taken from 1 to S. The difference becomes obvious when looking at the updated Butcher tableau ( Table 2). With the table now full, at each iteration a system of S equations with S unknowns needs to be solved. [47] a special case of the implicit Runge-Kutta methods is when all of the values in the Butcher tableau in the upper triangular matrix above the main diagonal are null and all the diagonal elements equal to a single value gamma (i.e., ∀i a ii = γ, Table 3). The value of k i shown in Equation (11) can now be simplified as follows:

The Rosenbrock Method-As shown in Rosenbrock
Additional substitutions [50,51] allow to rewrite the problem as follows: where J n is the Jacobian of f at step n. To evaluate the value y n+1 we now need to compute the values of the g i 's. One can notice how the left term of the equation contains a matrix with the term J n , which is the same for all g i 's. This is one of the key strategies of this method since to solve for g i there is only one matrix to invert per time step. It will be taken advantage of by performing an LU decomposition, as shown in the continuation.
Additionally the error e n+1 can be computed as e n + 1 : constants [50]. Given a threshold on the error, the strategy to decide if the step size must be increased or decreased consists in computing the value Δ n+1 :=e n+1 /(TOL×(y n + f (t n , y n ) + 10 −30 )). If Δ n+1 is smaller than 1 then the step size can be increased, else it must be decreased [52]. A good strategy in the latter case is to recompute the n + 1 value but with a smaller step size.

The Multiscale Model Solution
Since serum HCV RNA is stable in chronic HCV subjects, we assume that the model is at steady state before the start of treatment, as shown in Equation (3) and discussed in Rong et al. [41]. While individuals with recent HCV infection might not be at steady state at initiation of DAA therapy [3], it may be feasible to assume a quasi-steady state with lower pre-treatment HCV RNA levels, as observed in some subjects, which could be adjusted by changing model parameters.
We implemented the multiscale model with the Rosenbrock method of order S = 4. We define: where y n = We note that the term (1 − ε s )∫ 0 ∞ ρ(a)R(a, t)I(a, t) da is constant at each iteration and is computed between each one once. We can now explicitly write the values of the constants and describe the whole scheme. The Equations (13) and (14) become: Note that the computations of g 3 in Equation (19) and g 4 in Equation (20) use the same value of f, which implies that it needs to be computed only once. The error term also disregards the value of g 3 .
The scheme for one iteration is shown in Algorithm 1. An important observation is how the error term e is computed. It derives only from the error induced by the Rosenbrock iteration, not the computation of the integral term (1 − ε s )∫ 0 ∞ ρ(a)R(a, t)I(a, t) da.
The system (I /hγ − J n )g i = f y n + ∑ j = 1 i − 1 γ i j g j needs to be solved for four different values of i. Since the left hand matrix is constant, it is decomposed into its LU decomposition once. The Jama package [53] is used to perform the LU decomposition for solving the system in a highly efficient manner.

RESULTS AND DISCUSSION
A preliminary version of the Rosenbrock method was first implemented in Python3 using the SciPy library, freely available at http://www.cs.bgu.ac.il/~dbarash/HCVnumerics, and then converted to Java for the purpose of developing a user-friendly simulator with a graphical user interface (GUI) that is freely available at http://www.cs.bgu.ac.il/~dbarash/ HCVsimulator. Aside of the Rosenbrock method, the default implementation in SciPy of an ordinary differential solver leverages ODEPACK [54] and is referred to as the canned solver Default. For this entire section, unless stated otherwise, we used the parameters from Rong et al. [41] and shown in Table 4. The parameters that are changed through the results are the number of days, the size h of the steps for the ODEs, and the size h a of the steps for computing the integral. In the case of the Rosenbrock method that utilizes an adaptive stepsize, we bound the stepsize by h as minimum and h a as maximum.

Short-Term Approximation Holds for Half a Day
The short-term approximation [41] is shown in Figure 2 in blue. The results are shown for only 2 days since the behavior is smooth and conserved on longer time scales in agreement with previous results (Figure 2A in [41]). It is clear that after 12 h the value converges far from the PDE solution. This is expected since the effect of the treatment on the infection rate is not taken into account. In practice, most simulations are valuable for more than half a day and are focused on the long-term approximation.

Long-Term Approximation Underestimates PDE
In Rong and Perelson [43] it is hypothesized that the long-term approximation is an underestimate of the PDE model solution since some infection events are being ignored. However, with realistic parameters characteristic of potent therapy, the difference between them is very small. Figure 3 depicts the result of our scheme, measuring the difference between the long-term approximation and the solution of the multiscale PDE model ( Figure  2B in [41]). We show that the long-term approximation is converging and we are indeed obtaining that the long-term approximation is an underestimate of the PDE model solution (as the difference between the long term approximation and the solution of the PDE model is negative). The difference of less than 0.01 that we are obtaining is very small compared to the y-axis units shown in Figure 1 of Rong and Perelson [43]. Therefore, it is possible to view in Figure 3 this small effect that is an improvement over what was inferred in Rong and Perelson [43] regarding the long-term approximation. This finding is reiterated here for completeness and is referred to below in the sub-section about the Rosenbrock method.

The Model Equations Are Stiff
An important distinction of the age-based PDE HCV model equations is their stiffness, or how they can be numerically unstable even with a small stepsize. We have verified that with small time-steps the results are similar for all methods. But as we increase the size of the steps, the instability of the equations can be observed over 20 days. All explicit methods such as the standard version of Runge-Kutta quickly diverge from the solution with small time-steps. On the other hand an implicit Runge-Kutta method tends to the correct solution and the Rosenbrock method does so efficiently as reported below.

The Rosenbrock Method Is Efficient and Stable
As previously mentioned, the integral term (1 − ε s )∫ 0 ∞ ρ(a)R(a, t)I(a, t) da impedes the use of adaptive stepsize when a canned solver is used. Such methods are essential to allow fast computations of complex models like ours. We present in Table 5  Under those conditions a five-fold increase in speed, through the decrease in the number of iterations, provides a much needed advantage when testing large number of parameters over long time periods. Interestingly with the highest value of h, the Rosenbrock method takes additional iterations. This is due to the greediness of the time step adjustment, which when increased too quickly and thereby inducing a large error backtracks and starts again with a smaller value.
Simpler methods than Rosenbrock exist with adaptive stepsize, such as Dormand-Prince [48], Fehlberg (RKF) [55] and Cash-Karp (RKCK) [56]. These are explicit methods that can be easily implemented but are vulnerable to the stiffness of the equations, while Rosenbrock is an implicit method that is indeed found to be more stable. In the following comparisons we demonstrate that the Rosenbrock method is more efficient than the Runge-Kutta 4th order method and at the same time it is more stable than both the Runge-Kutta 4th order and the Dormand-Prince methods.
First, a comparison of the computing time between the Rosenbrock method and the other methods mentioned above is reported in Table 6 with starting time step values of h = 0.001 and h a = 0.01 for 14 days of simulation. Each run with a certain method was repeated 10 times, after which the mean and standard deviation was taken. The Rosenbrock method is over 8 times faster than the standard Runge-Kutta 4th order method, which is a non-adaptive time step method. While Dormand-Prince is an adaptive time step method that is even faster than Rosenbrock, as will be shown below, it is an explicit method that does not always converge to the correct solution. The fastest time of execution for the Dormand-Prince method is not surprising since an explicit method is easier to evaluate computationally than an implicit method, in particular because there is no need for matrix inversions.
Second, a comparison of the methods behavior over 14 days of simulation is presented in Figure 4. All the methods were started with time steps of h = 0.1 and h a = 0.1. The baseline (the default implementation in the SciPy library of an ODE solver) was computed for h = 0.001 and h a = 0.01 since those time step values are small enough to ensure that all methods converge to the same solution. The advantage of the Rosenbrock method in terms of stability and convergence to the correct solution is clearly noticed.
Finally, we also report in Table 7 the sensitivity of the Rosenbrock method. We compute the changed value of the result after 2 days of simulation given a variation of ±10% on 10 parameters. We notice that most of the perturbations are symmetrical. We omit ε α and ε s since they effectively determine the fraction of α and ρ used in the model. The results (Table   7) indicate that the Rosenbrock method is sufficiently robust for our needs.

The Multiscale Model Simulator
We have developed a user-friendly simulator with a GUI for the multiscale model ( Figure 5) that is freely available at http://www.cs.bgu.ac.il/~dbarash/HCVsimulator. The parameters file input is shown in Figure 6. For illustration, two example simulations are provided, one over 2 days (Figure 7) and one over 28 days (Figure 8). In Figure 7 we show how different parameter ensembles can be displayed simultaneously. In that case we chose ε α and ε s to be 0 or 0.99, resulting in three curves. We can observe that increasing the ε s parameter, which decreases the assembly/secretion of intracellular virions, has a strong short term effect that tempers quickly. In contrast the reduction in the synthesis of vRNA, modeled by an increase of the parameter ε α , shows a slower but more efficient effect that exhibits stronger results after half a day. Combining both factors shows a multiphasic viral decline which was observed under treatment with HCV NS5A inhibitors [25]. In Figure 8 we present the difference between the long-term approximation and the Rosenbrock method for those three sets of parameters over longer (than 2 days) treatment durations (e.g., 28 days). Although the highest error is present when ε s = 0, the error remains below a tenth of the differences of the log values. Thus, in all cases the trends from the first 2 days continue in the next 4 weeks and the error grows minimally.

SUMMARY AND CONCLUSIONS
Modeling intracellular viral RNA dynamics within infected cells is becoming an improtant mean for considering various curative treatment options. A viral dynamic model that considers intracellular viral RNA replication, namely an age-structured PDE multiscale model, has been recently put forth to study viral hepatitis dynamics during antiviral therapy [25,41,43]. This type of model is more complicated to solve than previous ODEs models. The seminal works that introduce the age-based multiscale model predominantly use analytical approximations, with some numerical solutions that are either based on simple first-order methods or canned solvers [25,41,43]. Neither of these numerical solutions are satisfactory for realistic simulations of several days of infection and therapy in terms of accuracy, efficiency and stability.
We first showed that the long-term approximation is an underestimate of the PDE model solution, which was anticipated in Rong and Perelson [43] because some infection events are being ignored in this analytical approximation. We then observed that the governing differential equations are stiff and therefore advanced numerical methods are needed. Methods with fixed stepsize or alternatively, the use of canned routines, do not offer a comprehensive solution to the model and are limited in scope. For example, in the multiscale model, there is an integral term that needs to be computed at each time step depending on previous iterations and this is inadequately done by canned solvers. Methods with adaptive stepsize offer a considerable improvement for realistic simulations. Having previously investigated these methods, we found the Rosebrock method to be the most accurate and efficient among adaptive stepsize methods for this model. We provide a simulator based on the Rosenbrock method with a user-friendly GUI that is freely available at http:// www.cs.bgu.ac.il/~dbarash/HCVsimulator.
Future work from the numerical standpoint could potentially include a more comprehensive treatment of the associated integral. In particular the adaptive stepsize techniques rely on the evaluation of the error produced at each evaluation of the ODE equations. Therefore, the error produced by the integral itself is never taken into account. Including that term would allow to relax the restrictions on the stepsize increase and potentially decrease further, beyond what is observed in Table 5 as least number of iterations, the number of iterations that are required to achieve a sufficient accuracy.
The Rosenbrock method implementation provided here can be generalized to potentially assist in understating treatment failure due to drug resistance by the expansion of the agebased model to include viral strains [57]. It could also be used to explore age-based models that include further aspects of intracellular HCV life cycle such as translation positive-strand HCV RNA and the synthesis of positive and negative-strand HCV RNA [58]. Finally, since the current age-based HCV multiscale model is a successful milestone but still fails to predict cure in some DAA-treated patients [59][60][61], further model modification will be needed [38,39] and future developments would necessitate some updates in the numerical method. While such models become quickly overwhelming to solve analytically, an extension of the presented method is straightforward. The simulator provided here based on the Rosenbrock method can be modified accordingly and be developed further to accommodate the modelers needs. A systematic overview of the multiscale model. The multiscale model accounts for the intracellular HCV RNA (vRNA) replication, R, i.e., synthesis, degradation and assembly/ secretion with rate parameters α, μ, and ρ, respectively. Treatment (parameters in red) may block vRNA synthesis with effectiveness ε α , with an additional time-dependent drugmediated decrease e −γt and/or virion assembly/secretion with effectiveness ε s and/or enhance the degradation rate of vRNA by a factor κ. T and I represent target and infected cells, respectively, and V represents extracellular virus. Target cells are created and die with constant rates s and d (shown as D in the GUI), respectively, and can be infected by virus, V, with rate constant β. Infected cells, I, are lost with rate constant δ and virus, V, is cleared from blood with rate constant c. The log values of the short-term and long-term approximations developed in Rong et al. [41] are shown compared to the PDE solution using the Rosenbrock method. Model parameters are described in Table 4 with h = 0.001, h a = 0.01. The difference between the long-term approximation, developed in Rong et al. [41], and the PDE solution using the Rosenbrock method described herein is computed. Model parameters are as in Figure 2. The method parameters are h = 0.001, h a = 0.01.  Graphical user interface (GUI). The displayed values are the default options and can be modified. At the bottom are the options to add experimental data and parameters values from a file. Note that setting the four parameters ε α , ε s , κ and γ (red fonts) to 0 will keep the system in the pre-treatment steady state. An option to export the parameters is also available. An example of a parameters file that can be automatically uploaded. Lines starting with a # are comments (ignored). This file can also be exported from our software, allowing users to share them easily. Several simulations can be displayed simultaneously. Three treatment scenarios are plotted with similar results as previously shown in Rong et al. [41]. Difference in the log values between the long-term approximation [41] and the Rosenbrock method when choosing ε α and ε s from (0, 0.99). One can notice how even at the longer treatment duration the difference remains minimal and mostly constant.  The parameters of the model used in all simulations, taken from Rong et al. [41].   Each of the 10 parameters was decreased and increased by 10% of its original value. For each we present the ratio of the result after 2 days over the result with the original value. We omit ε α and ε s since they effectively determine the fraction of α and ρ used in the model.