From Infection to Immunity: Understanding the Response to SARS-CoV2 Through In-Silico Modeling

Background Immune system conditions of the patient is a key factor in COVID-19 infection survival. A growing number of studies have focused on immunological determinants to develop better biomarkers for therapies. Aim Studies of the insurgence of immunity is at the core of both SARS-CoV-2 vaccine development and therapies. This paper attempts to describe the insurgence (and the span) of immunity in COVID-19 at the population level by developing an in-silico model. We simulate the immune response to SARS-CoV-2 and analyze the impact of infecting viral load, affinity to the ACE2 receptor, and age in an artificially infected population on the course of the disease. Methods We use a stochastic agent-based immune simulation platform to construct a virtual cohort of infected individuals with age-dependent varying degrees of immune competence. We use a parameter set to reproduce known inter-patient variability and general epidemiological statistics. Results By assuming the viremia at day 30 of the infection to be the proxy for lethality, we reproduce in-silico several clinical observations and identify critical factors in the statistical evolution of the infection. In particular, we evidence the importance of the humoral response over the cytotoxic response and find that the antibody titers measured after day 25 from the infection are a prognostic factor for determining the clinical outcome of the infection. Our modeling framework uses COVID-19 infection to demonstrate the actionable effectiveness of modeling the immune response at individual and population levels. The model developed can explain and interpret observed patterns of infection and makes verifiable temporal predictions. Within the limitations imposed by the simulated environment, this work proposes quantitatively that the great variability observed in the patient outcomes in real life can be the mere result of subtle variability in the infecting viral load and immune competence in the population. In this work, we exemplify how computational modeling of immune response provides an important view to discuss hypothesis and design new experiments, in particular paving the way to further investigations about the duration of vaccine-elicited immunity especially in the view of the blundering effect of immunosenescence.


INTRODUCTION
The global pandemic set up by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the early months of the year 2020 has reached considerable proportions and, to date, does not show signs of a slowdown when considered globally. In fact, as of 3:10pm CEST, 24 April 2021, there have been 145,216,414 confirmed cases of COVID-19, including 3,079,390 deaths, reported to WHO (1).
The mortality rates of the SARS-CoV-2 greatly differ across the globe, ranging from 0.8 to 9.2% (2), as a result of many factors including the ability to react to the pandemic by the various national health systems.
The COVID-19 disease has a quite variable clinical presentation: while the majority of individuals present with very mild disease, often asymptomatic, a few patients develop a life-threatening disease requiring intensive care. Recent review papers describing the characteristics of the virus SARS-CoV-2 and the disease COVID-19 can be found in (3). The strongest determinant of disease severity is age, with children presenting almost exclusively with mild disease, while individuals over 70 years of age are much more likely to develop severe COVID-19. This variation is likely due to both host and pathogen factors. Host factors may include differences in the immune response due to genetic determinants and immunological history. On the other hand, pathogen factors include transmission, entry and spread within the host, cell tropism, virus virulence, and consequent disease mechanisms.
To better understand what impact these factors may have in the differences observed in the host response to SARS-CoV-2, we set up the analysis of the dynamics generated by a computer model that considers both, the magnitude of the viral harm, and the subsequent innate and adaptive response set up to attempt achieving control of the infection. Thus, we used computer simulations to create a virtual cohort of infected individuals to study the effects on the pathogenesis of both host and pathogen factors. Note that this approach goes beyond the machine learning paradigm as the knowledge is generated through a set of equations/ algorithms confirmed by the scientific literature and by past models. The simulation allows systems-level, multi-evidence analyses to simultaneously capture the dynamics of the major immune cell populations and the many protein mediators by which cells communicate, to sort out the determinants of disease severity.

SIMULATING SARS-COV-2 COURSE OF EVENTS IN THE HOST
Up to date, there have been a large number of attempts to use mathematical and computational models to elucidate various aspects of the pandemic, from virus infection of the host to the pathogenesis of COVID-19 and the epidemiological aspects of viral spread in the population including the efficacy of containment measures (4). All kinds of modeling methodologies have been employed so far from classical differential equation models to describe the system dynamics to machine learning techniques to analyze available data (5).
The simulation model that we used in this study is a hybrid agent-based model for the simulation of the immune response to generic pathogens. It is equipped with elements of innate immunity consisting in macrophages, dendritic cells, natural killer cells, proinflammatory cytokines (e.g., IL-6, IL-12, IL-18, TNFa, IFNg), and of the adaptive immunity represented by B lymphocytes, plasma B antibody-producing lymphocytes, CD4 T helper and CD8 cytotoxic T lymphocytes. It is a polyclonal model as it embodies the primary sequences of binding sites of Bcell receptors (BCRs) and T-cell receptors (TCR), as well as the peptides and epitopes of the infectious agent (i.e., the SARS-CoV-2 in this case) (6).
The model (called C-IMMSIM) represents a portion of, i) primary lymphatic organs where lymphocytes are formed and mature (i.e., mainly the red bone marrow and the thymus gland), ii) secondary lymphoid organs (e.g., a lymph node), which filters lymph, and where naïve B and T-cells are presented to antigens, and, iii) peripheral tissue which is dependent on the pathogen considered (in this case the lung).
While primary organs are just the source of lymphocytes equipped with a randomly generated receptor (actually only its complementarity-determining region, CDR), the secondary organs and the tissue are mapped onto a three-dimensional Cartesian lattice (however it is worth to specify that, since the initial condition, i.e., the initial cell counts, is uniform on the lattice and the diffusion is isotropic, spatiality does not really play a role in the present study). The thymus is implicitly represented by the positive and negative selection of immature thymocytes (7) before they enter the lymphatic system (8), while the bone marrow generates already immunocompetent B lymphocytes.
Being a general-purpose modeling platform, C-IMMSIM lends itself to characterize the role of the immune response in different human pathologies. For instance, in simulating viral infections such as HIV we have depicted the evolutionary path of the wild type virus inside the host due to its high replication rate (26); in the case of EBV infection, we have shown that the ability of the virus to establish long term persistence is dependent on access of latently infected cells to the peripheral pool where they are not subject to immunosurveillance (27). While simulating cancer immuno-prevention we have shown, as supported by in vivo mice experiments, that the humoral response is fundamental in controlling the tumor growth (28,29). In a study of type 1 hypersensitivity we have elucidated the role of timing and dosage in the administration of anticancer drugs with respect to the risk of having an allergic reaction (30). While reproducing the pathogenesis of type 2 diabetes we have pinpointed the deleterious effects of a chronic inflammation (31). The model has also been used to describe specific aspects of the immune dynamics such as lymphocytes homing in lymph nodes (32), the gene regulation leading to cell differentiation (33), the clonal dominance in heterologous immune responses (34) and also vaccinationeliciting fish immunity (35). Finally, and relevant to the present study, the model has recently been used to test in silico the response to a multi-epitope vaccine against SARS-CoV-2 (36,37).
In C-IMMSIM each simulated time step corresponds to eight hours of real life. Cells diffuse randomly in the represented volume and interact among them. Upon specific recognition through receptor bindings, they perform actions that determine their functional behavior. These probabilistic rules define the transition of the entities from one "condition" to another. Each rule is executed only if the involved parties are in enabling states (e.g., naïve, active, resting, antigen-presenting).
Besides cell-cell interaction and cooperation, the model simulates the intra-cellular processes of antigen uptake and presentation. Endogenous antigens are fragmented and combined with MHC class I molecules for presentation on the cell surface to CTLs' receptors (this is the cytosolic pathway), whereas exogenous antigens are degraded into small pieces, which are then bound to MHC class II molecules for presentation to T helpers' receptors (this is the endocytic pathway).
The stochastic execution of the algorithms that code for the dynamical rules of the automaton, results in a sequence of cause/ effect events culminating in the production of effector immune cells and setup of immunological memory. The starting point of this series of events is the injection of an antigen which, here, consists of the virus. This may take place any time after the simulation starts (the sequence of events of the SARS-CoV-2 simulation is reported in Appendix A of the Supplementary Materials). Initially, the system is "naïve" in the sense that there are neither T and B memory cells nor plasma cells and antibodies. Moreover, the system is designed to maintain a steady-state of the global population of cells (homeostasis of the normal peripheral blood leukocyte counts), if no infection takes place.
Besides the parameters defining the characteristics of the virus related to attachment, penetration, replication, and assembly (i.e., its fitness), the SARS-CoV-2 virus in this model is defined as a set of B-cell epitopes and T-cell peptides consisting of amino acid 9-mers and defining its antigenicity. If the infection is stopped or becomes persistent or even kills the virtual patient it depends on the dose of the virus, its fitness, and the strength of the immune response aroused. All of these variables determine if, and to what degree, the success of the immune system requires the cooperation of both the cellular and the humoral branch, as shown in past simulation studies (38).
To improve the peptide-prediction performance, the most important difference to the previous version of the model (39) is that, rather than using position-specific-score-matrices (PSSM) to weight the binding contribution of the amino-acids composing the protein segments in the bounds (40,41), we resort to pre-computed ranked lists of T-cell epitopes calculated with the original neural network NetMHCpan method (42)(43)(44). This feature, which is described below, follows the choice of a specific Human Leukocyte Antigen (HLA) set as described in the next section. A diagram of the model components is shown in Figure 1.

Selecting the HLA Haplotype
The C-IMMSIM model accounts for differences in the HLA haplotype when determining which peptides are presented by antigen-presenting cells. To this end, for each HLA molecule, it takes in input a list of such peptides together with a propensity of each of them to bind to the HLA. This list is computed by using third-party immunoinformatics tools as described in the next section (Computing the peptide immunogenicity).
The "HLA haplotype freq search" in the "Allele Frequency Net Database" (45) was used to select two HLA-A, two HLA-B and two DRB alleles which are most prevalent in the US population (46). The result pointed to the following alleles:

Computing the Peptide Immunogenicity
The strain of SARS-CoV-2 used in this study corresponds to the reference sequence NCBI Reference Sequence: NC_045512.2. The primary structure of these proteins has been used to identify cytotoxic T peptides (CTL peptides) and helper T peptides (HTL peptides). To this aim, we have employed two immunoinformatics tools. In particular, for the definition of CTL epitopes, the "ANN 4.0 prediction method" in the online tool MHC-I binding prediction of the IEDB Analysis Resource (47) was used for the prediction of 9-mer long CTL peptides which had an affinity for the chosen set of HLA class I alleles (i.e., HLA-A*02:01, HLA-A*24:02, HLA-B*35:01 and HLA-B*40:02) (48)(49)(50). The peptides were classified as strong, moderate, and weak binders based on the peptide percentile rank and IC50 value. Peptides with IC50 values <50 nM were considered to have high affinity, <500 nM intermediate affinity, and <5000 nM low affinity towards a particular HLA allele. Also, the lower the percentile rank, the greater is meant the affinity (48)(49)(50). The list of peptides is reported in Appendix B of the Supplementary Materials.
For what concerns the HTL peptides, the NetMHCIIpan 3.2 server (51) was used for the prediction of 9-mer long HTL peptides which had an affinity for the HLA class II alleles (i.e., DRB1*07:01 and DRB1*15:01) used in this study (52). The predicted peptides were classified as strong, intermediate, and non-binders based on the concept of percentile rank as given by NetMHCII pan 3.2 server with a threshold value set at 2, 10, and >10%, respectively. In other words, peptides with percentile rank ≤2 were considered as strong binders whereas a percentile rank between 2 and 10% designate moderate binders; peptides with percentile score >10 are considered to be non-binders (52). The list of CTL and HTL peptides and the relative affinity score is reported in Appendix C of the Supplementary Materials.

Quantifying the Immunological Competence
It is widely accepted that aging is accompanied by remodeling of the immune system. With time, there is a decline in overall immune efficacy, which manifests itself as an increased vulnerability to infectious diseases, a diminished response to antigens (including vaccines), and a susceptibility to inflammatory diseases. The most important age-associated immune alteration is the reduction in the number of peripheral blood naïve cells, accompanied by a relative increase in the frequency of antigen-specific memory cells. These two alterations are extensively reported in the literature and account for the immune repertoire reduction (53,54). Along with the process called "inflammaging", the reduction of immune repertoire is considered the hallmark of immunosenescence (55).
To model the reduction of immune efficacy we first defined the parameter "immunological competence" IC∈(0,1] and assumed it in a simple linear relationship with age. Specifically, we set IC ≡ IC(age) = -a age + 1 with the value of the parameter a = 45 · 10 -4 determined using epidemiological data as described below. Given the age, the parameter IC is then used to modulate both innate and adaptive immunity as follows: i) the phagocytic activity of macrophages and dendritic cells, represented by a probability to capture a viral particle, is rescaled respectively as p M = IC · u and p DC = IC · v, where u~U [a,b] and v~U [5×a,5×b] are two random variables uniformly distributed in the ranges [a, b] and [5 · a, 5 · b] with a = 25 × 10 -4 and b = 10 -2 ; ii) as for the adaptive immunity it is adjusted according to the immunological competence parameter IC by decreasing the lymphocyte counts (hence B, Th, and Tc) to reflect a reduction in the repertoire of "naïve" cells with immunological history due to accumulation of memory cells filling the immunological compartment. In particular, the number of white blood cells N is computed as N ∼ IC · N (m, s 2 ) = N (IC · m, IC 2 · s 2 ) where N (m, s 2 ) is a normal distribution with average m and standard deviation s (for each lymphocyte type B, Th and Tc) chosen to reflect the reference leukocyte formula for an average healthy human adult (see Figure 2) (56).

Adapting the Model to SARS-CoV-2 Characteristics
The infection and the dynamical features of the SARS-CoV-2 viral strain have been characterized by two parameters: i) V 0 corresponding to the infectious viral load at time zero, and, ii) the affinity of the virus spike protein to the ACE2 receptor on target h e immunocompetence value IC(age) hence p A and p DC as well as the lymphocyte counts, the simulations depict the immune-virus competition eventually culminating in a successful, or not, virusclearing response controlling its growth. Sometimes this control is not perfectly efficient. In those cases, the result is a longer viral persistence possibly going much beyond the length of the observation period of 30 days (cf. Figure 4).
The sequence of events from viral infection leading to a full-fledged immune response is detailed in Appendix A of the Supplementary Materials. At each time step of the simulation, C-IMMSIM dumps all variables allowing for a detailed analysis of the dynamics. A full output example of a simulation is reported and described in Appendix D of the Supplementary Materials.

MODELING A REPRESENTATIVE COHORT OF INFECTED INDIVIDUALS
We have simulated a large number of infections (1500 for each age-class for a total of 10500 independent simulations) by varying the parameters identifying both the viral characteristics and the individual immunological competencies. The seven age classes considered were 0-9, 10-39, 40-49, 50-59, 60-69, 70-79 and 80+. As specified above, the age-class determines the immunocompetence parameter IC which, in turn, sets p M and p DC as well as the lymphocyte counts in the in-silico individual, we can characterize each simulation by the set of parameters (IC (age), V 0 , p A ). Moreover, due to the stochasticity of the model depending on the random number realizations, each simulation corresponds to a different trajectory in the space of the variables. It follows that each simulation coincides with an in-silico patient with variable immunological characteristics (IC) and, at the same time, infected by a slightly different viral burden (V 0 and p A ).
The intervals within which these parameters vary have been chosen to reproduce the age-class incidence of disease severity of infected individuals. The age-class incidence varies wildly among regions mostly due to a different definition of COVID-19 related deaths. Moreover, due to the lack of confirmation of the causes of death in many cases in periods of high emergency as that of March/April 2020 in Italy, these rates should not be considered strictly but rather indicative of the negative exponential-like relationship of the death incidence with age. Reference values we used were the fatality rates from the Chinese Center for Disease Control and Prevention (CDC) as of 17th February, the Spanish Ministry of Health as of 24th March, the Korea Centers for Disease Control and Prevention (KCDC) as of 24th March, and the Italian National Institute of Health, as presented in the paper by Onder et al. (57) as of 17th March (57,58).
To reproduce this age-related incidence of in-silico cases we linked the simulated viral load at a certain time to the clinical status (clinical endpoint). This has been done according to the rationale that a patient whose viral load is still quite high after thirty days from infection can be considered at a very high risk of death. In fact, in most mild cases, the clinical signs and symptoms (mostly fever and cough) have been reported to resolve within 3 weeks from the diagnosis (which translates into approximately 30 days from infection). Instead after 3 weeks, several authors have described severe cases with progressive multi-organ dysfunction with severe acute respiratory dyspnea syndrome, refractory shock, anuric acute kidney injury, coagulopathy, thrombocytopenia, and death (59).

Stratifying the In-Silico Cohort of Patients
The analogy of some simulation variable to a realistic clinical endpoint allows stratifying the in-silico patients for a more concrete interpretation of the results. Based on the viral load observed at day 30 (indicated by V 30 ) and a threshold q we assume to identify the virtual patients in one of the three following classes: • Critical: if V 30 > q, namely, the viral load at day 30 is still high; this class includes weak and late responders; • Partially recovered: those who are still positive but have a low viral load, meaning that the immune response is controlling the viral replication (i.e., 0 < V 30 ≤ q); note that this class includes the asymptomatics; • Fully recovered (or just Recovered): those who have cleared the virus (i.e., V 30 = 0).
According to this assumption/definition, by choosing the cutoff q =120 viral particles per micro-liter of simulated volume, we obtain the stratification of the virtual individuals shown in Figure 3A. Of all in-silico individuals, we get 4.3% of critical cases (broken down in age classes in Figure 3A), 46.8% of partially recovered ( Figure 3B), and 48.8% recovered cases ( Figure 3C). These figures sound very much in line with current epidemiological statistics when considering that the recovered cases here simulated include asymptomatics (57). This result shows an interesting and surprisingly high fraction of in-silico patients in the partially recovered class. This class includes patients that, at the end of the simulated period of thirty days, are still positive albeit manifesting an active immune response, regardless of being asymptomatic or not. This question is discussed below.
These special cases can be better examined in Figure 4, which shows four distinct exemplifying runs with different outcomes. In Figure 4A the viremia is shown as a function of time. Red lines correspond to individuals who reach the critical condition V 30 > q thus falling in the class "critical". The green line corresponds to a viral clearance corresponding to a fully recovered case, and the blue line shows a situation in which the virus is not completely cleared but stays below the threshold value q. This case corresponds to one of what we call partially recovered as it represents virtual individuals that produce an immune response (cf. same figure, Figure 4A showing the corresponding antibody titers) which turns out to be insufficient to clear the virus. These "unresolved infections" include asymptomatic cases and are worth the further analysis described below.
To note that the two examples of critical outcomes (red curves) originate from a quite different initial viral load. Also, to note that the fully recovered (green) case starts with a viral load that is higher than one critical case, still the immune response manages to control the infection. The blue curve shows a partially recovered case which greatly decreases the viral load (inset plot of Figure 4A) but does not clear it completely.

How the Model Explains Symptoms
It is worth clarifying that the term "symptom" has no meaning in the in-silico framework unless a link between model variables and possible clinical endpoints is drawn. Also, we should note that we have no concept of comorbidity here that would help in defining the "status" of the virtual patient. To overcome this limitation, besides the viral load at day 30, we think up the following quantities (or variables) as clinical endpoints: (a) the damage in the epithelial compartment, namely, percent of virustarget cells that are dead at the time of observation as a surrogate marker of vascular permeability; (b) the concentration of pyrogenic cytokines as a surrogate marker of fever, i.e., Prostaglandins TNFa, IL-1, and IL-6 causes fever people get varying degree of severity with COVID-19.
Of these two potential surrogate markers of criticality, the first appears more appropriate. While the amount of pyrogenic cytokines (surrogate clinical endpoint b.) correlates with the severity of the disease, the most striking difference in the critical cases versus non-critical (i.e., recovered plus partially recovered) is seen when comparing the accumulated damage in the where E(t) is the epithelial count per microliter of simulated volume and T is the time horizon of 30 days (note that j ∈ [0,1]). Indeed, when we plot the distribution of j for the critical and non-critical cases (i.e., partially recovered plus fully recovered) we obtain what is shown in Figure 5. The plot clearly shows that for critical in-silico patients the damage is much more pronounced than for non-critical ones. This prompt us to use the threshold j c = 0.63 to set apart patients who have mild infections [about 80% as in reality (60-62)] to those having a severe disease [15% with dyspnea, hypoxia, lung changes on images (63)] or critical illness [5%, respiratory failure, shock, multi organic dysfunction, cytokine storm syndrome (64)]. Therefore, we label patients with j ≥ j c as symptomatic while those with j < j c asymptomatic. According to this further stratification patients who are still positive (i.e., 0 ≤ V 30 < q) and have no symptoms (i.e., j < j c ) account for about 44% of the simulations which is in line with current estimates of asymptomatic incidence (65).

The Model Indicates That a High Infective Viral Load Carries a Serious Risk
We tested the correlation between the antigen abundance (or infective viral load V 0 ) and the severity of the infection. The Mann-Whitney-Wilcoxon (MWW) test shows a significant (i.e., p-value< 10 -3 ) difference between infecting viral load V 0 in the three classes critical, partially recovered, recovered. In particular, we find that a higher V 0 is a strong correlate of disease severity (66). Of interest is the fact that there is no significant difference among age groups, that is, V 0 is not predictive of the disease severity for the age (67) (MWW test p-value>0.05).   Significantly, in most critically ill patients, SARS-CoV-2 infection is associated with a severe clinical inflammatory picture based on a severe cytokine storm that is mainly characterized by elevated plasma concentrations of interleukin 6 (68). In this scenario, it seems that IL-6 owns an important driving role in the cytokine storm, leading to lung damage and reduced survival (69). The simulation agrees with this finding as reported in Figure 6. Plotting the peak value of the viral load (i.e., the maximum value attained in the observed period) versus the logarithm of the integral of IL-6 over the whole period (cf. eq(2) in next section 3.5), we see a positive correlation regardless of the outcome.
For all age classes, as shown in Figure 7A, a critical clinical course is associated with a significantly higher concentration of proinflammatory cytokine IL-6. Figure 7B shows the same information for all age classes lumped together. The cytokine concentration on the y-axis is calculated as the integral over the whole simulated period (definition in eq(2) of section 3.5). The positive correlation between inflammation and severity of the clinical course is a consequence of the struggle of the immune system to cope with the infection. However what Figure 6 reports is a generic higher production of IL-6 in younger individuals compared to elders. The explanation of this outcome becomes visible following the line of consequences starting from a stronger cytotoxic activity (see Figure 12B below) that killing infected cells cause a stronger release of danger signal to which macrophages respond by secreting IL-6. Since younger have a higher immunological competence (IC), they respond with both stronger cytotoxic response and better innate (i.e., macrophage) activity. The result is the somehow counterintuitive observation that while younger individuals have a higher degree of inflammation, they report a smaller propensity to experience severe outcomes.

Younger In Silico Individuals Deal With the Virus by Producing More Cytokines
What is observed in the production of IL-6 in younger individuals extends to all cytokines produced during the response to the inflammation. We find that, in general, cytokines' cumulative production during the whole simulated period correlates inversely with the age. Calling c x the concentration of cytokines x in the simulated volume, where x is one of IL-6, D, IFNg, and IL-12, we define the cumulative value of cytokines in the whole observation  period. Figure 8 shows s IL6 , s D , s IFNg , and s IL12 against age. What Figure 8 shows is that there is a clear reduction of cytokines' production with age. This, similarly to what was discussed in the previous section, is due to the reduced immune activity which indeed is determined by a reduced immunological competence with the age (71). To justify the apparent contrast with the fact that elder acute infected individuals are more prone to experience a cytokine storm, we should openly regard one of the limitations of the model, namely, the lack of further cytokine feedbacks that are activated during extended pneumonia.

Model Predicted IFNg Concentration Is Higher in Milder Courses of the Infection
The expression of IFNg by CD4 tends to be lower in severe cases than in moderate cases as shown in Figure 9A. This is in agreement with (72). The inverse correlation of interferon-gamma (IFNg) with disease severity is observed in all age groups ( Figure 9A and also in Figure 9C when summing all age classes). Interestingly, recovered and partially recovered do not show a meaningful difference when compared to the critical cases ( Figure 9C).
IFNg is released by natural killer (NK) cells upon bystander stimulation by danger signals (Rule n.5 in Appendix A of Supplementary Materials) which, in turn, is released by infected/injured epithelial cells upon viral infection (Rule n.3) and when killed by cytotoxic cells (Rule n.18). This means that in young individuals a prompt activation of NK cells due to higher immunological competence, and a stronger cytotoxic response killing infected cells, are sufficient to control the "acute" production of danger signal impacting on the production of IFNg.

In Silico Cytokine Storm Goes With Symptoms
If we use the cumulative value of inflammatory cytokines as a variable, namely, s IFNg + s IL6 + D + s TNFa (i.e., the sum of the integrals) and we use the accumulated damage in the epithelial compartments, that is, the fraction of depleted epithelial cells due to the immune cytotoxicity j defined in eq(1) (cf. section 3.2) as the discriminating criteria between symptomatic and asymptomatic, we observe what is shown in Figure 10.
Compared with asymptomatic cases, the symptomatic ones more frequently have markedly higher levels of inflammatory cytokines. The difference of the virtual patients in the two classes is statistically significant (MWW test, p-value< 10 -3 ), which is in line with the clinical finding of higher inflammatory levels in severe disease progressions (72). This result seems to contrast what is stated in section 3.5, namely that younger individuals produce more cytokines but have a less severe course of the disease. However, the explanation provided by the simulation is that those who deal with the infection more rapidly (including asymptomatic) produce, overall, a smaller amount of cytokines and therefore have a lower risk of "cytokine storm". On the other hand, an inconclusive immune response leads to chronic inflammation, hence pronounced symptoms (e.g., extended epithelial damage) and ultimately in a cytokine storm.

Why Is the Predicted Immune Response Quicker in Younger Individuals?
It has been suggested that in younger individuals several factors contribute to the lower numbers of patients observed with severe disease, namely: lower number of ACE receptors, overlapping immunity against coronaviruses, and a more efficient intact immune system. Figure 11 shows that the immune response is quicker in younger virtual individuals compared to elder ones. The speed of the immune response is calculated in terms of the time [in days) the viral load V(t) reaches its maximum [indicated t V max where t:V(t) = V max and V max = max t V(t)] and starts to decline due to the immune response. Figure 11A shows the distribution of t V max for each age class. Younger individuals develop a faster response and consequently, the virus is cleared earlier. This is shown in Figure 11B which plots the distribution of the time (in days) it takes the immune response to decrease the viral load below the threshold q whenever this happens (the cases for which V(t) > q,∀t, are not counted in this statistics). Figure 11B is in line with the fact that younger individuals mount a quicker immune response that is generally more efficient than those in elder people thus eradicating the virus in a shorter time.
The Key Role of the Humoral Response Figure 12 shows that younger individuals have a higher production of antibodies when compared to elder individuals. This is evident for both critical and recovered (partial or fully recovered) individuals. . Panel (A) shows that the concentration of IFNg is lower in severe cases than in moderate cases. This is in agreement with (72). When summing all age classes (C) IFNg shows an inverse correlation with disease severity. The same is observed when looking at each age group separately (B). Interestingly, recovered and partially recovered do not show a meaningful difference when compared to the critical cases (C).
However, the most striking observation when considering the difference between recovered and critical cases is the gap in antibody titers present in virtually all age classes ( Figure 12A). This indicates a strong protective role of the humoral response making a split between recovered and critical patients. Figure 12B shows the corresponding statistics for the cytotoxic T cell (peak value) count per age-class and critical status. This plot consistently evidences that youngers have a stronger response than elders. Interestingly, in contrast to the humoral response, in all age classes, the cytotoxic response in critical individuals is higher than in recovered ones revealing the attempt of the immune system to counterbalance the inefficacy of the humoral response.
Moreover, a further view at the antibody titers reveals that its peak value [i.e., V max = max t V(t)] correlates inversely with the clearance time (t q ), that is, faster responses are obtained with lower production of antibodies (cf. Figure 13). This is in line with the hypothesis that asymptomatic individuals develop a rapid but mild response which clears the infection (73).
It should be noted, however, that there are still substantial uncertainties on the data available due to the variable diagnostic accuracy of different serological tests for COVID-19, therefore more well-designed large clinical studies are warranted to address this matter (74,75). Interestingly, the same cannot be said when considering peak values of Tc counts, that is, the cytotoxic response does not correlate, either positively or negatively, with time-to-clearance (not shown).

Antibody Titers Have a Prognostic Value After Day 25
We have used a logistic regression classification to see if by measuring the antibody titers and the CTL counts at day t < 30  (1) i , x (2) i )is the feature vector consisting of the normalized cytotoxic T-cell lymphocytes count, x (1) i = Tc(t)and the antibody titers at day t, x (2) i = Ab(t) : While y i = 0 if the corresponding run has V 30 ≤ q and y i = 1 if the corresponding runs has V 30 > q. Figure 14A shows the features x i corresponding to the recovered cases (i.e., y i = 0) represented as yellow circles and the critical cases (i.e., y i = 1) corresponding to black daggers. This panel shows the best separation curve found after training a logistic regression model on the training sample x i = [Tc (25), Ab (25)], namely, Tc count and antibody titers measured at day 25 from infection. Figure 14A shows the data set after the classification in recovered and critical and the separation curve. In the figure, the data set corresponds to the observation at day 30 while the analysis has been conducted at different time points. Figure 14B shows how the classification accuracy increases over time. In this panel, we plot the Sørensen-Dice coefficient [most known as the F1 score (76)] which increases when the assessment is made by using features (i.e., Tc and Ab measurements) later as the infection and corresponding immune response develops. Interestingly, before day 10 after the infection it is not possible to find a meaningful classification criterion that predicts the outcome, while the Sørensen-Dice coefficient increases to a high value already at day 25 indicating that 25 days after infection, the level of immune activation represented by the antibody titers and cytotoxic counts is predictive of the clinical outcome.

DISCUSSION AND CONCLUSIONS
The immunological correlates of COVID-19 are far from being elucidated in clinical studies. Simulation studies can help to disentangle the importance of factors such as a reduced ability to mount an efficient (i.e., not off-target) immune response due to age or the infective viral load determining the initial viral burden.
We have set up a computational model that simulates the infection with a varying dosage of the virus and with a slightly different affinity to the ACE2 receptor of target cells, in individuals with different immunological competence.
The results of a large number of simulations that we call virtual or in-silico cohort demonstrate that the great variability observed in the real pandemic can be the mere result of such diversity in both viral and human characteristics.
The computational model used can explain several clinical observations of SARS-CoV-2 infection. In particular, it evidences the importance of the humoral response in discriminating efficiently from poor immune responses which fail to completely clear the infection and, in some cases, bring the viral load down below a threshold value while showing no markers of symptoms.  The model has been tuned for parameters able to reproduce the relationship of age with the disease severity (cf. Figure 3). Starting from that, any other observation revealed an emergent property of such a complex simulation environment. In particular, we observe the correlations among infective viral load V 0 and severity, among immunological (in)competence (thus age) and severity, among the overall cytokine levels and symptoms (i.e., a virtual cytokine storm), and, finally, the key role of the humoral response in clearing the infection yet sustained by the cytotoxic activity (cf. Figure 12).
Indeed recent clinical data suggest that several hyperinflammation markers can serve as accurate and reliable tools to identify mild or severe cases of COVID-19 infection (77).
Within cytokines, interleukin-6 (IL-6) is widely accepted to play a pivotal role, therefore it has been considered a possible therapeutic target. Indeed, the IL-6-receptor antagonist tocilizumab, has been used to treat patients with severe COVID-19 symptoms and pneumonia. However, clinical trials exploring tocilizumab's therapeutic effects in patients with lifethreatening SARS-CoV-2 infection have yielded very conflicting results (78). It should be noted that IL-6 production in vivo and real patients is usually accompanied by a relevant array of other cytokines and chemokines, and because of this, the therapeutic activity of tocilizumab might have been significantly hampered in some patients. There are also other potential confounding factors, for example, types of patients recruited in different trials and timing of treatment (early vs late).
It should be pointed out that in the context of this rapidly evolving scenario, defining best clinical practices is very challenging since results from ongoing trials are updated globally on a daily basis. Also in a global pandemic overall population outcomes may be measurable only during a long follow-up.
Our simulation study identifies day 25 after infection (which we can roughly associate to about day 15 th -18 th after the appearance of the symptoms) as the time for predictive measurement of the antibody response to assess the risk of developing a severe form of the diseases. Before that time, our data suggest that the prediction is not statistically meaningful. Manifestly, the choice of antibody titers as marker to test for predictivity is due to the large availability of testing facilities and the short time and low cost for obtaining such readouts in clinics.
The model is restricted in many aspects. It simplifies reality and works with a limited number of mechanisms and a reduced diversity. For instance, it does not include type I IFN which has been recently evidenced as potentially important in the resolution of the infection (76,(79)(80)(81). However, type I IFN dynamics during SARS-CoV-2 infection are not yet fully clarified and future studies will explore whether IFN production is reduced at the beginning of infection or whether it is delayed or exhausted after an initial normal response (82)(83)(84). Moreover, our model does not reproduce diverse organs and tissues and therefore we cannot observe site-specific pathological problems, including the spatial extension of pneumonia. Nevertheless, the analysis conducted in the present work accounts for such limitations and the results obtained can be reasonably considered independent of such restrictions.
Of course, there are some cases reported in the literature in which the course of infection has been extremely long, especially in severely immunocompromised patients. However, the very complex and lengthy dynamics taking place in those cases are beyond the scope of this study, which instead represents at large the majority of observed cases.
Finally, we should consider that the clinical ground of observation inevitably starts much later than in our model, as people ask for medical attention only after developing symptoms or after knowing of accidentally having been in contact with patients/carriers. Therefore, the window of observation we consider in this paper is recapitulating more precisely the infection dynamics of the early days.
Potential study directions should cover the duration of immunity, either natural or induced by the vaccine, including ways to indirectly verify it, the impact of immunosenescence on the elicited immunity, or the combined effect of monoclonal antibody and vaccines in potential future therapies. Despite the extraordinary complexity of the immune system dynamics, the progress of simulation platforms suggests that a more intense interaction between clinicians and researchers in the computational model could bring these models to the desired quality for deployment in the medical field.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

AUTHOR CONTRIBUTIONS
FC and AL conceived the work. DD provided data. PL and AS provided useful insights and comments. All authors contributed to the writing of the manuscript. All authors contributed to the article and approved the submitted version.