Predicting COVID-19 pandemic waves including vaccination data with deep learning

Introduction During the recent COVID-19 pandemics, many models were developed to predict the number of new infections. After almost a year, models had also the challenge to include information about the waning effect of vaccines and by infection, and also how this effect start to disappear. Methods We present a deep learning-based approach to predict the number of daily COVID-19 cases in 30 countries, considering the non-pharmaceutical interventions (NPIs) applied in those countries and including vaccination data of the most used vaccines. Results We empirically validate the proposed approach for 4 months between January and April 2021, once vaccination was available and applied to the population and the COVID-19 variants were closer to the one considered for developing the vaccines. With the predictions of new cases, we can prescribe NPIs plans that present the best trade-off between the expected number of COVID-19 cases and the social and economic cost of applying such interventions. Discussion Whereas, mathematical models which include the effect of vaccines in the spread of the SARS-COV-2 pandemic are available, to the best of our knowledge we are the first to propose a data driven method based on recurrent neural networks that considers the waning effect of the immunization acquired either by vaccine administration or by recovering from the illness. This work contributes with an accurate, scalable, data-driven approach to modeling the pandemic curves of cases when vaccination data is available.


Introduction
The COVID-19 pandemic was the first pandemic for which data related to the number of infections, deaths, hospitalizations, and other relevant variables were captured and reported daily in over 100 countries in the world (1,2).Data scientists across the globe, working with mathematicians and epidemiologists, developed computational models to predict the pandemic spread using a variety of approaches, including compartmental meta-population (e.g., SIR or SEIR) (3)(4)(5)(6), statistical (7-10), agent-based (11)(12)(13)(14), and deep learning-based (15)(16)(17)(18) models.These models consider the impact of the applied non-pharmaceutical interventions (NPIs) and thus enable running simulations of what-if scenarios where different NPIs were to be applied.
The SARS-CoV-2 outbreak in Wuhan was made public on 31 December 2019.Its impact and spreading potential were early noticed (19), and the virus genome was sequenced at an early stage of the pandemic spread, showing its most remarkable features (20).The first vaccines were quickly developed due to a concerted effort by pharmaceutical companies, scientists, and governments.Clinical trials started in recorded time after the coronavirus pandemic was declared (21)(22)(23)(24).This allowed the first vaccine doses to be available at the end of 2020 and the beginning of 2021 (25)(26)(27).
Estimating the immunity provided by the different vaccines before setting up vaccination plans was critical in preventing the spread of the infection and in estimating the reduction of the breakthrough infection and other indirect effects.These estimations across different population groups led to the proposal of specific vaccination strategies (28).Another factor to consider is each vaccine's decrease in immunity over time (29,30).
Several mathematical models that leverage such information have been proposed to forecast the evolution of the pandemic under different vaccination policies worldwide, such as (15,31).Immunity can be estimated in terms of confidence intervals, but, as described later, the waning in immunity may be modeled through Weibull distributions (32).
However, we are not aware of any deep learning-based approach to predict the evolution of the COVID-19 pandemic while considering the impact of vaccination.In this study, we present a deep learning-based COVID-19 case predictor that includes vaccination data and thus extends the previous study by (17,18).
We empirically test different implementations with data from the first quarter of 2021 when vaccines started to be available.At that time, the predominant variants of SARS-CoV-2 were Alpha, Beta, and Gamma, which were closer to the variant considered to develop the vaccines than the Delta variant.
This study is organized as follows: In Section 2, we present the notation and the core computational epidemiological models used by our predictor.The data sources used for this study are described in Section 3. Section 4 presents the deep learning-based architecture that we used to implement the different models to predict the number of daily COVID-19 cases.Section 5 summarizes our results, followed by our conclusion in Section 6.

Computational epidemiological model . Notation
We will use the following terms and notation as per (17).Given an arbitrary country denoted by GEO j , we assume that its population is constant and denoted by P j .Its daily number of new COVID-19 confirmed cases on the n − th day, starting from 1st September 2020, will be denoted by X j n .In our estimations, we will consider the smoothed averaged number of cases between the days n − K + 1 and n, computed as Beyond the number of infected individuals on the n-th day at GEO j , we also consider S j n , the number of susceptible individuals who can be infected on the n-th day; V j n , the number of individuals protected by a vaccine on the n − th day; and D j n , number of retired (recovered or deceased) individuals in GEO j on the n − th day.We compute the ratio of cases between 2 consecutive days as .This last quotient captures the effects of a finite population, as it depends on the proportion of susceptible individuals.
We denote the estimations provided by our models with a • symbol, e.g., Xj n denotes the estimated number of new COVID-19 cases on the n-the day in GEO j , and Rj n the estimated scaled case ratio.Next, we present the two underlying computational epidemiological models in which our deep neural network models are based.

. Compartmental SIR model
The classic compartmental metapopulation SIR model computes the number of Susceptible (S), Infected (Z), and Recovered (D) individuals as per the following differential equations: where β is the infection rate, µ is the recovery or removal rate, and σ (D) is a function of the retired individuals.This term is not usually included in basic SIR model formulations, but as the pandemic evolved, it is necessary to include it.The infection rate β, and thus R j n , depend on the transmissibility rates of the different variants circulating in GEO j at time n and on the applied non-pharmaceutical interventions (NPIs) at GEO j.During the period under consideration, there were several variants of concern (VOC) (Alpha, Beta, and Gamma) which changed to variants being monitored (VBM) in September 2021 due to the emergence and expansion of the Delta variant since June 2021 (33).We assume that the three VBM variants behave as a single one.As explained below, the effect of β and µ will be captured jointly in R j n , thus estimating them individually is not necessary.

. Compartmental SIR model with vaccination (SVIR)
The previous SIR model can be extended to incorporate information regarding the level of vaccination in each GEO and the efficiency of the vaccines.It is given by the following equations: This model has two additional terms with respect to the previous one: α(P), which represents the daily vaccinated population, and γ (V), which is a function indicating the vaccinated population that becomes susceptible to the virus due to the waning effect of the vaccines.
From the discrete version of dZ dt , either in 2 or 6, Z j n , the number of infected individuals on the n-th day in GEO j is given as follows: where S j n−1 and Z j n−1 are the numbers of susceptible and infected individuals GEO j on the day n − 1, β is the infection rate, and µ as the recovery or removal rate, which yields the scaled case ratio, R j n as in (16, 17): Given that µ is constant in (10), the larger the infection rate β is, the larger the R j n will be.If we predict R j n , we can estimate the number of COVID-19 cases for the n-th day in GEO j as follows: It is worth mentioning that Z j n is the resulting smoothed number of infected people on GEO j over 7 days, from n − 6 up to n.Moreover, X j n−7 is the real number of infected people on the day n − 7 in GEO j .While X j n is given by the same expression both in the SIR (1) and SVIR (4) models, the estimation of the number of susceptible individuals, S j n , is different due to the vaccination.In the case of the SVIR model, the total population P j for GEO j is given by P j = S j n + V j n +Z j n +D j n , for any n ∈ N, indicating that the total population on GEO j is split on day n as the sum of the susceptible (S ), and removed individuals (D j n ), including both immunized and deceased individuals.Thus, discretizing dS dt , the number of susceptible individuals on the n-th day in GEO j , denoted as S j n , can be obtained as follows: where α(P) The impact of the loss on immunity by part of the population is complex and hard to infer, as it depends on the types of vaccines delivered in each GEO, the distribution of variants with their respective infection rates, the number of doses administered, and the number of partial and fully vaccinated individuals (34).For instance, the Alpha variant was predominant with respect to the Primal variant between January 2021 and June 2021, when the Delta variant became a variant of concern (33).In our experiments we assume that: (1) All circulating SARS-CoV-2 variants are a unique variant during the entire period of study; and (2) All vaccines impact individuals equally, independently of their age, gender, or ethnicity, given that such information is not available in the compartmental metapopulation models. .

Decay over time in the vaccine's immunity to SARS-CoV-infections
The decay of the vaccine's immunity against a SARS-CoV-2 infection may be fitted using a Weibull or a lognormal model.Both of them estimate a similar average protection, but the Weibull model provides a slightly better fit over time (32).The waning effect of the vaccine's immunity on day n is modeled by the means of a Weibull distribution of parameters k and ρ for the following eight vaccines: 1. ChAdOx1 (Oxford/Astrazeneca, OA) 2. Ad5-nCoV Convidecia (Cansino, CA) 3. mRNA-1273 (Moderna Biotech, MO) 4. BBIBP-CorV (Sinopharm, SP) 5. CoronaVac (Sinovac, SV) 6. Sputnik V/Gam-COVID-Vac (Gamaleya, GA) 7. Ad26.COV2.S (Janssen, JA) 8. BNT162b2 (Pfizer/BioNTech, PB) We denote by the complement of the Weibull distribution that models the waning effect on day n of each of the eight vaccines listed above.These models are known as accelerated failure time models and are frequently used in survival analyses.We use the same fitting parameters λ i and k i as those reported in the study mentioned in the reference (32, Table 4).As shown in the Table 1, the parameter estimates are available for individuals who are vaccinated with either a complete or incomplete dose and for actively infected individuals.
Figure 1 shows the Weibull functions that model the probability of immunity for infected and fully vaccinated individuals and for each of the eight vaccines.
In Equation (13) and in the rest of the formulas, the index i = 0 represents the already infected population; i ∈ [1,8] denotes each one of the eight vaccines, following the order in which they are listed above.We assume that: (1) protection starts on the 14th day after the last -complete or partial-dose; and (2) individuals can get reinfected after d 0 = 14 days.Given these assumptions, the number ./fpubh. .
of infected individuals who become susceptible again on GEO j and on day n is given as follows: for n ≥ d 0 + 1, where λ 0 = 87.3 and k 0 = 1.4 as per (32).
The number of vaccinated individuals that become susceptible after waning immunity is computed as follows: for n ≥ n 0 + d 0 = 363, where V i s is the number of individuals that were vaccinated on day s with vaccine i; v indicates whether individuals are partially (p) or fully vaccinated (f); and n 0 corresponds to 14 December 2020 (349th day of the year) plus d 0 days of latency until individuals may get infected again when the vaccination started worldwide.

Data sources
The number of infected and vaccinated individuals and the non-pharmaceutical interventions (NPIs) applied in each GEO of interest were retrieved from the Oxford COVID-19 Government Response Tracker (OxCGRT) (35).If a country has a negative number of cases in 1 day, we replace this number with 0. The input to the prediction model is the smoothed number of cases obtained by computing their average over 7 days.
Table 2 shows the NPIs considered in this study.They are categorical variables that indicate the level of intensity of applying each NPI: the higher the level, the more restrictive the applied measure is.Detailed information about these levels can be found in the codebook of the OxCGRT (35) and in the Supplementary material of (17).
One of these NPIs (H7) describes the population groups that are covered by vaccination with the following levels: (0) vaccines are not available; (1)(2)(3) vaccines are available to one or more of the following groups (indicating the number of them): key workers, clinically vulnerable groups, and older individuals; (4) vaccines are available for broader groups; and (5) vaccines are universally available.The complete description of each NPI can be found at the study mentioned in the reference (36).All the predictor models described in this study consider all confinement (C1 to C8) and some public health interventions (H1 to H3 and H6).The vaccination NPI (H7) may be used to incorporate vaccination into an SIR model or complement an SVIR model, as explained below.
The number of administered vaccine doses per GEO and day is obtained from the OxCGRT dataset.However, this information  See the codebook of the OxCGRT (35) for the NPI-level description associated with each categorical value.
is not provided per vaccine type.We obtained the vaccine specific details from the study mentioned in the reference (2, 37) but only for the following GEOs: Argentina, Austria, Belgium, Bulgaria, Canada, Croatia, Cyprus, Czech Republic, Denmark, Ecuador, Estonia, Finland, France, Germany, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, the Netherlands, Norway, Poland, Portugal, Slovak Republic, Slovenia, Spain, Sweden, Switzerland, and United States.In the following, we refer to these countries as GEOs, being GEO j the j-th country in this set.Once we have defined the underlying computational epidemiological models and described the data sources, the next sections present the different implementations of the deep learning-based predictor of daily COVID-19 cases and their evaluation with real data.

Predictors of COVID-cases with vaccination . Basic architecture
We base our predictor architecture in the architecture presented in the study by (17).It consists of two parallel branches of bidirectional Long Short Term Memory layers (LSTM) (38), as shown in Figure 2: one (top branch) to predict R j n , i.e., the COVID-19 infection rate in GEO j on day n (context), and the other (bottom branch) to model the effect of the applied NPIs (actions), A j n .Each LSTM provides separate predictions from the context, denoted by h and actions, denoted by h, combined using a lambda layer to yield an estimated Rj n .From Rj n , the number of daily cases is computed as per Equation (11).While we obtain a model for all the GEOs, for conducting predictions on each GEO, we use its own context and action data.The model is implemented in TensorFlow and Keras, running in a computer with an RTX 3090 GPU with 24 GB of RAM.
The architectural details of each of the branches are as follows: 1.The context branch (top) consists of a one dimensional convolutional layer with the ReLu activation function, followed by a maxpool layer with pool size equal to two, and a bidirectional LSTM followed by a dense layer.The convolutional layer has 64 filters with kernel of size 8, and the bidirectional LSTM with 32 units encodes the input sequence into states of 32 dimensions, which are then provided to the dense layer for prediction.This architecture empirically generalized well for many GEOs, achieving good performance in both short-and long-term predictions (17).The outcome of this layer is denoted by a function h in terms of the ratios of cases R j n .2. The action branch (bottom) consists of an LSTM followed by two dense layers to capture non-linearities.We use a sigmoid activation function to ensure the output is in the [0, 1] range.The outcome of this layer is denoted by a function g(A) in terms of the NPi's A j n applied in the GEO j.Moreover, we constrain g(A) to satisfy the following condition: if the difference between two sets of actions A and A ′ is greater than or equal to 1, (1 − g(A)) must be lower or equal to (1 − g(A ′ )). 3. Finally, a lambda layer combines the outcomes of the context and action branches and provides the predictions of R j n that permits estimating future cases.

. Enhanced models with vaccination
We introduce two key modifications with respect to previously described basic model.First, the rapid expansion of the Alpha/Delta/Omicron variants enables learning a context model for all GEOs simultaneously instead of clusters of countries.Second, instead of a traditional SIR model, we include vaccination information in two ways: (1) through an NPI (H7) as an action in the action branch or (2) with an SVIR model that considers the effects of vaccination.The hypothesis is that the SVIR model would yield more accurate predictions once vaccinations are widespread, as it considers the protective effect of vaccination.Nevertheless, as time goes by, the probability of reinfection increases, it is necessary to include waning immunity in the models.
We compare eight different predictors.First, we use the baseline model (Baseline 1) introduced in the study mentioned in the reference (16) and served as a baseline for the XPRIZE Pandemic Response Challenge.We also benchmark our proposed models against a second baseline model (Baseline2), the predictor presented in the study mentioned in the reference (17) but without performing any clustering of GEOs as we only consider the 30 GEOs, where vaccination data were available as opposed to 198 GEOs.With such a limited number of GEOs, a clustering process is unsuitable.In neither of these predictors, there is no reintroduction of infected individuals who have lost immunity.
In addition, we consider six predictors to test the different implementations of vaccination data.

FIGURE
Given the previous contexts R j n− and actions A j n− on GEO j up to the n − -th day, the model computes an estimated Rj n which is the infection rate at n-th day for GEO j as a result of combining both branches with a lambda function.All the predictors consider a waning immunity of infected individuals.These are the models under consideration, according to the nomenclature used in  Notably, the SIR model only allows to include vaccination by adding NPI H7.In our experiments, we compare these predictors with real data in the 30 GEOs of study.
The input to the models consists of data from previous confirmed cases and the NPIs implemented in each GEO.The NPIs are represented as a vector of categorical values that indicate the strength of each of the interventions, as previously explained.

Results
In this section, we first present the results of testing the previously described models to predict the number of COVID-19 cases globally between January and April 2021.We train the predictor with data retrieved from OxCGRT data set to predict the daily COVID-19 cases for the aforementioned list of GEOs between 1st September 2020 and 30th April 2021.All models were trained starting in 1st September 2020 until the day before the first prediction day.The models have a cumulative error since the prediction for the first day is used to make the prediction for the second one.In our experiments, we observed that for prediction periods longer than a fortnight, the error in the predictions started to increase significantly.Thus, we trained a new predictor every 15 days in the testing period and tested it to predict the number of COVID-19 cases in the next 14 days.After summarizing, to predict the number of newly infected individuals on day d 0 , the models are trained with data up to d 0 − 8.We run five simulations to predict the number of new infections for d 0 − 7 to d 0 − 1 days.We select the model with the lowest mean absolute error (MAE) and use it to predict the number of COVID-19 cases for the period d 0 to d 0 + 13.To prevent overfitting, we use a validation data set at each epoch at training and an early stopping callback such that when the validation MAE stops decreasing, the training process is also stopped.Table 3 shows the MAE and Mean Rank of all the models, including the baselines ones.Notably, the MAE is normalized by 100,000 inhabitants to enable a fair comparison across GEOs independently of the population size.To compute the Mean Rank, the models are ranked on each GEO and period, assigning 1 to the best-performing model and a 7 to the worst-performing model.The mean of all ranks on all GEOS is computed to obtain each predictor's Mean Rank.4); France during February 2021, when cases were stabilized (Figure 5); Ireland during March 2021, when cases tended to decrease or stabilize (Figure 5); and Italy during April 2021, when there were two peaks of infections (Figure 7).Let us note how the H7& VacW SVIR predictor is able to correctly capture the trends in the pandemic curves even with such diversity of situations of the pandemic.

Conclusion
In this study, we have presented a deep learning-based predictor of COVID-19 cases in 30 countries that considers both the daily Non-Pharmaceutical Interventions applied in each country and vaccination data.It is worth mentioning that despite the abundance of data, it is complex to consider information regarding age groups, doses administered of each vaccine, and the coexistence of different strains with different transmissibility rates, which were different from the primal strain used for designing the vaccines.In addition, the most efficient vaccines were the mRNA-based vaccines, which were the first ones to be designed and massively applied with this technology, and the duration of their effects on individuals from different regions is still under study, which may lead to potential biases (39).
Despite these difficulties and limitations, the proposed approach effectively considers vaccination information in a machine learning-based model that can be applied to different countries to predict the number of COVID-19 cases.Our models have shown a competitive performance over a long time period between January and April of 2021, when the vaccination campaigns started in many countries.Our study illustrates the value of having access to high-quality systematic data during a pandemic to enable evidence-driven decision-making.
All code and files used in this study are available at https:// github.com/AhmedBegggaUA/frontiers_in_public_health.

1 ,
which shows the growth/decrease in the number of cases, and the rescaled ratio by the proportion of susceptible individuals, denoted by R

j n− 1
represents the total number of vaccinated individuals on the day n − 1, γ (V) j n−1 , reflects the vaccinated individuals who have lost immunity on day n − 1, and γ (D) j n−1 corresponds to the infected individuals who have lost immunity on day n − 1.

FIGUREFrontiers
FIGUREWeibull distributions to model the decay e ect of the vaccines (OA, CA, MO, SP, SV, GA, JA, and PB) on infected and fully vaccinated individuals.

6 .
SVIR H7 & VacW: SVIR model that reintroduces infected individuals that lost immunity and considers both NPI H7 and the waning in the vaccines' immunity.

FIGUREFIGURE
FIGUREPredictions from th January to th February of the number of COVID-cases vs. the ground truth (yellow dashed line) for Europe with MAE per , inhabitants.

Figure 3
Figure 3 shows the predictions of the two best-performing predictors (H7 & VacW SVIR and w/o H7 & VacW SVIR) compared with the ground truth (yellow dashed line) and the baseline 2 model (red line), between mid-January and mid-February 2021, immediately after the vaccinations started to have an impact on the spread of COVID-19.Let us note how the inclusion of H7 improves the estimation.

FIGURE
FIGUREPredictions of the number of COVID-cases vs. the ground truth for France with MAE per , inhabitants in February .

Figures 4 -
Figures 4-7show the predictions of the two best-performing predictors on data between January and March of 2021 on several European countries with very different dynamics in the evolution of their number of COVID-19 cases: Poland during January 2021, when cases were increasing (Figure4); France during February 2021, when cases were stabilized (Figure5); Ireland during March 2021, when cases tended to decrease or stabilize (Figure5); and Italy during April 2021, when there were two peaks of infections (Figure7).Let us note how the H7& VacW SVIR predictor is able to correctly capture the trends in

FIGURE
FIGUREPredictions of the number of COVID-cases vs. the ground truth for Italy with MAE per , inhabitants in April .
TABLE NPIs considered in this study and their possible activation values.
TABLE Accuracy of the predictors expressed in terms of MAE and Mean Rank (M.Rank) from st January to th April .Bold values indicate the best results in terms of MAE and Mean Rank.