# Recurrent Neural Network and Reinforcement Learning Model for COVID-19 Prediction

^{1}Department of Computer Applications, Hindusthan College of Engineering and Technology, Coimbatore, India

^{2}Dubai Men's College, Higher Colleges of Technology, Dubai, United Arab Emirates

^{3}Department of Information and Communication Engineering, Yeung University, Gyeongsan, South Korea

^{4}Future Technology Research Center, College of Future, National Yunlin University of Science and Technology, Douliu, Taiwan

^{5}Faculty of Civil Engineering, Technische Universität Dresden, Dresden, Germany

^{6}John von Neumann Faculty of Informatics, Obuda University, Budapest, Hungary

^{7}School of Creative and Cultural Business, Robert Gordon University, Aberdeen, United Kingdom

Detection and prediction of the novel Coronavirus present new challenges for the medical research community due to its wide spread across the globe. Methods driven by Artificial Intelligence can help predict specific parameters, hazards, and outcomes of such a pandemic. Recently, deep learning-based approaches have opened new opportunities to address various difficulties in prediction. In this work, two learning approaches, namely deep learning and reinforcement learning, were developed to forecast COVID-19. This article constructs a model using Recurrent Neural Networks (RNN), particularly the Modified Long Short-Term Memory (MLSTM) model, to forecast the count of newly affected individuals, deaths, and recoveries in the following few days. This study also proposes deep reinforcement learning to optimize COVID-19's predictive outcome based on symptoms. Real-world data was utilized to analyze the performance of the suggested system. The findings show that the proposed approach yields promising predictions for the current COVID-19 pandemic and outperformed the Long Short-Term Memory (LSTM) model and the machine learning model Logistic Regression (LR) in terms of error rate.

## Introduction

With the spread of the novel Coronavirus (COVID-19), which was first discovered in Wuhan city in China in 2019, societies worldwide continue to face very distressing times. On March 11, 2020, the World Health Organization (WHO) declared COVID-19 a pandemic, with more than 118,000 cases in over 110 countries. The epidemic has quickly spread through many countries, including Italy, Spain, France, the United States, and India, wreaking havoc on healthcare systems (1). Modeling and predicting the extent of confirmed and recovered COVID-19 cases accurately is critical for understanding the disease and helping decision-makers slow down or arrest its progression. Since COVID-19 has become a global pandemic, there is a need for real-time analysis of epidemiological data to provide the population with a sound course of action to combat the infection. Since the emergence of the novel COVID-19, the world has been restlessly battling it (2). As of August 27, 2020, there were 24,631,906 confirmed cases worldwide, of which 17,089,939 recovered, and 841,310 ended in death (3). Table 1 shows the topmost countries affected. COVID-19 belongs to the same family of coronaviruses as SARS-CoV and MERS-CoV.

Table 2 shows the comparison, where the symptoms initially appear as those of a common cold and then progress to those of respiratory diseases, causing breathing problems, tiredness, fever, and dry cough. Once a large-scale outbreak of a contagious disease occurs and a significant public health emergency ensues, researchers use outbreak models to evaluate and forecast the disease's development pattern and to determine direct prevention and containment measures based on the results of the analysis.

The most frequently used conventional epidemic models are the susceptible-infected-recovered (SIR) and susceptible-exposed-infected-recovered (SEIR) models (4), where “S,” “E,” “I,” and “R” denote the number of susceptible persons, the number of individuals in the incubation phase, the number of infectious persons, and the number of recovered individuals, respectively. These models have been used to forecast multiple diseases, such as Ebola and SARS, due to their robust predictive abilities. With the emergence and spread of COVID-19, the significant research challenge is capturing the growth patterns of the spread of this disease, which has been studied in several scientific fields throughout the globe. Accordingly, different approaches (5) to modeling, estimating, and forecasting have been implemented to understand and control this pandemic. Traditional disease models measure the rate of infection based on the dynamic variation in the number of infections and then determine the disease's spread and evolution pattern. Yet, those approaches assume that all individuals hold an equal chance of being infected, and hence, their predictive results can only suggest general patterns and are limited.
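To make the compartment idea concrete, the following is a minimal sketch of an SIR simulation using forward Euler integration. The function name `simulate_sir`, the transmission rate `beta`, the recovery rate `gamma`, and all numeric values are illustrative assumptions and are not taken from the cited studies.

```python
def simulate_sir(s0, i0, r0, beta, gamma, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    beta  : assumed transmission rate per day (hypothetical value below)
    gamma : assumed recovery rate per day (hypothetical value below)
    Returns a list of (S, I, R) tuples, one per day, including day 0.
    """
    n = s0 + i0 + r0                      # total population stays constant
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = int(round(1 / dt))
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt   # S -> I flow
            new_recoveries = gamma * i * dt          # I -> R flow
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical population of 10,000 with 10 initial infections.
history = simulate_sir(s0=9990, i0=10, r0=0, beta=0.3, gamma=0.1, days=60)
```

Because every individual in a compartment is treated identically, such a model yields only population-level curves, which is the limitation noted above.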

Artificial Intelligence (AI) has lately been applied to advance biomedical research and to numerous tasks such as image identification, object categorization, and image segmentation, often through deep learning approaches (6). For example, individuals affected with COVID-19 may develop pneumonia since the infection reaches the lungs. Many deep learning studies identify the condition using X-ray images of the chest (7). Three different deep learning models (8) have been employed in the past to classify X-ray images of pneumonia: the fine-tuned model, the non-fine-tuned model, and the model trained from scratch. However, most prediction models use X-ray and CT images (9) with deep learning methods, which require more time to extract the features and train the model.

Classical differential-equation and population prediction models have limitations in time-series prediction and exhibit significant estimation errors. Statistical methods, for instance, Auto-Regressive Integrated Moving Average (ARIMA), Moving Average (MA), and Auto-Regressive (AR) methods, rely primarily on underlying assumptions and still have difficulties in predicting real-time transmission rates. A wide variety of statistical and computational models (10) were developed for modeling COVID-19's rampant transmission dynamics. However, in many situations, these approaches do not fit the available data well, and the accuracy of the forecast is usually low. Therefore, this work investigates the modified LSTM approach to forecasting the likely COVID-19 cases and deaths. It also describes deep reinforcement learning for optimizing the prediction results based on symptoms. Experiments using real data and various metrics reveal the improved performance of the work. The specific contributions of this paper include:

• Deep and reinforcement learning to predict COVID-19,

• LSTM model modified with new activation function for efficient prediction,

• Deep reinforcement learning applied to optimize results based on COVID-19 symptoms

The remainder of the paper is organized as follows. Section Related Work reviews the related work. Preliminary information regarding the approaches used and the problem statement is given in section Methodological Preliminaries, while section Optimized Prediction of COVID-19 describes the proposed method. The experimental details, evaluation criteria, and performance comparison are given in section Results and Discussions, along with an analysis of the results obtained. Lastly, section Conclusion gives some concluding observations.

## Related Work

### COVID-19 Prediction and Forecasting

Prediction techniques regularly used to tackle forecasting problems include Machine Learning (ML) models (4), which can be employed to estimate the number of potential COVID-19 infected patients. Rustam et al. (4) used four standard models, namely Linear Regression, Least Absolute Shrinkage and Selection Operator (LASSO), Support Vector Machine, and Exponential Smoothing, to forecast threatening COVID-19 factors. Petropoulos and Makridakis (11) presented a statistical procedure to forecast the continuation of COVID-19. The work presents a timeline of a live forecasting exercise with potentially significant implications for planning and decision-making and offers realistic forecasts for confirmed COVID-19 cases. Malavika et al. (12) adopted a logistic growth curve model for short-term prediction of COVID-19, and SIR models were employed to identify the maximum number of active cases and the peak time. In addition, the Time Interrupted Regression model was used to estimate the influence of lockdown and other interventions.

Pal et al. (13) combined medical data with trend and local weather data to forecast each country's level of risk. Specifically, a shallow LSTM neural network is employed to overcome the difficulties of limited datasets, and a country's risk level (high, medium, and recovery) is categorized using fuzzy rules. Hu et al. (14) used a Coronavirus-specific dataset to fine-tune a pre-trained multi-task deep model. The re-trained model was then used to identify possible commercial drugs against targeted proteins of SARS-CoV-2. Finally, Salgotra et al. (9) developed Genetic Programming (GP) prediction models for confirmed cases and death cases across the three most affected states, namely Maharashtra, Gujarat, and Delhi, as well as India as a whole. The predictive models are expressed as explicit formulas, and the importance of the prediction variables was studied.

Velásquez and Lara (1) analyzed historical and expected COVID-19 infections and deaths using Reduced-Space Gaussian Process Regression combined with chaotic Dynamical Systems. COVID-19 forecasts obtained with Gaussian process models and mean-field models can provide a quantitative summary of virus spread in terms of infection, death, and recovery rates. Jia et al. (10) adopted Logistic, Bertalanffy, and Gompertz models and tested the validity of these statistical models by fitting and analyzing the SARS epidemic patterns. The findings were then used to fit and evaluate the COVID-19 scenarios. The forecasted outcomes of the three mathematical models varied for different parameters and in different regions. Kavadi et al. (15) proposed partial derivative regression and a non-linear machine learning system for predicting the global COVID-19 pandemic. Dehesh et al. (16) identified the best predictive models for daily reported cases in countries with a large number of confirmed cases and then made predictions based on those models to better prepare healthcare systems. For predicting the pattern of reported cases, the Auto-Regressive Integrated Moving Average model was used. Ngabo et al. (17) proposed an artificial intelligence (AI) algorithm that predicts the survival rate of COVID-19 patients based on their immune system, exercise rate, and age quantiles.

### Deep Learning for COVID-19

Arora et al. (18) used deep learning-based models to forecast the number of recorded positive cases of novel Coronavirus (COVID-19) for 32 Indian states and union territories. Recurrent Neural Network (RNN)-based LSTM variants, such as Deep LSTM, Convolutional LSTM, and Bi-directional LSTM, were applied to the Indian dataset to forecast the number of positive cases. Huang et al. (19) suggest that a Convolutional Neural Network (CNN) can accurately estimate the number of confirmed cases. The emphasis was on the towns with the most reported cases in China, and a COVID-19 forecasting model was proposed based on the CNN type of Deep Neural Network (DNN). Three deep learning models (20), namely DNN, LSTM, and CNN, were stacked as an ensemble to achieve the most reliable results. The meta-learners used the forecasted values of these models as inputs to produce the final prediction of outbreaks. Ramchandani et al. (21) proposed a deep learning model to forecast the range of increase in COVID-19 and offer a novel approach for computing equidimensional representations of multivariate time series and multivariate spatial time series data.

Yoo et al. (22) examined the usefulness of applying a deep learning-based decision-tree classifier to distinguish COVID-19 from chest X-ray (CXR) images. This classifier consists of three binary decision trees, each trained by a deep learning model with a convolutional neural network based on the PyTorch framework. The first decision tree classifies the CXR images as either normal or abnormal. The second tree identifies the abnormal images bearing signs of tuberculosis. The final tree identifies the signs of COVID-19. Ozturk et al. (23) introduced a different approach for automated COVID-19 detection employing raw X-ray images of the chest. The developed system offers accurate diagnostics for binary (COVID-19 vs. No-Findings) and multi-class (COVID-19 vs. Pneumonia vs. No-Findings) classification. Panwar et al. (24) proposed a deep learning neural network-based approach, nCOVnet, as an alternative rapid screening system that identifies COVID-19 by analyzing patients' X-rays for visual markers present in COVID-19 patients' chest radiography images.

Hu et al. (25) proposed a weakly supervised deep learning strategy for recognizing and distinguishing COVID-19 infection from computed tomography (CT) images. This approach reduces the manual labeling requirements for CT images, reliably diagnoses infection, and distinguishes COVID-19 from non-COVID-19 cases. Deep learning-based analysis of chest CT has proven to be reliable and effective for detecting COVID-19. Mohammed et al. (26) proposed ResNext+, which offers an end-to-end weakly supervised strategy for COVID-19 detection that requires data labels at the volume level only and can provide slice-level predictions. A deep bidirectional long short-term memory network with a mixture density network (DBM) model was established by Pathak et al. (27), in which the Memetic Adaptive Differential Evolution (MADE) algorithm is used to fine-tune the hyperparameters of the DBM model.

### Deep Reinforcement Learning

Reinforcement Learning (RL) is a machine-learning paradigm in which agents learn effective strategies through trial-and-error interactions with their environment so as to obtain the greatest long-term reward. The Q-learning algorithm (28) is the most representative RL algorithm. Q-learning can learn a good policy without a model of the environment's dynamics by updating an action-value function called the Q function. When the state-action space is large and complex, deep neural networks can approximate the Q function, and the corresponding approach is called Deep Reinforcement Learning (DRL) (29). It has promising applications for rational decision-making in diverse fields, such as energy management, robotics, agriculture, and healthcare. DRL has successfully solved a wide range of complicated decision-making tasks that were previously beyond the reach of machines.

Wang et al. (30) offered an adaptive design that relies on graph embedding for state representation and on reinforcement learning during training. Based on two real-life datasets, the findings show that the scheme can effectively decrease the infection's epidemiological reproduction rate. This approach can aid in the early detection of COVID-19, whereby RL may represent an effective method to combat the spread of an outbreak. Dell'Aversana (23) combines multi-layer Artificial Neural Networks with a Reinforcement Learning architecture to allow software agents to learn behavior optimized for their environment. Iwendi et al. (31) utilized COVID-19 patients' geographical and demographic data to predict the severity of cases, recovery, and death. In (32), a semantic privacy framework that uses sensitive and semantically related terms to sanitize healthcare documents was proposed, and (33) uses deep learning to detect and sanitize social media comments. In (34), the authors discussed the concepts of an incentive approach for COVID-19 planning using Blockchain Technology. Deep learning and medical image processing for the Coronavirus (COVID-19) pandemic were analyzed by the authors in (35).

## Methodological Preliminaries

This section describes the COVID-19 prediction problem and gives information about the technical background of DL and RL used within this work.

### Problem Formulation

Time-series forecasting aims to use previously observed input sequences to forecast a fixed-length sequence of future values. In machine learning, a portion of the input time series, i.e., its lagged values, serves as the input features. The number of preceding time steps used is known as the window width (or size). Given a univariate time series

$$s = (s_{1}, s_{2}, s_{3}, \ldots, s_{t}),$$

the intention is to forecast the next $k$ values of the sequence, $\hat{y} = (\hat{y}_{1}, \hat{y}_{2}, \hat{y}_{3}, \ldots, \hat{y}_{k}) \cong (s_{t+1}, s_{t+2}, s_{t+3}, \ldots, s_{t+k})$, using the values of previous observations.
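The windowing described in this formulation can be sketched as follows. This is a minimal illustration; the helper name `make_windows` and the toy series are assumptions introduced here, not part of the original study.

```python
import numpy as np


def make_windows(series, window, horizon):
    """Turn a univariate series into supervised (X, y) pairs.

    Each row of X holds `window` lagged values; the matching row of y
    holds the next `horizon` values that the model should forecast.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - window - horizon + 1
    if n_samples <= 0:
        raise ValueError("series is too short for this window and horizon")
    X = np.stack([series[i:i + window] for i in range(n_samples)])
    y = np.stack([series[i + window:i + window + horizon]
                  for i in range(n_samples)])
    return X, y


# Toy series: a window of 3 past values is used to predict the next 2.
X, y = make_windows([1, 2, 3, 4, 5, 6, 7], window=3, horizon=2)
```

Here `window` plays the role of the window width and `horizon` corresponds to k in the formulation above.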

### Long Short-Term Memory Network

RNN is a deep learning technique that automatically selects appropriate features from the training samples and feeds the activation from the previous time step as input to the current time step through the network's self-connections. RNN is suited to sequential data processing and has outstanding potential in time-series forecasting because it stores extensive historical information in its internal state. Still, it suffers from the vanishing and exploding gradient problems, which lead to long training times or training that fails. In 1997, Hochreiter and Schmidhuber (36) devised the long short-term memory architecture to capture long-term dependencies through multiplicative gates that regulate the flow of information and memory cells in the recurrent hidden layer.

Figure 1 exhibits an LSTM memory cell's primary arrangement with two distinct components: the long-term state component (C_{t}) and the short-term state component (h_{t}).

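The structure of LSTM consists of the following gates: input, forget, control, and output gate. The input gate determines which information can be passed on to the cell and is defined as:

$$i_{t} = \sigma\left(W_{i}\,[h_{t-1}, x_{t}] + b_{i}\right) \tag{1}$$

The information to be ignored from the previous memory input is determined by the forget gate and is defined as:

$$f_{t} = \sigma\left(W_{f}\,[h_{t-1}, x_{t}] + b_{f}\right) \tag{2}$$

The cell update is controlled by the control gate, based on the following equations:

$$\tilde{C}_{t} = \tanh\left(W_{C}\,[h_{t-1}, x_{t}] + b_{C}\right) \tag{3}$$

$$C_{t} = f_{t} \odot C_{t-1} + i_{t} \odot \tilde{C}_{t} \tag{4}$$

The output gate determines how much of the cell state is exposed, and the previous hidden state (*h*_{t−1}) is updated to *h*_{t}, as given by:

$$o_{t} = \sigma\left(W_{o}\,[h_{t-1}, x_{t}] + b_{o}\right) \tag{5}$$

$$h_{t} = o_{t} \odot \tanh\left(C_{t}\right) \tag{6}$$

In the above equations, $x_{t}$ is the input at time step $t$, $[h_{t-1}, x_{t}]$ is the concatenation of the previous hidden state and the current input, and $\odot$ denotes element-wise multiplication. *W* and *b* represent the weight matrix and bias vector, respectively; tanh is used to scale the values in the range −1 to 1, and σ denotes the standard logistic sigmoid function. The variables *i, f, o*, and *C* are the input gate, forget gate, output gate, and cell activation vector.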
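As an illustration of how the gates described in this subsection interact, the following sketch implements one step of a standard LSTM cell in NumPy. It assumes the common formulation in which each gate acts on the concatenation of the previous hidden state and the current input; the names `lstm_step`, `x_t`, and the dictionary keys are hypothetical, and the random weights are placeholders rather than trained parameters.

```python
import numpy as np


def sigmoid(z):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell.

    W and b are dicts keyed by 'i', 'f', 'c', 'o' (input, forget,
    control/candidate, output); each W[k] has shape
    (hidden, hidden + input_size).
    """
    z = np.concatenate([h_prev, x_t])         # [h_{t-1}, x_t]
    i_t = sigmoid(W["i"] @ z + b["i"])        # input gate
    f_t = sigmoid(W["f"] @ z + b["f"])        # forget gate
    c_hat = np.tanh(W["c"] @ z + b["c"])      # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat          # updated long-term state
    o_t = sigmoid(W["o"] @ z + b["o"])        # output gate
    h_t = o_t * np.tanh(c_t)                  # updated short-term state
    return h_t, c_t


# Placeholder weights and a short toy input sequence (assumed values).
rng = np.random.default_rng(0)
hidden, input_size = 4, 1
W = {k: rng.normal(scale=0.1, size=(hidden, hidden + input_size)) for k in "ifco"}
b = {k: np.zeros(hidden) for k in "ifco"}
h, c = np.zeros(hidden), np.zeros(hidden)
for value in [0.1, 0.2, 0.3]:
    h, c = lstm_step(np.array([value]), h, c, W, b)
```

The hidden state `h` stays within (−1, 1) because it is the product of a sigmoid output and a tanh output.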

### Reinforcement Learning

Reinforcement learning (RL) is an artificial intelligence approach, grounded in dynamic programming, that trains algorithms through a reward-and-penalty strategy. A Markov decision process is used to formalize reinforcement learning problems. The RL agent learns by interacting with its environment. The agent obtains rewards for appropriate actions and penalties for incorrect ones. Without human intervention, the agent learns by itself by maximizing its rewards and minimizing its penalties. The interaction between the agent and the environment in RL is shown in Figure 2.

An agent in a state (S) executes an action (A). The agent collects a reward R(S, A) for acting and moves into a new state. A policy (π) is any mapping from states to actions that decides the agent's action in every state. The central goal of an agent is to obtain an optimal policy π^{*} for which the total discounted reward is maximized. The optimal policy π^{*} is defined in equation (7):

$$\pi^{*} = \arg\max_{\pi} \; \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^{t} R(S_{t}, A_{t}) \,\middle|\, \pi\right] \tag{7}$$

where $\gamma \in [0, 1)$ is the discount factor and $S_{t}$ and $A_{t}$ are the state and action at step $t$.

The optimal value function follows from the optimal policy; it is defined through the rewards an agent expects to collect from each state onward. This optimal value function is expressed in equation (8):

$$V^{*}(S) = \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^{t} R(S_{t}, A_{t}) \,\middle|\, \pi^{*}, S_{0} = S\right] \tag{8}$$

Thus, a reinforcement learning agent learns through experiences with the environment. Using dynamic programming, agents maximize their rewards by estimating the optimal policy and the optimal value function through the Bellman equation.
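The dynamic-programming view can be illustrated with value iteration, which repeatedly applies the Bellman optimality backup until the value function stops changing. The sketch below uses a hypothetical two-state, two-action problem; the transition tensor `P`, reward matrix `R`, and discount factor are assumptions chosen only so the result is easy to check by hand.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Compute optimal state values and a greedy policy.

    P[a, s, s2] : probability of moving from s to s2 under action a
    R[s, a]     : immediate reward for taking action a in state s
    """
    n_states, _ = R.shape
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman backup: Q[s, a] = R[s, a] + gamma * sum_s2 P[a, s, s2] * V[s2]
        Q = R + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return V, Q.argmax(axis=1)


# Toy problem: action 0 stays in place, action 1 switches state.
# Staying in state 1 pays a reward of 1; everything else pays 0.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],     # action 0: stay
              [[0.0, 1.0], [1.0, 0.0]]])    # action 1: switch
R = np.array([[0.0, 0.0],
              [1.0, 0.0]])
V, policy = value_iteration(P, R, gamma=0.9)
```

In this toy case the best behavior is to move to state 1 and remain there, so the values approach 1 / (1 − 0.9) = 10 for state 1 and 0.9 × 10 = 9 for state 0.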

## Optimized Prediction of COVID-19

The presented research investigates predictions for the novel COVID-19 using the deep learning and reinforcement learning techniques discussed above. COVID-19 has proved to be a serious threat to people of all ages all over the world. It has resulted in tens of thousands of deaths so far, with the mortality rate continuing to rise regularly. This research sought to estimate the future death rate, the number of infected individuals reported every day, and the number of recovered cases over the following 15 days to support the tracking of the pandemic. The forecasting was done using a Deep Learning framework that was designed specifically for this study.

AI is a growing field with many different intelligent applications, such as Machine Learning (ML) and Deep Learning (DL). ML refers to the capability to learn and infer meaningful patterns from data; the success of ML-based algorithms depends heavily on the chosen features. Meanwhile, by building on simpler representations, DL can model complex systems. DL possesses a pair of critical features (14): (1) the capacity to learn the right representations and (2) the ability to help the machine explore data by sequentially using several layers to learn more meaningful representations.

This paper proposes an optimized prediction of COVID-19 employing deep reinforcement learning (DRL). First, the Modified LSTM was used to forecast the numbers of confirmed cases and death cases. Next, the forecasted outcomes were optimized by employing DRL based on the symptoms. Figure 3 shows the proposed workflow.

LSTM is a variant of the RNN structure in which a memory cell is added to the hidden layer to retain information from the time-series data. Data is transmitted through several controllable gates between different cells of the hidden layer, which allows control over how much of the previous and current data is remembered or forgotten. Two LSTM gates are designed to control the status of the memory cell. One is the forget gate, which determines how much “memory” is preserved from the cell's previous time step; the second is the input gate, which specifies how much information from the current time step is stored in the cell state and regulates the fusion ratio of “old” data and “present” input. Lastly, LSTM's output gate is designed to control how much of the cell state information is output. The structure of LSTM can be found in Figure 1.

Herein, two types of activation function are considered for the LSTM: linear and non-linear. The traditional LSTM uses the non-linear tanh function. The activation function that yields the best results is selected.

Deep learning has driven progress in RL; the use of DL algorithms within RL defines the field of deep reinforcement learning (DRL). Deep learning allows RL to scale to previously intractable decision-making problems, i.e., settings with high-dimensional state and action spaces. Deep reinforcement learning employs a deep neural network to approximate any of the reinforcement learning functions, including the value function, Q function, transition model, and reward function. Q-learning is an RL method that decides which action an agent should take based on an action-value function, which determines the value of being in a particular state and taking a specific action in that state.

Q-learning is one of the most significant advances in reinforcement learning: an off-policy temporal-difference control algorithm. Q-learning estimates a state-action value function for a target policy that selects the action with the highest value. The Q function takes the current state (S) and an action (A) as input and returns the expected reward for that action in that state. The Q function is initialized with arbitrary fixed values before the agent explores the environment.

In this work, the following symptoms were considered to predict COVID-19 cases: fever, tiredness, dry cough, difficulty breathing, sore throat, pains, nasal congestion, runny nose, and diarrhea. Groups of symptoms are considered states, and an action is taken based on the states. Here, the action is taken based on the increments in the confirmed and death cases. A reward is received when the forecasts of confirmed cases and death cases are accurate. The action-value function is defined as:

$$Q(s, a) \approx q(s, a; \theta)$$

where $q(s, a; \theta)$ represents the neural network approximation and θ is the parameter vector representing the network's edge weights. The input to the neural network is a state $s$, and the outputs are the approximate *q* values for the individual actions, $Q = \{q(s, a; \theta) \mid a \in A\}$.

The network is trained by minimizing the prediction error of *q*(*s, a*; θ). The action taken by the DRL agent at time t is *a*_{t} = argmax_{a}*q*(*s*_{t}, *a*; θ), where the values *q*(*s*_{t}, *a*; θ) for the various actions are supplied by the outputs of the Q neural network (QNN). If the resulting reward is r_{t+1} and the state moves to s_{t+1}, then (s_{t},a_{t},r_{t+1},s_{t+1}) forms an “experience sample” that can be used to train the network. For training, the prediction error of the network for the particular experience sample (s_{t},a_{t},r_{t+1},s_{t+1}) is defined as:

$$L_{\theta}(s_{t}, a_{t}, r_{t+1}, s_{t+1}) = \left(y_{r_{t+1}, s_{t+1}} - q(s_{t}, a_{t}; \theta)\right)^{2}$$

where θ denotes the current weights and $y_{r_{t+1}, s_{t+1}}$ is the target output, which is defined as:

$$y_{r_{t+1}, s_{t+1}} = r_{t+1} + \gamma \max_{a'} q(s_{t+1}, a'; \theta)$$

where $\gamma$ is the discount factor.
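The training signal for one experience sample can be sketched as below. For brevity, the Q network is replaced by a hypothetical linear map `theta @ state`; the two-feature state encoding, the weights, the reward, and the discount factor `gamma` are all assumptions made for illustration and do not reproduce the network used in the study.

```python
import numpy as np


def greedy_action(q_values):
    """Pick the action with the largest estimated q value."""
    return int(np.argmax(q_values))


def td_target(r_next, q_next, gamma=0.9):
    """Target output: reward plus discounted best q value of the next state."""
    return r_next + gamma * np.max(q_next)


def squared_td_error(q_values, action, target):
    """Prediction error of the network for one experience sample."""
    return (target - q_values[action]) ** 2


# Hypothetical linear "network": one row of weights per action.
theta = np.array([[0.5, 0.0],
                  [0.0, 1.0]])
s_t = np.array([1.0, 0.0])       # assumed encoding of the current symptom group
s_next = np.array([0.0, 1.0])    # assumed encoding of the next state
r_next = 1.0                     # reward for an accurate forecast (assumed)

q_t = theta @ s_t                               # q values for every action
a_t = greedy_action(q_t)                        # action chosen by the agent
y = td_target(r_next, theta @ s_next, gamma=0.9)
loss = squared_td_error(q_t, a_t, y)
```

In practice, this loss would be minimized with respect to the network weights by gradient descent over many stored experience samples.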

## Results and Discussions

This section presents the experimental setup and the evaluation of the proposed prediction model. Data on COVID-19 cases in India from Kaggle were used with a Java framework to validate and evaluate the proposed model.

The dataset used in this study comprises summary tables of daily time series, including the number of reported cases and deaths for each day since the pandemic began. Data from January 30, 2020 (when the first case of COVID-19 was registered in India) to August 16, 2020, were analyzed, with 75% of the data used for training and 25% for prediction and validation purposes. Table 3 shows the sample data of confirmed and death cases daily and weekly.
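For time-series data, a split such as the 75%/25% division described above is usually taken in chronological order so that the model is never evaluated on days that precede its training data. The sketch below assumes such an ordered split; the helper `chronological_split` and the placeholder values are hypothetical and do not reproduce the Kaggle data.

```python
def chronological_split(records, train_fraction=0.75):
    """Split an ordered sequence into an earlier training part and a later test part."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


# Placeholder daily counts for 200 days (January 30 to August 16, 2020).
daily_counts = list(range(200))
train, test = chronological_split(daily_counts, train_fraction=0.75)
```

With 200 daily observations this leaves 150 days for training and the final 50 days for validation.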

Figure 4 shows the comparison of confirmed cases for the original and predicted values for 200 days. A high correlation between the actual and predicted cases can be seen.

Figure 5 shows the comparison of death cases for original and predicted values. It can be seen that the death rate slightly increased on specific days.

Figure 6 shows the estimated confirmed cases for the next 15 days. Figure 7 displays the estimated death cases for the next 15 days.

The performance of the proposed algorithm is evaluated in terms of the following metrics: Mean Absolute Error (MAE), Mean Square Error (MSE), and Root Mean Square Error (RMSE). Figure 8 shows the comparison of evaluation metrics. Based on the outcomes, the proposed MLSTM-DRL has the lowest error rate compared to the other models, while the ML model, Logistic Regression (LR), obtained the highest error rate. Deep learning methods are used to develop a system for predicting future COVID-19 cases. The study performs predictions on confirmed and death cases. The situation is becoming more troubling for the world day by day as deaths and reported cases are rising. The number of individuals affected by the COVID-19 pandemic in various countries is not well known. This analysis attempts to estimate the number of newly confirmed cases and deaths over the next 15 days.

The number of individuals affected by COVID-19 increases daily, while the death rate fluctuates. The correlation between the original and predicted data is 0.999 for both confirmed and death cases. The future prediction results can assist governments in planning lockdowns or other medical decisions.
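The error metrics and the correlation reported above can be computed as in the following sketch. The function name `forecast_errors` and the sample numbers are illustrative assumptions and are not the study's data.

```python
import numpy as np


def forecast_errors(actual, predicted):
    """Return MAE, MSE, RMSE, and the Pearson correlation of two series."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted
    mse = float(np.mean(err ** 2))
    return {
        "mae": float(np.mean(np.abs(err))),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "correlation": float(np.corrcoef(actual, predicted)[0, 1]),
    }


# Made-up actual and predicted case counts for four days.
metrics = forecast_errors([100, 200, 300, 400], [110, 190, 310, 390])
```

MAE and RMSE are expressed in the same unit as the case counts, whereas MSE is in squared units and penalizes large deviations more strongly.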

## Conclusion

COVID-19 is an ongoing pandemic that has significantly endangered the health of people worldwide within a short period. A DL-based prediction method for forecasting the risk of COVID-19 has been proposed in this work. The framework analyses the actual day-to-day data and uses deep learning algorithms to make predictions for upcoming days. This study determines the best activation function for M-LSTM and applies a deep reinforcement learning algorithm to optimize the prediction results. The proposed approach was compared with widely used existing algorithms such as LR and LSTM. The findings of this work indicate that the DL method can efficiently predict future cases of COVID-19. Overall, it can be concluded that the model's predictions are on par with the status of the virus; this may help understand and curb the spread of the virus.

Therefore, this study's forecast may be of great help in taking timely actions and making decisions to tackle the COVID-19 crisis. In the future, we propose using a semi-supervised hybrid design to identify COVID-19 and using social media platforms to help prevent further spread. It is also planned to publish the predicted results as a dashboard through Google Data Cloud.

## Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding authors.

## Author Contributions

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

## Funding

Open Access Funding by the Publication Fund of the TU Dresden.

## Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

## Publisher's Note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

## References

1. Velásquez RM, Lara JV. Forecast and evaluation of COVID-19 spreading in USA with reduced-space Gaussian process regression. *Chaos Solitons Fractals*. (2020) 136:109924. doi: 10.1016/j.chaos.2020.109924

2. Punn NS, Sonbhadra SK, Agarwal S. COVID-19 epidemic analysis using machine learning and deep learning algorithms. *medRxiv*. (2020) 29:105340. doi: 10.1101/2020.04.08.20057679

3. *CoronaBoard: COVID-19 Dashboard*. Available online at: https://coronaboard.kr/ (accessed July 22, 2020).

4. Rustam F, Reshi AA, Mehmood A, Ullah S, On BW, Aslam W, et al. COVID-19 future forecasting using supervised machine learning models. *IEEE Access*. (2020) 8:101489–99. doi: 10.1109/ACCESS.2020.2997311

5. Yang Z, Zeng Z, Wang K, Wong SS, Liang W, Zanin M, et al. Modified SEIR and AI prediction of the epidemics trend of COVID-19 in China under public health interventions. *J Thorac Dis*. (2020) 12:165–74. doi: 10.21037/jtd.2020.02.64

6. Togaçar M, Ergen B, Cömert Z. Application of breast cancer diagnosis based on a combination of convolutional neural networks, ridge regression and linear discriminant analysis using invasive breast cancer images processed with autoencoders. *Med Hypotheses*. (2020) 135:109503. doi: 10.1016/j.mehy.2019.109503

7. Jaiswal AK, Tiwari P, Kumar S, Gupta D, Khanna A, Rodrigues JJ. Identifying pneumonia in chest X-rays: a deep learning approach. *Measurement*. (2019) 145:511–8. doi: 10.1016/j.measurement.2019.05.076

8. Baltruschat IM, Nickisch H, Grass M, Knopp T, Saalbach A. Comparison of deep learning approaches for multi-label chest X-ray classification. *Sci Rep*. (2019) 9:6381. doi: 10.1038/s41598-019-42294-8

9. Salgotra R, Gandomi M, Gandomi AH. Time series analysis and forecast of the COVID-19 pandemic in India using genetic programming. *Chaos Solitons Fractals*. (2020) 138:109945. doi: 10.1016/j.chaos.2020.109945

10. Jia L, Li K, Jiang Y, Guo X. Prediction and analysis of coronavirus disease 2019. *arXiv [Preprint]* arXiv:2003.05447. (2020).

11. Petropoulos F, Makridakis S. Forecasting the novel coronavirus COVID-19. *PLoS ONE*. (2020) 15:e0231236. doi: 10.1371/journal.pone.0231236

12. Malavika B, Marimuthu S, Joy M, Nadaraj A, Asirvatham ES, Jeyaseelan L. Forecasting COVID-19 epidemic in India and high incidence states using SIR and logistic growth models. *Clin Epidemiol Glob Health*. (2021) 9:26–33. doi: 10.1016/j.cegh.2020.06.006

13. Pal R, Sekh AA, Kar S, Prasad DK. Neural network based country wise risk prediction of COVID-19. *Appl Sci*. (2020) 10:6448. doi: 10.3390/app10186448

14. Hu F, Jiang J, Yin P. Prediction of potential commercially inhibitors against SARS-CoV-2 by multi-task deep model. *arXiv [Preprint]* arXiv:2003.00728. (2020).

15. Kavadi DP, Patan R, Ramachandran M, Gandomi AH. Partial derivative nonlinear global pandemic machine learning prediction of covid 19. *Chaos Solitons Fractals*. (2020) 139:110056. doi: 10.1016/j.chaos.2020.110056

16. Dehesh T, Mardani-Fard HA, Dehesh P. Forecasting of covid-19 confirmed cases in different countries with ARIMA models. *medRxiv*. (2020).

17. Ngabo D, Dong W, Ibeke E, Iwendi C, Masabo E. Tackling pandemics in smart cities using machine learning architecture. *Math Biosci Eng*. (2021).

18. Arora P, Kumar H, Panigrahi BK. Prediction and analysis of COVID-19 positive cases using deep learning models: a descriptive case study of India. *Chaos Solitons Fractals*. (2020) 139:110017. doi: 10.1016/j.chaos.2020.110017

19. Huang CJ, Chen YH, Ma Y, Kuo PH. Multiple-input deep convolutional neural network model for covid-19 forecasting in china. *medRxiv*. (2020).

20. Yahia NB, Kandara MD, Saoud NBB. Deep ensemble learning method to forecast COVID-19 outbreak. (2020).

21. Ramchandani A, Fan C, Mostafavi A. Deepcovidnet: an interpretable deep learning model for predictive surveillance of covid-19 using heterogeneous features and their interactions. *IEEE Access*. (2020) 8:159915–30. doi: 10.1109/ACCESS.2020.3019989

22. Yoo SH, Geng H, Chiu TL, Yu SK, Cho DC, Heo J, et al. Deep learning-based decision-tree classifier for COVID-19 diagnosis from chest X-ray imaging. *Front Med*. (2020) 7:427. doi: 10.3389/fmed.2020.00427

23. Ozturk T, Talo M, Yildirim EA, Baloglu UB, Yildirim O, Acharya UR. Automated detection of COVID-19 cases using deep neural networks with X-ray images. *Comput Biol Med*. (2020) 121:103792. doi: 10.1016/j.compbiomed.2020.103792

24. Panwar H, Gupta PK, Siddiqui MK, Morales-Menendez R, Singh V. Application of deep learning for fast detection of COVID-19 in X-Rays using nCOVnet. *Chaos Solitons Fractals*. (2020) 138:109944. doi: 10.1016/j.chaos.2020.109944

25. Hu S, Gao Y, Niu Z, Jiang Y, Li L, Xiao X, et al. Weakly supervised deep learning for covid-19 infection detection and classification from ct images. *IEEE Access*. (2020) 8:118869–83. doi: 10.1109/ACCESS.2020.3005510

26. Mohammed AK, Wang C, Zhao M, Ullah M, Naseem R, Wang H, et al. Semi-supervised network for detection of COVID-19 in chest CT scans. *IEEE Access*. (2020) 8:155987–6000. doi: 10.1109/ACCESS.2020.3018498

27. Pathak Y, Shukla PK, Arya KV. Deep bidirectional classification model for COVID-19 disease infected patients. *IEEE/ACM Trans Comput Biol Bioinformat*. (2020) 18:1234–41. doi: 10.1109/TCBB.2020.3009859

28. Dai H, Khalil EB, Zhang Y, Dilkina B, Song L. Learning combinatorial optimization algorithms over graphs. *arXiv [Preprint]* arXiv:1704.01665. (2017).

29. Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, et al. Human-level control through deep reinforcement learning. *Nature*. (2015) 518:529–33. doi: 10.1038/nature14236

30. Wang B, Sun Y, Duong TQ, Nguyen LD, Hanzo L. Risk-aware identification of highly suspected covid-19 cases in social iot: a joint graph theory and reinforcement learning approach. *IEEE Access*. (2020) 8:115655–61. doi: 10.1109/ACCESS.2020.3003750

31. Iwendi C, Bashir AK, Peshkar A, Sujatha R, Chatterjee JM, Pasupuleti S, et al. COVID-19 patient health prediction using boosted random forest algorithm. *Front Public Health*. (2020) 8:357. doi: 10.3389/fpubh.2020.00357

32. Iwendi C, Moqurrab SA, Anjum A, Khan S, Mohan S, Srivastava G. N-Sanitization: a semantic privacy-preserving framework for unstructured medical datasets. *Comput Commun*. (2020) 161:160–71. doi: 10.1016/j.comcom.2020.07.032

33. Iwendi C, Srivastava G, Khan S, Reddy Maddikunta PK. Cyberbullying detection solutions based on deep learning architectures. *Multimedia Syst*. (2020). doi: 10.1007/s00530-020-00701-5. [Epub ahead of print].

34. Manoj M, Srivastava G, Somayaji SRK, Gadekallu TR, Maddikunta PKR, Bhattacharya S. An incentive based approach for COVID-19 planning using blockchain technology. In: *2020 IEEE Globecom Workshops (GC Wkshps)*. IEEE (2020). p. 1–6. doi: 10.1109/GCWkshps50303.2020.9367469

35. Bhattacharya S, Maddikunta PKR, Pham QV, Gadekallu TR, Chowdhary CL, Alazab M, et al. Deep learning and medical image processing for coronavirus (COVID-19) pandemic: a survey. *Sustain Cities Soc*. (2021) 65:102589. doi: 10.1016/j.scs.2020.102589

Keywords: COVID-19, deep learning, LSTM, RNN, prediction, reinforcement learning

Citation: Kumar RL, Khan F, Din S, Band SS, Mosavi A and Ibeke E (2021) Recurrent Neural Network and Reinforcement Learning Model for COVID-19 Prediction. *Front. Public Health* 9:744100. doi: 10.3389/fpubh.2021.744100

Received: 19 July 2021; Accepted: 02 September 2021; Published: 04 October 2021.

Edited by:

Thippa Reddy Gadekallu, VIT University, India

Reviewed by:

Joseph Henry Arinze Anajemba, Hohai University, China

Mohammad Al Rawajbeh, Al-Zaytoonah University of Jordan, Jordan

Agbotiname Lucky Imoize, University of Lagos, Nigeria

Sweta Bhattacharya, VIT University, India

Copyright © 2021 Kumar, Khan, Din, Band, Mosavi and Ibeke. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Shahab S. Band, shamshirbands@yuntech.edu.tw; Amir Mosavi, amir.mosavi@mailbox.tu-dresden.de