RiskEstim: A Software Package to Quantify COVID-19 Importation Risk

We present an R package developed to quantify coronavirus disease 2019 (COVID-19) importation risk. Quantifying and visualizing the importation risk of COVID-19 from inbound travelers is urgently needed to trigger public health responses, especially in the early stages of the COVID-19 pandemic and during the emergence of new SARS-CoV-2 variants. We provide a general modeling framework to estimate COVID-19 importation risk using the estimated pre-symptomatic prevalence of infection and air traffic data from multiple origin places. We use Hong Kong as a case study to illustrate how our modeling framework can estimate the COVID-19 importation risk into Hong Kong from cities in Mainland China in real time. This R package can be used as a complementary component of a pandemic surveillance system to monitor spread in the next pandemic.


INTRODUCTION
The ongoing global pandemic of COVID-19 caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused incredible global disruption and challenges, in addition to the substantial health impact [1]. As of December 12, 2021, more than 269 million confirmed cases and 5.3 million deaths were reported worldwide [2].
Local outbreaks were often associated with the importation of infections. Quantifying and visualizing the importation risk of COVID-19 from inbound travelers is important for public health responses, especially in the early stage of an epidemic wave [3,4]. For example, some studies showed that border control measures, such as flight restrictions and quarantine for inbound travelers from high-risk places (e.g., based on the number of new daily cases [5]), might have delayed epidemics in the destination countries [6-8]. In addition, assessment of the COVID-19 importation risk is needed for places where a high level of population immunity to COVID-19 has not been achieved in the target populations [9] or where the government is considering relaxing border control measures [10-12]. Here, we present the latest version of the R package RiskEstim, developed to quantify COVID-19 importation risk. We first outline the general modeling framework of the R package, which estimates COVID-19 importation risk using daily pre-symptomatic prevalence data from multiple origin locations and air traffic data.
Hong Kong raised its alert against COVID-19 and began screening travelers from Mainland China at the very beginning of the pandemic [13]. Owing to its sound public health infrastructure and timely response, the information on reported cases in Hong Kong was highly reliable, and most of the imported cases at that time originated from Mainland China [14]. Given these considerations, we used Hong Kong as a case study to illustrate how our modeling framework can estimate the multi-origin COVID-19 importation risk in real time.

The Modeling Framework for Estimating Importation Risk in RiskEstim
To quantify the importation risk of COVID-19 from a place of origin to the destination, we first estimated the daily pre-symptomatic prevalence of COVID-19 in each origin place. We then calculated the number of potential imported cases using the estimated daily pre-symptomatic prevalence in the origin places and the daily origin-destination air traffic data. Next, we estimated the probability of importing at least one case as the indicator of importation risk, which we used to rank the origin places and to visualize the risk maps. The modeling framework is shown in Figure 1.

User Input Data in RiskEstim
Using Hong Kong as an example, we applied the R package to estimate the importation risk from 15 high-risk cities in Mainland China into Hong Kong in early 2020 [15]. Daily confirmed COVID-19 cases reported by the Chinese Center for Disease Control and Prevention (China CDC) from January 1, 2020, to February 29, 2020, were obtained for the analysis [16-18]. Hubei Province changed its case definition on February 12, 2020, which yielded a dramatic increase in the number of reported cases on February 12 and 13, 2020 (14840 and 4823, respectively) [19]. To reduce the reporting bias due to the different case definitions for COVID-19 during the study period [20], we assumed that the number of reported cases in Wuhan on February 12 was the same as that on February 11, and that the number on February 13 was the same as that on February 14.
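The adjustment for the case-definition change can be illustrated with a minimal Python sketch. This is not code from RiskEstim (which is written in R); the function name `smooth_definition_change` and the counts in `example` are hypothetical and serve only to show the substitution rule described above.

```python
from datetime import date


def smooth_definition_change(daily_cases):
    """Return a copy of the series in which the February 12 count is replaced
    by the February 11 count and the February 13 count by the February 14 count.
    """
    adjusted = dict(daily_cases)  # copy, so the input series is left untouched
    adjusted[date(2020, 2, 12)] = daily_cases[date(2020, 2, 11)]
    adjusted[date(2020, 2, 13)] = daily_cases[date(2020, 2, 14)]
    return adjusted


# Hypothetical counts for illustration only (not the reported Wuhan series).
example = {
    date(2020, 2, 11): 100,
    date(2020, 2, 12): 900,
    date(2020, 2, 13): 400,
    date(2020, 2, 14): 120,
}
adjusted = smooth_definition_change(example)
```

Working on a copy keeps the raw series available, which may be useful if a different adjustment rule is later preferred.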

Estimating Pre-Symptomatic Prevalence of COVID-19 in Origin Places
The daily prevalence of pre-symptomatic infections can be estimated with the package from the daily reported cases in the origin place(s) supplied by the user. Let $\omega_t^o$ be the number of reported cases in the origin place O on day t. Then, on average, the cases reported on day t developed symptoms on day $t - T_{\mathrm{rep}}$ and were infected on day $t - T_{\mathrm{rep}} - T_{\mathrm{inc}}$, where $T_{\mathrm{rep}}$ and $T_{\mathrm{inc}}$ are the mean reporting delay and the median incubation period in days, respectively. Using this forward method, we estimated the daily numbers of infected individuals in the origin places.
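The date shift described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the package's implementation: the function name is hypothetical, both delays are assumed to be rounded to whole days, and the values 5 and 5 used in the example are placeholders, not the parameter values of Table 1.

```python
from datetime import date, timedelta


def infection_dates_from_reports(reported, t_rep, t_inc):
    """Shift each report date back by t_rep + t_inc days.

    reported: dict mapping report date -> number of reported cases
    t_rep:    mean reporting delay in whole days (assumed rounded)
    t_inc:    median incubation period in whole days (assumed rounded)
    Returns a dict mapping estimated infection date -> number of infections.
    """
    lag = timedelta(days=t_rep + t_inc)
    infections = {}
    for report_day, count in reported.items():
        infection_day = report_day - lag
        infections[infection_day] = infections.get(infection_day, 0) + count
    return infections


# Hypothetical reported counts and placeholder delays, for illustration only.
reported = {date(2020, 1, 20): 10, date(2020, 1, 21): 15}
infections = infection_dates_from_reports(reported, t_rep=5, t_inc=5)
```

With a combined lag of 10 days, the 10 cases reported on January 20 are assigned to infections on January 10, and the 15 cases reported on January 21 to January 11.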
In our case study of Hong Kong, we calculated the daily prevalence of pre-symptomatic COVID-19 in multiple origin places, including cities in Hubei Province and in other provinces, and the estimates were consistent with the imported cases from these places during the early stage of the epidemic in Hong Kong [14]. Let $y_t^o$ denote the place-specific pre-symptomatic prevalence of the place O on day t, and let H denote the cities in Hubei Province. We used the median incubation period to denote the period during which transmission would occur from infected cases. The place-specific pre-symptomatic prevalence is given by:

where μ is the ascertainment rate ratio, representing the ascertainment rate of symptomatic cases in all non-Hubei provinces relative to Hubei Province, which reflects the ratio of the probability that non-Hubei provinces report a symptomatic case to the probability that Hubei Province does [18]. $I_d^o$ denotes the incidence of SARS-CoV-2 infection in an origin place on day d. The parameters are summarized in Table 1.
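One way to read the role of the incubation period above is as a trailing window over which daily incidence is accumulated. The Python sketch below illustrates only that windowed sum. It is an assumption about the general shape of the calculation, not the paper's equation: the exact window boundaries are a guess, the adjustment by the ascertainment rate ratio μ is deliberately omitted because its form is not recoverable from the text, and the function name and input values are hypothetical.

```python
def presymptomatic_prevalence(incidence, t_inc):
    """Sum daily incidence over a trailing window of t_inc days.

    incidence: list of daily new infections, indexed by day
    t_inc:     incubation period in whole days (window length)
    Entry t of the result is the sum of incidence on days
    max(0, t - t_inc + 1) through t, i.e. a count of people infected
    recently enough to be assumed still pre-symptomatic on day t.
    """
    prevalence = []
    for t in range(len(incidence)):
        start = max(0, t - t_inc + 1)
        prevalence.append(sum(incidence[start:t + 1]))
    return prevalence


# Hypothetical incidence series, for illustration only.
prevalence = presymptomatic_prevalence([1, 2, 3, 4, 5], t_inc=3)
# prevalence == [1, 3, 6, 9, 12]
```

The result is a count of infected individuals; expressing it as a proportion would additionally require the population of the origin place.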

Estimating the Importation Risk
The place-specific importation risk of the destination was estimated based on (1) the daily pre-symptomatic prevalence of COVID-19 in origin places and (2) data on air passenger movements by place of origin and destination. Let $\Gamma_t^{o,d}$ be the imported cases from the origin place O to the destination d on day t:

where $M_t^{o,d}$ represents the number of air passengers from origin place O to destination d on day t, and α is the scaling factor adjusting for the impact on the force of importation from varied surveillance efficiency on COVID-19 in different places [18]. With the assumption that the number of imported cases per day followed the Poisson distribution, we evaluated the 95% confidence interval (CI) of the imported cases based on 100 simulations. Following the study of estimating the probability of cases imported [23,24], we estimated the cumulative importation risk $\Phi_t^{o,d}$, which denotes the cumulative probability of importing at least one case from the origin place O to the destination d during the period T between $t_a$ and $t_b$, given by:

Figure 1 | An illustration of the proposed framework to estimate the importation risk. This modeling framework includes three main modules: (1) the module of user input data is used to store data submitted by the users, such as daily reported cases and air travel flow; (2) the module of estimating the importation risk is used to estimate the importation risk of the target place based on the input data; (3) the visualization module is used to visualize output, such as the risk maps of the origin places, which could bring the importation risk to the destination.

Frontiers in Physics | www.frontiersin.org January 2022 | Volume 10 | Article 835992
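The importation-risk calculation described in this section can be sketched in Python under explicit assumptions. The expected number of imported cases is assumed here to take the multiplicative form α × prevalence × passengers, with prevalence expressed as the infected fraction of the origin population; the definitions suggest this form, but the text does not preserve the equation. The probability of at least one import then follows from the Poisson assumption as 1 − exp(−Σ Γ). The percentile interval is one possible reading of the "100 simulations" step. All function names and input values are hypothetical, and none of this is taken from the RiskEstim codebase.

```python
import math
import random


def expected_imports(prevalence, passengers, alpha):
    """Expected imported cases per day: alpha * infected fraction * passengers
    (assumed multiplicative form)."""
    return [alpha * y * m for y, m in zip(prevalence, passengers)]


def prob_at_least_one(expected):
    """P(at least one import over the period) when daily imports are
    independent Poisson variables with the given means."""
    return 1.0 - math.exp(-sum(expected))


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's algorithm (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_total_imports(expected, n_sims=100, seed=0):
    """Simulate the total number of imports n_sims times and return the
    empirical 2.5th and 97.5th percentiles of the totals."""
    rng = random.Random(seed)
    totals = sorted(
        sum(sample_poisson(lam, rng) for lam in expected) for _ in range(n_sims)
    )
    lower = totals[int(0.025 * (n_sims - 1))]
    upper = totals[int(math.ceil(0.975 * (n_sims - 1)))]
    return lower, upper


# Hypothetical two-day example: infected fractions and passenger counts.
lam = expected_imports([1e-4, 2e-4], [1000, 1500], alpha=1.0)
risk = prob_at_least_one(lam)
lower, upper = simulate_total_imports(lam)
```

Origin places could then be ranked by `risk`, which is the role the probability of importing at least one case plays in the framework.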

RESULTS
In our case study, we used daily reported cases of COVID-19 from 15 Mainland China cities, which were previously identified by Lai et al. [15] as high-risk cities for COVID-19 importation during January 2020, to estimate the daily pre-symptomatic prevalence in these cities (Figures 2A,B). Based on the daily pre-symptomatic prevalence in these cities and the data on air travel flows between these 15 higher-risk Mainland China cities and Hong Kong (Figure 2C), we estimated the importation risk of Hong Kong (Figures 2D-F). The estimated number of cases imported from the 15 higher-risk Mainland China cities into Hong Kong was 7.6 (95% CI: 5.0-12.1), which was consistent with the 7 reported cases in Hong Kong originating from Mainland China before the Wuhan travel ban (January 23, 2020) [14,25]. The city-level estimates indicated that Wuhan exported the highest number of cases (5.8, 95% CI: 4.6-7.1) into Hong Kong, followed by Shanghai (0.5, 95% CI: 0.2-0.9) and Beijing (0.5, 95% CI: 0.2-0.9), during the study period.

DISCUSSION AND CONCLUSION
This study aims to provide a general modeling framework to estimate COVID-19 importation risk. We illustrate the feasibility and reliability of the proposed framework with a case study that estimates the importation risk of COVID-19 to Hong Kong from multiple origin places using the pre-symptomatic prevalence of infection and air traffic data. Notably, the method accommodates origin places where multiple variants circulate by estimating the importation risk of each variant separately and then aggregating the risks at the destination, given the availability of prevalence data and human movement data. The method implemented in this study is taken from a previous study [18], and its reliability is demonstrated in the case study of Hong Kong; proposing a technically innovative method with competitive accuracy is not our major focus. At present, only one main method is supported in our modeling framework, but it can be extended with other well-designed and finely calibrated methods in the future, such as those in [24,26-29]. Analyses of the correlation between importation risk and population movement data, preparedness, and vulnerability at the destination will be added in future work. The R package RiskEstim provides a general modeling framework to estimate the importation risk of infectious disease based on epidemiological and human movement data during an epidemic. The R package can be used as a complementary approach to a pandemic surveillance system to improve the response to emerging SARS-CoV-2 variants and the next pandemic. In addition, the R package provides a modifiable codebase that can be extended to estimate the importation risk of other respiratory infectious diseases, such as influenza.

DATA AVAILABILITY STATEMENT
Our software package, RiskEstim, is developed in R. All code to perform the analyses and generate the figures in this study is available from the corresponding author upon reasonable request. Publicly available datasets were analyzed in this study. These data can be found here: https://doi.org/10.5281/zenodo.4266642.

AUTHOR CONTRIBUTIONS
BC, EL, XX, and ZD were involved in the conceptualization and design of the study. MX and ZD designed the statistical methods, conducted the analyses, and wrote the manuscript; MX developed the R package. EL, ZD, SS, YB, and PW reviewed and edited the draft.