Responses to COVID-19 with probabilistic programming

The COVID-19 pandemic left its unique mark on the twenty-first century as one of the most significant disasters in history, triggering governments all over the world to respond with a wide range of interventions. However, these restrictions come with a substantial price tag. It is crucial for governments to form anti-virus strategies that balance the trade-off between protecting public health and minimizing the economic cost. This work proposes a probabilistic programming method to quantify the efficiency of major initial non-pharmaceutical interventions. We present a generative simulation model that accounts for the economic and human capital cost of adopting such strategies, and provide an end-to-end pipeline to simulate the virus spread and the incurred loss of various policy combinations. By investigating the national response in 10 countries covering four continents, we found that social distancing coupled with contact tracing is the most successful policy, reducing the virus transmission rate by 96% along with a 98% reduction in economic and human capital loss. Together with experimental results, we open-sourced a framework to test the efficacy of each policy combination.


1. Introduction
The ongoing COVID-19 pandemic is one of the most challenging pandemics in human history, infecting more than 170 million people worldwide with more than 3.5 million fatalities as of May 30, 2021 (1). Rapid and easy transmission of COVID-19 leads to a high and fast-growing caseload, severely straining the healthcare systems of many countries. Governments are pushed to apply prompt and effective interventions to protect public health. Such policies include lockdown, social distancing, contact tracing, hygiene, and mask mandates. However, countries differ on these measures and their stringency due to differences in public acceptance, the political climate, or government priority. Thus, many interventions were applied considering the individual socioeconomic status of countries. Furthermore, most countries lacked experience in handling a pandemic, and only a handful have successfully brought it under control. The world has witnessed how the initial response to the virus dictated the trajectory of the virus spread.
Apart from its health impact, coronavirus has affected the economic state of the world with various restrictions imposed by governments to mitigate the virus spread.
The pandemic has already caused a bigger recession than the Great Depression (2). As reported by Mandel et al. (3), lockdown generates more than a 33% drop in global output at its peak and more than a 9% drop in annual GDP. Furthermore, the adverse economic effects of lockdown could even spread to neighboring countries through supply chains (4). Thus, governments should carefully take the economic context into account when making policy decisions.
This paper proposes a probabilistic programming method to evaluate the strategies imposed by different countries and point out which policies are the most successful in the initial response to the crisis. To provide insights on the effectiveness of initial responses to the pandemic, we analyzed data from 10 countries covering four continents. Moreover, we present a method to balance economic trade-offs of adopting specific policies by providing a generative model that considers the economic context of a given country. Given the recent focus on vaccination efforts, we also examine the effect of vaccination in the containment of the coronavirus in Israel and the United States.
To quantitatively express and analyze the success and failure of different countries, we utilize a probabilistic approach to tackle the COVID-19 transmission dynamics. As illustrated in Figure 1, our approach has three major components:
1. Infer COVID-19 statistics by the compartmental model (Section 4).
2. Estimate policy strengths by the change-point model (Section 5).
3. Simulate the virus in the context of policy combinations, considering the economic loss, by the generative model (Section 6).

FIGURE 1. Project pipeline. First, we infer COVID-related parameters such as basic reproduction number R 0 , incubation rate σ , recovery rate γ , and mortality rate µ using the compartmental model. Second, we apply the change-point model to infer policy efficiencies from different countries. Finally, using inferred parameters from previous steps and economic parameters from real-world data, we run the generative model in artificial country simulation to estimate the economic cost for different policy combinations.
The compartmental model is used to capture the representative statistics of the virus transmission dynamics, including recovery time, incubation time, reproduction number (R 0 ), and mortality rate. We infer the baseline statistics by fitting the SEIRD compartmental model to the Swedish data from before the Swedish government imposed any policies. We assume that these statistics represent the original virus features unaffected by any human interventions.
With change-point models (5,6), we estimate the strength of the policies applied to curb the virus spread. Countries around the world impose various interventions to different degrees. Furthermore, populations worldwide are largely not homogeneous; therefore, the same policy could have different outcomes in different populations. Instead of capturing all these complicated factors, we choose cases where a particular policy can viably represent the upper bound of that policy's efficiency, i.e., the maximum reduction in infection rate. This provides a good idea of how effective each policy would be if applied in full force. To find these upper bounds, we investigate the countries with a successful initial response to the pandemic that stringently applied a given measure, such as China for lockdown or Singapore for social distancing. As these countries curbed the first wave of the spread by firmly applying a particular measure, we can consider the effect in these countries as the maximum effect the measure could achieve. The measure will introduce an abrupt change in caseload growth with a significant drop in growth rate, starting from a change-point in the timeline. We utilize the growth rate before and after this change-point to detect the effect of each measure. We run several experiments on countries with different initial responses to the pandemic. The efficiency of the initial policies is represented in terms of the change in transmission rate after the policy is established.
The inference results suggest that all major interventions are effective in reducing the virus spread. For example, contact tracing coupled with social distancing yields a 96% reduction in the virus transmission rate, achieving the same effect as lockdown and outperforming all other policies. The last part of our study simulates the virus in an imaginary country that follows all virus and policy statistics inferred from the previous parts. Our generative model is intended to support decision-makers in solving optimization problems with opposing objectives: public health and the economy. Stringent measures incur severe economic costs, but loosening the measures could lead to a devastating public health crisis. Therefore, the trade-off should be considered carefully. Our model predicts the trajectory of the pandemic, including cases, deaths, and recoveries. Moreover, we incorporate the economic cost into the simulation to address the economic trade-off of establishing policies. By controlling parameters, we estimate how the pandemic plays out in different scenarios and conclude which policy combination can effectively mitigate the virus in both the public health and economic dimensions. Simulation results suggest that contact tracing coupled with social distancing incurs the lowest economic and human capital loss.
With all these analyses, we provide a simple but insightful model to analyze several features of a pandemic: severity of the disease, policy efficiency, and economic impact. This will help to understand the success and failure of each country in its response to the pandemic. It could be used as a playbook to better prepare for a possible pandemic in the future. For reproducibility, the code and datasets used in the paper are available at: https://git.io/JGcPW. The rest of this paper is organized as follows. In Section 2, we review related work. In Section 3, we introduce the dataset. In Sections 4 and 5, we propose our compartmental model and estimate policy strength by change-point model, respectively. With this model and economic viewpoint, we simulate an artificial country by changing policies in Section 6. We conclude the paper in Section 7.
2. Related work

2.1. Compartmental models
Most epidemic models divide the target population into a certain number of compartments, each consisting of individuals with identical status concerning a given disease. The foundations of the entire approach to epidemiology based on compartmental models were laid by public health physicians in the early 1900s. One of the first applications of the compartmental model was made by R. Ross, who demonstrated the dynamics of the transmission of malaria between mosquitoes and humans and consequently was awarded the Nobel Prize in Medicine in 1902 (7,8). Since then, compartmental models have been widely used to simulate the spread of a variety of infections (9).
One of the most popular extensions of the SIR model is the SEIR model (10), a traditional method used to simulate an infectious disease that incubates inside its hosts for a while before the hosts become infectious. The SEIR model considers the incubation period by introducing a new compartment E (Exposed) to the compartmental system. This model and its modifications have already been adapted to simulate the COVID-19 virus in many countries (11)(12)(13). In this work, we adopt a widely used modification of the model, SEIRD (14), which adds the death compartment D. More recently, a SEIRD model with relaxed parameters has also been proposed to account for the rapidly changing social scenario arising during the COVID-19 period (15). Likewise, adapting compartmental models to the scenario at hand would be a promising direction to study.

2.2. Probabilistic algorithms
Markov chain Monte Carlo (MCMC) is a large class of sampling algorithms widely used for probabilistic problems. MCMC was first introduced in 1953 as a new method to simulate the distribution of states for a system of idealized molecules (16). However, the application of the algorithm was not limited to physics. It was later adapted and generalized by Hastings (17) to focus on statistical problems, opening its application to a wide range of domains. Due to its ability to handle complex types of analyses, the MCMC approach has been widely used in finance (18,19), communication (20,21), computational biology (22), linguistics (23,24), and other fields with probabilistic settings. Unsurprisingly, these methods are also widely popular for estimating effects in complex epidemiological analyses (25)(26)(27). For example, Cauchemez et al. (28) have shown how to model influenza transmission using the Bayesian MCMC approach, and many variations of MCMC methods were used to infer features of the Ebola virus and analyze its transmission mechanism (29,30). Recent reports have also benefited from Bayesian MCMC methods to infer COVID-19 virus transmission dynamics. Zhou et al. (31) implemented such inference based on a probabilistic compartmental model using daily confirmed COVID-19 cases and applied it to six states of the United States. MCMC algorithms have also been successfully applied to change-point models. The objective is to detect abrupt property changes lying behind the time-series data (32). Recent work showed that MCMC algorithms with Bayesian parameter inference could be used to detect change-points in COVID-19 spread using SIR and SEIR epidemiological models of South Africa (33). According to their results, South Africa experienced two change-points: the first at the time of the national lockdown and the second after the massive screening and testing program. Dehning

2.3. Policy strength estimation
A wide range of work has been done to estimate the efficiencies of the policies imposed by different countries to prevent COVID-19. Many studies focused on individual country cases considering their unique demographic features (35-37), while other reports compared many countries by the independent effects of a single category of policy (38)(39)(40). For instance, Iwata et al. (41) used Bayesian analysis and did not find evidence that the school closures that occurred in Japan were effective in mitigating the risk of coronavirus infection in the nation. Another recent work by Sharov et al. (42) used a modified SIR model to compare the effectiveness of lockdown measures introduced during the coronavirus pandemic in 13 European countries, comparing them to two baseline countries (Sweden and Iceland) that did not implement lockdown policies. For evaluation, this work used the herd immunity level and its time of formation to indicate the effectiveness of lockdown measures (42). According to Sharov's results, lockdown and no-lockdown modes of containment led to roughly similar results.
There are also reports considering multiple policies across the globe (43,44). One example is the work by Flaxman et al. (45), which investigates the effects of applied non-pharmaceutical interventions (NPIs) across 11 European countries for the period from the start of the COVID-19 epidemics. According to their results, major non-pharmaceutical interventions, like lockdowns, have had a significant effect on reducing the transmission of the virus. However, a subsequent study by Haug et al. (46), which assessed the efficacy of 6,068 NPIs across 226 countries and gave a detailed analysis of country-specific "what-if" scenarios, showed different results. They analyzed the impact of government interventions on the effective reproduction number R t by combining several analytical approaches. By utilizing statistical, inference, and artificial intelligence tools, they concluded that combinations of some less disruptive and less costly NPIs could be as effective as more expensive and harsh ones like national lockdowns. Brauner et al. (47) came to the same conclusion by analyzing 41 countries during the first wave of the pandemic. According to their study, less harsh NPIs can be more effective in mitigating COVID-19 transmission than stricter stay-at-home orders (47). Singh et al. (48) exploited the spatial and temporal variation in the introduction and lifting of NPIs across counties using a staggered difference-in-differences (DID) approach. They compared US counties with NPIs in place (treated) with counties that did not have NPIs in place (control) before and after implementation. Enabled by datasets with rich population characteristics, they stratified the datasets into several groups and analyzed the impact of implementing and lifting NPIs by the population groups they target. However, as we will discuss in Section 5, the effects of NPI implementations only become visible after a certain lag. A more meaningful analysis can be obtained with DID by considering this delay.
However, one of the significant limitations of recent studies is that none of them perform a comprehensive analysis considering the economic factors that affect the efficiencies of the policies. There were some reports regarding the economic cost of the pandemic situation across the globe (49). For example, McKibbin and Fernando (50) simulated a global economic model to explore seven scenarios, which differ in the proportion of the population who become infected or dead. According to their estimations, in a scenario where COVID-19 develops into a global pandemic, the cost of lost economic output begins to escalate into trillions of dollars (51). However, they do not include the effect of policy interventions in their simulations. To address this limitation, we propose another method of cost estimations by involving policy effects in Section 6.

3. Dataset
For our analysis, we used virus data from two sources that are available online. The first dataset is taken from

4. Analyze COVID-19 statistics by compartmental model
In this section, we introduce the proposed methodology to infer SARS-CoV-2 virus statistics. We first present the SEIRD epidemiological compartmental model and the corresponding probabilistic programming model used to infer several virus statistics. Then, we perform inference on the Swedish data to infer the important virus parameters.
This work does not involve any human participants and is thus not subject to IRB approval.

4.1. Compartmental SEIRD model
The most basic compartmental model, namely the SIR model, uses three compartments of Susceptible (S), Infectious (I), and Recovered (R). Each individual can move from a compartment to another compartment, resembling the progress of the disease. We could use S, I, and R to denote the number of individuals in their respective compartments. For the COVID-19 case, there is an incubation period in which people are infected but not yet infectious. Hence, we adapted the epidemiological SEIRD model (14) for our simulations, extending the SIR model with the E compartment of exposed individuals and the D compartment for deaths.
• Susceptible (S): Individuals in the compartment are neither infected nor immune to the disease and, hence, could contract the disease. If the susceptible individuals contract the disease (via contact with an infectious individual), they progress to the Exposed compartment.
• Exposed (E): Individuals in the compartment are infected but unable to pass the disease to susceptible individuals. If the Exposed individuals finish their incubation period and can infect others, they progress to the Infectious compartment.
• Infectious (I): Individuals in the compartment are infected and can pass the disease to susceptible individuals. If the Infectious individuals recover from the disease and carry immunity or die from the disease, they progress to the Recovered (or Resistant) and Dead compartments, respectively.
to any other compartment.
where the effective reproduction number (R e ) is the expected number of people to whom each infected individual transmits the virus during the outbreak, the basic reproduction number (R 0 ) is the natural reproduction number when there is no intervention, the incubation time (t E = 1/σ) is the average time in which an individual is exposed but not yet infectious, the recovery time (t I = 1/γ) is the average time after which an infected case is concluded (recovered or dead), the case fatality proportion µ is the proportion of fatal cases among all concluded cases, and the waning time (t R = 1/α) is the time for which recovered individuals retain immunity. Since we deal with the initial stages of the pandemic, we assume that people carry immunity to the disease upon recovery (α = 0) and that the population stays constant over time and is equal to N.
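For reference, the definitions above are consistent with the standard SEIRD system sketched below. This is one common textbook form, written under the assumption that β = R e γ is the transmission rate (as defined in Section 5), σ = 1/t E , γ = 1/t I , and α = 1/t R ; it is an illustration and may differ in detail from the equations used in the experiments.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N} + \alpha R, \\
\frac{dE}{dt} &= \beta \frac{S I}{N} - \sigma E, \\
\frac{dI}{dt} &= \sigma E - \gamma I, \\
\frac{dR}{dt} &= (1-\mu)\,\gamma I - \alpha R, \\
\frac{dD}{dt} &= \mu \gamma I,
\qquad \beta = R_e \gamma .
\end{aligned}
```

In this form the five derivatives sum to zero, so S + E + I + R + D = N stays constant, and with α = 0 the waning terms vanish.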
Substantial amounts of COVID-19 cases are not reported due to testing availability and testing strategy. Hence, the exact total number of COVID-19 cases is unknown and typically not uniquely determined from the number of confirmed cases (52-54). Korolev (53) emphasizes that neglecting unreported cases leads to biased parameter estimation, so it is important for our model to address this issue.
To distinguish observable cases among all possible cases, we use a parameter called the response rate ρ, which is the probability of a case being reported. At each timestamp (in our case, each day), if the SEIRD model estimates the number of new cases (transition from S to E) to be S2E and the number of recovered cases (transition from I to R) to be I2R, then in our model the numbers of reported confirmed cases and recovered cases are corrected by the response rate:

Newly reported confirmed cases ∼ Binomial(S2E, ρ)
Newly reported recovered cases ∼ Binomial(I2R, ρ),

where Binomial(n, p) is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each succeeding with probability p. Here, a reported case can be understood as a success, and each case can be understood as an experiment. The response rate ρ is inferred jointly as a parameter of our SEIRD model. Since it varies widely across countries, the estimated value is used only within the country for which it was inferred and is not generalized to others.
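The reporting correction can be illustrated with a small stochastic simulation. The sketch below is not the Pyro implementation used in this work; it is a minimal, standard-library-only illustration in which the function names (`binomial`, `seird_day`), the conversion of rates to daily probabilities via 1 - exp(-rate), and all parameter values are assumptions made for the example.

```python
import math
import random


def binomial(n, p, rng):
    """Draw from Binomial(n, p) as a sum of n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def seird_day(state, r0, t_e, t_i, mu, rho, rng):
    """Advance (S, E, I, R, D) by one day.

    Returns the new state, the true transitions (S2E, I2R),
    and the reported counts after applying the response rate rho.
    """
    s, e, i, r, d = state
    n = s + e + i + r + d
    beta = r0 / t_i  # transmission rate, beta = R * gamma
    # Assumed rate-to-probability conversion for a one-day step.
    s2e = binomial(s, 1.0 - math.exp(-beta * i / n), rng)
    e2i = binomial(e, 1.0 - math.exp(-1.0 / t_e), rng)
    concluded = binomial(i, 1.0 - math.exp(-1.0 / t_i), rng)
    i2d = binomial(concluded, mu, rng)  # fatal share of concluded cases
    i2r = concluded - i2d
    new_state = (s - s2e, e + s2e - e2i, i + e2i - concluded, r + i2r, d + i2d)
    reported = (binomial(s2e, rho, rng), binomial(i2r, rho, rng))
    return new_state, (s2e, i2r), reported


rng = random.Random(0)
state = (9990, 0, 10, 0, 0)
history = []
for _ in range(30):
    state, true_counts, reported = seird_day(
        state, r0=2.0, t_e=5.0, t_i=14.0, mu=0.025, rho=0.5, rng=rng
    )
    history.append((true_counts, reported))
```

Each day's reported counts never exceed the true transitions, which is the under-reporting effect that ρ is meant to capture.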

4.2. Scaling up with probabilistic programming
To implement the probabilistic models, we used the probabilistic programming language Pyro (55). For this particular inference task, we adopted Pyro's Epidemiology framework (56) for scaling up our experiments with a restricted class of stochastic discrete-time discrete-count compartmental models. This framework uses the Markov Chain Monte Carlo (MCMC) algorithm to fit the SEIRD model to infer COVID-19-related parameters: reproduction number R 0 , recovery time, incubation time, transmission rate, and mortality rate.
4.2.1. Summary of the method
MCMC is a stochastic algorithm that repeatedly generates random samples describing the distribution of parameters of interest (in our case, COVID-19 related parameters), where a new sample is generated based on the previous sample, thereby creating a Markov chain. The Markov chain has a stationary distribution p S (x) such that if the chain ever arrives at p S (x), it will keep sampling from p S (x) forever. Therefore, the goal of MCMC is to design a transition probability that makes the stationary distribution equal to the target distribution [i.e., p S (x) = p(x)]. Starting from an initial random sample, the algorithm guides the Markov chain to the stationary distribution, which we force to be the same as the target distribution (57).
A popular instance of the MCMC method is the Metropolis-Hastings algorithm, which draws a candidate sample from a proposal distribution (also called the kernel) and then applies an acceptance criterion that chooses to accept or discard the candidate. This criterion is implemented by an acceptance ratio, the probability with which we accept the new sample; it compares the target density at the candidate with the target density at the current sample, corrected for any asymmetry in the proposal. Candidates that are more probable under the target distribution are therefore accepted with higher probability. To optimize the sampling process, we used an instance of the Metropolis-Hastings algorithm, namely the Hamiltonian Monte Carlo (HMC) algorithm with the No-U-Turn Sampler (NUTS). The HMC algorithm avoids random walk behavior by taking steps informed by first-order gradient information (58). It utilizes an approximate Hamiltonian dynamics simulation, which is then corrected by a Metropolis acceptance step (59,60). HMC reduces the correlation between successive sampled states, allowing the algorithm to converge much faster with fewer Markov chain samples. However, since HMC is highly sensitive to two hyper-parameters, the step size and the number of steps, the No-U-Turn Sampler (NUTS) is used to set these parameters adaptively (58). Thus, we can perform HMC without any manual hyperparameter tuning.
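To make the accept/reject step concrete, the following is a minimal sketch of the plain random-walk Metropolis algorithm, the simplest member of the Metropolis-Hastings family. It is not HMC or NUTS and is not the sampler used in our experiments; the function names, the standard normal target, and the step size are assumptions chosen for illustration.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_samples, step, rng):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    chain = [x0]
    x, logp = x0, log_target(x0)
    accepted = 0
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)  # symmetric Gaussian kernel
        logp_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)), on the log scale.
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < logp_new - logp:
            x, logp = proposal, logp_new
            accepted += 1
        chain.append(x)  # a rejected proposal repeats the current sample
    return chain, accepted / n_samples


def log_std_normal(x):
    """Unnormalized log density of a standard normal target."""
    return -0.5 * x * x


rng = random.Random(1)
chain, acc_rate = metropolis_hastings(log_std_normal, 0.0, 20000, 1.0, rng)
kept = chain[5000:]  # discard the early part of the chain as warm-up
mean = sum(kept) / len(kept)
var = sum((v - mean) ** 2 for v in kept) / len(kept)
```

Because the Gaussian proposal is symmetric, the proposal-correction term cancels and only the ratio of target densities remains. HMC replaces the random-walk proposal with a gradient-informed trajectory but keeps the same final acceptance step.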
Get the acceptance ratio κ using the HMC-NUTS algorithm: κ = HMC-NUTS(y, ŷ);
Accept the sample i with probability κ and put it into the chain;
If the number of divergent transitions (61) is zero: convergence = True → exit the loop;
return: Get the posterior distribution p(θ | y) from the chain and use its expected value E[θ] as the parameters of the SEIRD model f.

Algorithm 1. Estimating parameters of the SEIRD model using HMC-NUTS algorithm.

4.2.2. Algorithmic presentation
From the observed COVID-19 trajectory (number of cumulative confirmed cases, recovered cases, and fatalities over time), we apply HMC-NUTS to infer the right parameters to describe the development of the pandemic. Assuming the pandemic follows the SEIRD model, Algorithm 1 describes how the HMC-NUTS algorithm estimates the model parameters. Here, y = [y 1 , y 2 , . . . , y T ] is a pandemic trajectory with its statistics y t at time t.

According to the model's estimations, it takes around 16 days to recover after being infected and 5 days for symptoms to develop after exposure, and the mortality rate is equal to 2.5%.
4.3. Fitting SEIRD model to Sweden: the reference country
We ran the model to estimate the parameters of the Swedish data before April 1st, 2020. We chose this early stage of the COVID-19 pandemic since Sweden did not impose any strict policies and aimed to achieve herd immunity (62). We assumed that the virus transmission rate was unaffected by any interventions, so we used Sweden as a baseline case and performed the experiments to infer unaffected COVID-19 parameters.
To run our probabilistic model, we set the priors according to the estimations of the World Health Organization (63). It was reported that mild cases typically recovered within two weeks, the incubation period was on average 5-6 days, and R 0 was typically around 2. Mortality and recovery rates differed depending on the region and stage of the virus spread, but the case-fatality rate was roughly 2.5%. The prior of the response rate ρ is given as Beta(10, 10), which concentrates the prior mass around 0.5 while still supporting the whole range from 0 to 1.
The obtained posterior virus-related statistics for Swedish data are shown in Table 1. For more accurate results, we ran the model six times and reported the averaged values. The results are reasonable enough to use in our further simulations.
5. Estimation of policy strength by the change-point model
Change points in time series denote abrupt variations, and such changes represent transitions that occur between states (5). Change-point detection concerns determining whether or not a change has occurred, or identifying the time of any such change. It is useful in modeling and predicting time series in diverse applications such as human activity analysis, speech and image analysis, medical monitoring, and anomaly detection (6).
This section introduces a change-point detection methodology to quantify the efficiency of the major interventions applied worldwide to mitigate the COVID-19 spread. First, we briefly introduce the concept of estimating the policy strength by referring to the compartmental model and its formula. Second, we describe a probabilistic programming model that detects change-points in the course of the caseload after a country applied NPIs. While detecting change-points, the probabilistic programming model also estimates the policy strength. Next, we examine several countries' data to determine the efficiencies of the investigated policies and give a summary of our findings in the final subsection.

We first explain how to measure the policy strength by referring to the SEIRD compartmental model (see Section 4.1) with some simple assumptions and new terms we describe below. The transmission rate β is the number of susceptible individuals that an infected individual can infect in a day, which is calculated as β = R e γ. At the very beginning, almost everyone in our setting is in the Susceptible compartment, so we can assume that S is equal to the total population size N, or S = N. With this approximation, Equation (1) can be rewritten as follows:

Additionally, since the incubation period is much shorter than the recovery time, we can ignore the E, R, and D compartments at the initial stage of the simulation. Thus, we can approximate the total population size as N = S + I and Equation (3) becomes:
The size of the Infected compartment rises exponentially with the rate w = β (that is, on day t, the number of infected cases is calculated as e^(βt)). Due to this exponential nature, it is appropriate to investigate the case counts using a log scale. In the log scale, the exponential spread is represented as a straight line, with the transmission rate of the virus β represented by the slope w.
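The exponential growth described above follows from a short derivation. The sketch below assumes that the Susceptible compartment obeys the usual form dS/dt = -βSI/N, which is an assumption about the equations referenced in this section rather than a quotation of them; I_0, the initial number of infected individuals, is a symbol introduced here for illustration.

```latex
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N} \approx -\beta I
  && (S \approx N), \\
\frac{dI}{dt} &= -\frac{dS}{dt} = \beta I
  && (N = S + I \text{ constant}), \\
I(t) &= I_0 e^{\beta t}
  \;\Longrightarrow\; \log I(t) = \log I_0 + \beta t .
\end{aligned}
```

The last line is the straight line in the log scale, with slope w = β.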
Consider an example in Figure 4. In the beginning, the graph is a steep line with the slope w 1 (representing a rapid, exponential spread). After a corresponding policy is applied, the graph bends and becomes less steep with the slope w 2 ≪ w 1 (slower spread). Therefore, the graph roughly consists of two lines of different slopes, w 1 and w 2 , with a separation point in between, which we call a change-point (the black dotted vertical line in the graph). Since the slopes w 1 and w 2 represent the transmission rates before and after the policy takes effect, we can define the strength of the target intervention in terms of their ratio:

Given the incubation period, we expect the policy to take effect around 2-4 weeks after it is established.
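As a worked illustration of the two-slope picture, the sketch below fits a least-squares slope on each side of a known change-point in a synthetic log-scale trajectory. It assumes the strength is expressed as the relative reduction 1 - w2/w1, which matches how reductions are reported in this paper (for example, 96%) but is our reading of the ratio; the function names and the synthetic data are hypothetical, and the actual model infers the change-point rather than taking it as given.

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def policy_strength(log_cases, change_idx):
    """Relative drop in log-scale growth rate across a known change-point."""
    days = list(range(len(log_cases)))
    w1 = slope(days[:change_idx], log_cases[:change_idx])
    w2 = slope(days[change_idx:], log_cases[change_idx:])
    return w1, w2, 1.0 - w2 / w1


# Synthetic trajectory: growth rate 0.20 per day for 30 days, then 0.01.
log_cases = [0.20 * t for t in range(30)]
log_cases += [0.20 * 30 + 0.01 * (t - 30) for t in range(30, 60)]
w1, w2, strength = policy_strength(log_cases, 30)
```

With these synthetic slopes the transmission rate falls from 0.20 to 0.01 per day, a 95% reduction under the assumed definition.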
In the next subsection, we will introduce a probabilistic programming approach to find the change-point when the policy takes effect.
5.2. Implementation: Change-point detection with policy strength estimation
As mentioned in the related work (see Section 2.2), probabilistic models have been used successfully to detect change-points in the transmission rates of coronavirus. In this subsection, we describe the probabilistic programming approach used to detect change-points and estimate policy strength.

5.2.1. Likelihood choice
Frontiers in Public Health frontiersin.org

In our probabilistic setting, the likelihood corresponds to the log-scaled line of accumulated confirmed cases. We chose piecewise linear regression and added StudentT noise, which is more robust to outliers than conventional Gaussian noise (67). We define τ as the change-point in the range [0, 1], with 0 and 1 being the start and end of the simulation time period, respectively. The likelihood can be modeled as follows:

Note that the weights w 1 and w 2 correspond to the slopes before and after the change-point in Equation (11) and Figure 4. To sum up, the change-point model is parameterized by six factors: w 1 , b 1 , w 2 , b 2 , τ , and σ .
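A piecewise-linear StudentT likelihood matching this description can be sketched as below. The symbols y_t (log cumulative confirmed cases at normalized time t in [0, 1]), µ_t (the regression mean), and ν (degrees of freedom) are introduced here for illustration. Since the text lists only six parameters, ν would be held fixed in this reading, which is an assumption.

```latex
y_t \sim \mathrm{StudentT}\bigl(\nu,\; \mu_t,\; \sigma\bigr),
\qquad
\mu_t =
\begin{cases}
w_1 t + b_1, & t < \tau, \\
w_2 t + b_2, & t \ge \tau .
\end{cases}
```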

5.2.2. Prior choice
Here we illustrate the choice of parameters' priors used as input for our probabilistic model to draw samples from. For weights, we use the normal distribution, with w 2 having the mean equal to zero as we expect the slope to drop significantly after the change-point.
For bias terms, we set the priors to be from normal distributions. However, this time, we adjust the bias priors for each country adaptively since bias is sensitive to each country's course of the caseload. We assign the mean of y in the first and fourth quartiles to m 1 and m 2 , respectively. For b 1 to be relatively flat the standard deviation s 1 is set to 1 and s 2 is set to 0.25m 2 .
We use a Beta distribution as a prior for the change-point τ and assume that the change is more likely to occur in the second half of the date range. We choose Beta(4, 3), whose peak (mode) falls at 60% of the way through the date range.
The magnitude of the noise is quantified by the standard deviation σ . We put a simple uniform prior for σ .
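The prior choices above can be checked with a few lines of code. The sketch below verifies the location of the Beta(4, 3) peak and draws one parameter set from priors shaped as described. The helper names, the weight scale `w_scale`, the zero mean used for w 1 , the upper bound `sigma_max`, and the reading of "first and fourth quartiles" as the first and last quarter of the time range are all assumptions made for illustration.

```python
import random


def beta_mode(a, b):
    """Mode of Beta(a, b), valid for a > 1 and b > 1."""
    return (a - 1.0) / (a + b - 2.0)


def quartile_means(y):
    """Mean of y over the first and the last quarter of the time range."""
    q = max(1, len(y) // 4)
    return sum(y[:q]) / q, sum(y[-q:]) / q


def sample_priors(y, rng, w_scale=1.0, sigma_max=1.0):
    """Draw one parameter set; w_scale and sigma_max are placeholder values."""
    m1, m2 = quartile_means(y)
    return {
        "w1": rng.gauss(0.0, w_scale),  # placeholder mean for w1
        "w2": rng.gauss(0.0, w_scale),  # zero mean: slope expected to drop
        "b1": rng.gauss(m1, 1.0),  # s1 = 1
        "b2": rng.gauss(m2, abs(0.25 * m2)),  # s2 = 0.25 * m2
        "tau": rng.betavariate(4, 3),
        "sigma": rng.uniform(0.0, sigma_max),
    }


rng = random.Random(0)
y = [0.1 * t for t in range(40)]  # synthetic log-scale caseload
m1, m2 = quartile_means(y)
draw = sample_priors(y, rng)
peak = beta_mode(4, 3)
```

For Beta(4, 3) the mode is (4 - 1) / (4 + 3 - 2) = 0.6, which is the 60% point of the date range mentioned above.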
Using the prior defined above and the actual case trajectory, we can finally estimate the parameters to measure the policy efficiency with the change point (see Algorithm 2).

5.3. Inferring maximum efficiencies of major policies with the change-point method
We investigate major initial interventions applied by several countries to mitigate the virus spread. For more accurate results, we chose nine countries presented in Table 2 that strongly imposed the corresponding policies, assuming that they were applied to the fullest extent. By investigating the countries that applied a policy most stringently, we find a meaningful upper bound for each policy's efficacy. This upper bound is helpful for policymakers to determine the most appropriate intensity of the policy (more details in Section 6). In this experiment, we focused on five main policy categories:
• Lockdown: A lockdown is an intervention that forces people to stay where they are. It includes a gathering ban, closure of non-critical services, and strict transportation restrictions. People cannot freely enter or exit their designated areas, and economic activities are essentially suspended.
• Social distancing: Social distancing includes interventions or measures intended to maintain a physical distance between people, including a gathering limit or closure of non-essential services. It can be considered a partial or soft lockdown.
• Contact tracing and social distancing: Contact tracing is the policy that investigates the close contacts of infected cases and then tests and quarantines them. Investigating the countries with successful contact tracing campaigns revealed that they coupled the contact tracing intervention with social distancing (e.g., South Korea, Australia, and Vietnam). For this reason, instead of addressing contact tracing separately, we merged it with social distancing to be closer to the real-world scenario.
• Mask and hygiene mandate: Almost every country imposed a mask mandate at some point in its COVID-19 response. Since it is always coupled with other restrictions, separating the effect of the mask mandate from other interventions is a challenging task. Because the change-point method cannot be applied in this case, we proposed a different approach to this issue, discussed in Section 5.4.4.
• Vaccination: Since vaccination campaigns can result in the same effects on virus mitigation as any other policy that the government may enact, we also analyze the effectiveness of the vaccination programs by applying the same change-point model.

In extreme policies such as lockdowns, the effect comes quickly, but there are side effects to be described later in Section 6.3.
Using the probabilistic programming model described in the previous section, we inferred, for each case, the time the policy needed to take effect after it was established, the change-point, and the policy strength. Since we focused on the initial stage of the pandemic, the time frame we simulated is 3-6 months from the first recorded case, including the date on which the policy was enforced and the period in which its effect could be seen. In all experiments, the results converged to values consistent with our priors. The posteriors also fit the actual data well (Figures 5-12). The summary of policy efficiencies is shown in Table 2. In the next section, we discuss the results in more detail by investigating each policy separately by country.
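As a minimal illustration of the quantities inferred here, the change-point model can be read as a step change in the transmission rate. The sketch below is a simplified, deterministic view and not the inference code itself; the function name `transmission_rate`, its parameter names, and the example values are assumptions made for illustration only.

```python
def transmission_rate(day, base_rate, change_point, efficiency):
    """Piecewise-constant transmission rate with a single change point.

    Hypothetical helper: before `change_point` the rate equals `base_rate`;
    from the change point onward it is reduced by the policy `efficiency`,
    expressed as a fraction between 0 and 1.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    if day < change_point:
        return base_rate
    return base_rate * (1.0 - efficiency)


# Illustrative values only: a 98% efficient policy taking effect on day 16.
rates = [transmission_rate(day, 0.5, 16, 0.98) for day in (0, 15, 16, 30)]
```

In the probabilistic setting, `change_point` and `efficiency` would be random variables with priors, and their posteriors would be estimated from the observed case trajectory.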
. . Discussion of the results of the change-point method

. . . Lockdown
We investigate the lockdown interventions imposed in China and New Zealand. Both countries applied a strict lockdown as their initial strategy to combat the virus spread. The COVID-19 pandemic emerged in China, with the very first case confirmed on December 10, 2019 (70). New Zealand recorded its first case on February 28, 2020 (71).
Both countries experienced a swift reduction in infection after the lockdown was applied. With a strong centralized government, China could enforce a lockdown from January 23, 2020, starting with the epicenter of Wuhan city and Hubei province. The lockdown was overwhelmingly stringent, with a travel ban, a stay-at-home order, and transportation suspension. Other Chinese cities quickly followed suit with similar measures. Our model shows that the policy took effect around February 8, 2020, with a 98% reduction in the transmission rate. The New Zealand government introduced a four-tier alert level system and imposed a lockdown on most of the country's population and economy from March 25, 2020 (71). The policy seemed to take effect around April 2, 2020, with a 95% reduction in the infection rate.
From these observations, we can conclude that a lockdown is capable of quickly curbing infections. We took the average efficacy of the two mentioned countries, 96%, as the efficacy of the lockdown for our further experiments.

. . . Social distancing
We investigated the social distancing imposed in Canada and Singapore. Both countries applied social distancing or soft lockdown mandates in their initial strategy to combat the virus spread. The first COVID-19 case in Canada was confirmed on January 25, 2020 (72). Around March-April, 2020, the Canadian government started to apply several restrictions to maintain social distancing (72).
The first COVID-19 case in Singapore was confirmed on January 2, 2020 (76). The government introduced a soft lockdown (dubbed a circuit-breaker), which included a stay-at-home order and a cordon sanitaire. Contact tracing was not extensively utilized until a later stage of the pandemic (73, 79).
Both countries saw a considerable drop in the infection rate. Canada applied the restrictions from around March to April, and the policy took effect around February 8, 2020, with a 70% reduction in the infection rate. Singapore applied the circuit-breaker on April 7, 2020, and the change-point was determined to be April 27, 2020, with a 78% reduction in the infection rate.
It is evident that social distancing had a considerable effect on reducing the infection rate. We took the average efficacy of the two mentioned countries, 74%, as the efficacy of social distancing.

A cordon sanitaire is the restriction of movement of people into or out of a defined geographic area, such as a community, region, or country.

Frontiers in Public Health
frontiersin.org . /fpubh. .

. . . Contact tracing and social distancing
Australia and South Korea both used contact tracing coupled with social distancing as their initial strategy to combat the virus spread.
The first COVID-19 case in Australia was confirmed on January 25, 2020. On March 21, 2020, the Australian government imposed social distancing rules, with the closure of "nonessential" services. Swift recruitment of a large contact tracing workforce took place in March 2020 (74).
In South Korea, the first COVID-19 case was confirmed on January 20, 2020 (75). The government raised the alert level to "Serious" on February 25, 2020, announced guidelines to limit trips and outdoor activities and imposed emergency safety measures from basic hygiene rules to self-quarantine and social distancing (76). Health officials implemented extensive movement and contact tracing to identify and inform exposed individuals (76).
Both countries experienced a swift reduction in infection after applying social distancing coupled with contact tracing. In Australia, the policy seemed to take effect after 10 days (around March 31, 2020), with a 96% reduction in the infection rate.
It is evident that social distancing, when coupled with contact tracing, can quickly curb the spread of infection. We take the average efficacy of the two mentioned countries, 96%, as the overall efficacy. Thus, contact tracing could push the efficiency of social distancing to the same level as the lockdown.

. . . Effect of mandating masks and hygiene
We could not use the change-point model for masks and hygiene because they are hard to separate from other policies. However, we can indirectly represent the transmission rate via effective reproduction numbers. Because the transmission rate is proportional to the reproduction number, we can use the ratio of reproduction numbers.
We compared the effective reproduction number $R_e$ of Japan before policies were applied with the basic reproduction number $R_0$ of Sweden from Table 1, which is equal to 2.64.
We chose Japan because of its cultural practices, which include wearing masks, very little physical contact (such as hugging or shaking hands), and not wearing shoes in the house (80). We expect the reproduction number in Japan to be lower even if no strict policy is applied. The average effective reproduction number $R_e$ for Japan after 6 runs was equal to 1.84. Thus, the efficiency of the hygiene and masks mandates is equal to 1 − 1.84/2.64 = 0.30.
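The ratio-based estimate above can be reproduced in a few lines. This is only a sketch of the arithmetic stated in the text; the function name `efficiency_from_reproduction_numbers` is a hypothetical label, and the two inputs are the values reported above for Japan and Sweden.

```python
def efficiency_from_reproduction_numbers(r_effective, r_basic):
    """Estimate policy efficiency as 1 - R_e / R_0.

    Relies on the assumption stated in the text that the transmission
    rate is proportional to the reproduction number.
    """
    if r_basic <= 0:
        raise ValueError("r_basic must be positive")
    return 1.0 - r_effective / r_basic


# Values reported in the text: R_e = 1.84 (Japan), R_0 = 2.64 (Sweden).
mask_hygiene_efficiency = efficiency_from_reproduction_numbers(1.84, 2.64)
```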

. . . Vaccine
The US and Israel both have sweeping and widespread vaccination programs. The results obtained by our method for the US and Israel are plotted in Figures 11 and 12, respectively. The US started its vaccination program on December 14, 2020 (77), while Israel started its campaign on December 20, 2020 (78). These two countries experienced a swift reduction in infection after the vaccination program started. In the US, the policy seemed to show an effect around January 18, 2021, with a 73% reduction in the infection rate. In Israel, the policy produced effects on February 18, 2021, with an 88% reduction in the transmission rate. According to our results, in both cases, vaccination successfully mitigated the virus spread.
However, most countries lack a swift and large-scale vaccination program for different reasons, including delays in vaccine production, financial difficulties, or vaccine hesitancy (81). Thus, in most countries, the fraction of the vaccinated population falls far below the herd immunity threshold according to the current data (https://ourworldindata.org/grapher/share-people-fullyvaccinated/~covid). The start of vaccination programs can also lead to some incautiousness and fatigue that may have already driven up cases in countries like India and Thailand. From Figure 13, we can see that after the vaccination campaigns started (82, 83), the number of cases increased drastically. A possible reason for this outcome is weakening awareness of the coronavirus in the population after the vaccination campaigns start. People may have developed a more relaxed attitude toward restrictions, which may in turn have caused these spikes in confirmed cases. We conclude that the scale of the campaign and the accountability of the population play a key role in the success of vaccination.

. . . Policy overview
In Section 5, we evaluated the effectiveness of major policies based on the observed statistics. We found that social distancing, lockdown, and contact tracing are all effective in controlling the pandemic, with lockdown having the highest impact on the transmission rate (on average 96% efficacy for China and New Zealand). A combination of social distancing with contact tracing was shown to have an effect comparable to the lockdown (also 96% efficacy for Australia and South Korea). A policy usually takes effect 8 to 20 days after enforcement.
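The 8 to 20 day range can be checked directly from the enforcement dates and inferred change-points reported in the preceding subsections. The snippet below is a small sketch of that check; the dictionary `POLICY_DATES` and the helper `delay_in_days` are illustrative names, and only the country and policy pairs for which the text gives both dates are included.

```python
from datetime import date

# (policy start, inferred change-point), using the dates reported in the text.
POLICY_DATES = {
    "China (lockdown)": (date(2020, 1, 23), date(2020, 2, 8)),
    "New Zealand (lockdown)": (date(2020, 3, 25), date(2020, 4, 2)),
    "Singapore (circuit-breaker)": (date(2020, 4, 7), date(2020, 4, 27)),
    "Australia (distancing + tracing)": (date(2020, 3, 21), date(2020, 3, 31)),
}


def delay_in_days(start, change_point):
    """Number of days between enforcement and the inferred change-point."""
    return (change_point - start).days


delays = {name: delay_in_days(*dates) for name, dates in POLICY_DATES.items()}
shortest, longest = min(delays.values()), max(delays.values())
```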
We also estimated the efficiency of the vaccination campaigns. We found that although vaccination effectively mitigated the virus spread in countries like Israel and the US (on average 81% efficacy), other countries like Thailand and India failed to bring the virus spread under control. Moreover, vaccination programs in such countries were followed by a rapid increase in confirmed cases. We suppose that the reason for this counterintuitive outcome lies in the lack of a large-scale vaccination program, as well as differences in public awareness and responsibility.

. . . Limitations
This analysis rests on assumptions that ignore the inherent differences between countries and populations. Complex factors such as the acceptance or awareness of the general public could affect a policy's effectiveness. Some Asian countries tend to perform better in containing the disease, which we attribute to their collectivist nature and (usually) centralized government. Countries with experience of previous epidemics (China and Vietnam with SARS and South Korea with MERS) also tended to perform better, having handled similar outbreaks before.
However, it is too early to conclude that stringent policies like lockdowns are the most successful at mitigating the COVID-19 pandemic since the side effect of applying the policy should also be considered. Considering that the most efficient policies by our estimations may not be the most effective ones in terms of economic cost, we conducted additional experiments to address this issue.

. . . Potential confounding factors
The decrease in caseload is most probably driven by the policy's effect, but it could also be due to shrinkage of the susceptible population (84). However, since we investigate the policy effect in the initial stage of the pandemic, we assume that the susceptible population remains relatively stable. The change-point experiments are restricted to the time frame of January to May 2020, and the reported cumulative confirmed cases on May 31, 2020, were about 6 million, which is 0.1% of the global population. Phipps et al. (85) estimated that the number of actual cumulative cases could be 5-6 times larger than the reported number; even so, this does not materially affect the size of the susceptible population in the initial stage of the pandemic. The change in transmission rate is therefore principally driven by the policy efficiency, not by a change in the susceptible population.
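A quick back-of-the-envelope check supports this argument. The sketch below uses the reported 6 million cases and the upper under-reporting factor of 6 from the text; the world population of 7.8 billion is an assumption introduced here for illustration, as are the variable and function names.

```python
REPORTED_CASES = 6_000_000          # cumulative confirmed cases on May 31, 2020 (from the text)
UNDERREPORTING_FACTOR = 6           # upper end of the 5-6x estimate cited in the text
WORLD_POPULATION = 7_800_000_000    # assumption: approximate 2020 world population


def infected_share(cases, population, factor=1):
    """Fraction of the population ever infected, scaled by an under-reporting factor."""
    return cases * factor / population


reported_share = infected_share(REPORTED_CASES, WORLD_POPULATION)
adjusted_share = infected_share(REPORTED_CASES, WORLD_POPULATION, UNDERREPORTING_FACTOR)
susceptible_share = 1.0 - adjusted_share
```

Under these assumptions, fewer than 0.5% of people had been infected even after the adjustment, so more than 99.5% of the population remained susceptible.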

. Simulation by generative model
With the virus statistics and policy efficiencies in hand, we are ready to run our simulation experiments. Our pipeline is flexible enough to handle simulations with different sets of parameters. Since all variables are already inferred, we can use a simple generative model to predict how the pandemic plays out in different scenarios. To address the trade-off between public health protection and economic loss, we estimate the cost of the policies and the total loss for given caseloads and death tolls. We tried multiple policy combinations to determine the best policy to fight the pandemic in our experimental setting.

. . Model
To simulate the infection and fatality cases, we used the SEIRD model (see Equation 5). We follow the differential Equations (1)-(6) and the virus and policy statistics derived in Sections 4 and 5. However, the fatality rate does not stay constant because we take the hospital capacity into account.
Apart from the parameters related to virus statistics and policy efficiency, we also need the economic effect of the policies and the hospital statistics as input (hospital capacity, percentage of cases requiring hospital admission, and the death rates with and without hospital treatment). Our model gives each country the flexibility to input its own parameters.

. . Assumption
We illustrate the operation of our model on an imaginary country with a population of 1 million. In addition to the parameters we inferred from previous results, we applied some additional assumptions:
• The hospital capacity is 60 per 100,000 capita (0.06%). Among OECD countries, the number of critical care beds ranges from 3.3 to 33.9 beds per 100,000 capita (86), and the number of hospital beds ranges from 50 to 1,300 beds per 100,000 capita (87). Since countries might convert normal beds into critical care beds to treat COVID-19 patients amid the health crisis, we use 60 critical care beds per 100,000 capita, which is nearly double the figure for the most resourceful country (33.9 for Germany).
• 6% of the total cases require treatment in an intensive care unit (ICU). Preliminary data on a subset of 7,162 COVID-19 patients aged 19 and older with known health history in the US, from February 12, 2020, to March 28, 2020, found that 6% required ICU treatment.
• ICU-required cases will die without ICU treatment. With treatment, the death rate for cases admitted to the ICU is 60%. Data from Washington, Seattle, and California suggest that mortality rates reported in patients with severe COVID-19 in the ICU range from 50 to 65% (88).
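The assumptions above imply that the fatality rate rises once ICU demand exceeds capacity. The following sketch shows one way to express this; it treats a cohort of cases all at once, which is a simplification, and the function names and constants are illustrative labels for the values listed above, not identifiers from the released framework.

```python
ICU_BEDS_PER_100K = 60          # assumed hospital capacity (from the list above)
ICU_NEED_FRACTION = 0.06        # share of cases requiring ICU treatment
DEATH_RATE_WITH_ICU = 0.60      # death rate for cases admitted to the ICU
DEATH_RATE_WITHOUT_ICU = 1.00   # ICU-required cases die without treatment


def expected_deaths(cases, population):
    """Expected deaths among `cases` given a capacity-limited ICU."""
    icu_beds = ICU_BEDS_PER_100K * population / 100_000
    need_icu = ICU_NEED_FRACTION * cases
    treated = min(need_icu, icu_beds)      # cases that get an ICU bed
    untreated = need_icu - treated         # overflow beyond capacity
    return DEATH_RATE_WITH_ICU * treated + DEATH_RATE_WITHOUT_ICU * untreated


def effective_fatality_rate(cases, population):
    """Deaths per case; grows toward 6% as the ICU is overwhelmed."""
    if cases == 0:
        return 0.0
    return expected_deaths(cases, population) / cases


below_capacity = effective_fatality_rate(5_000, 1_000_000)    # 300 need ICU, 600 beds
above_capacity = effective_fatality_rate(20_000, 1_000_000)   # 1,200 need ICU, 600 beds
```

With 1 million people there are 600 ICU beds, so the rate stays at 3.6% while demand is within capacity and climbs once it is exceeded.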
We also estimate the economic and human capital cost for each policy:

. . Lockdown only delays the virus spread
We ran the model without any policy and with the lockdown applied from day 30 to day 60. As shown in Figure 14, applying a lockdown for 1 month simply postpones the virus spread. Another problem is that a lockdown hits the country's economy hard, so it cannot be applied for a long time. Thus, even though lockdown is estimated to have the highest efficiency of 0.96, it might not be the best policy to apply. Further experiments are therefore required to identify how, when, and for how long the policies should be applied.
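The rebound effect can be reproduced with a very small simulation. The sketch below uses a textbook SEIRD model with forward-Euler daily steps, which may differ from the paper's Equations (1)-(6); the rate parameters (`beta`, `sigma`, `gamma`, `fatality`) and the initial number of infected are assumed values chosen for illustration, and only the lockdown window (days 30 to 60) and its efficacy (0.96) come from the text.

```python
def simulate_seird(days, population=1_000_000, beta=0.264, sigma=0.2,
                   gamma=0.1, fatality=0.01, lockdown=None, efficacy=0.96,
                   initial_infected=10):
    """Daily forward-Euler SEIRD run; returns a list of (S, E, I, R, D) tuples.

    `lockdown` is an optional (start_day, end_day) window during which the
    transmission rate is scaled by (1 - efficacy). All rates are assumptions.
    """
    s = float(population - initial_infected)
    e, i, r, d = 0.0, float(initial_infected), 0.0, 0.0
    history = [(s, e, i, r, d)]
    for day in range(days):
        in_lockdown = lockdown is not None and lockdown[0] <= day < lockdown[1]
        beta_t = beta * (1.0 - efficacy) if in_lockdown else beta
        new_exposed = beta_t * s * i / population   # S -> E
        new_infectious = sigma * e                  # E -> I
        removed = gamma * i                         # I -> R or D
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - removed
        r += (1.0 - fatality) * removed
        d += fatality * removed
        history.append((s, e, i, r, d))
    return history


baseline = simulate_seird(150)
with_lockdown = simulate_seird(150, lockdown=(30, 60))
```

Comparing the infected compartment (`index 2`) of `with_lockdown` on day 60 and day 150 shows the resurgence after the lockdown is lifted, because almost the entire population is still susceptible.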
. . Best initial response: Social distancing with contact tracing
Finally, we want to devise the best initial response to the virus. Using the inferred statistics, we conducted experiments on the policies and performed simulations to develop the optimal policy with minimal loss (both economic loss and loss of life). We designed an imaginary country with a population of 1 million and a GDP of $30,000 per capita, and the simulation spanned three months. We assume that policy-makers revise the policy every month, and that a policy is applied exhaustively, partially (50% efficacy), or not at all. Policies could be applied together. The goal was to minimize the cost.
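The search space described above (three intensity levels, revised monthly, over three months, with policies combinable) is small enough to enumerate exhaustively. The sketch below illustrates that enumeration only; `enumerate_schedules`, `best_schedule`, and the toy loss are hypothetical, and the loss function is a caller-supplied placeholder that stands in for the simulated economic and human capital loss.

```python
from itertools import product

LEVELS = (0.0, 0.5, 1.0)  # not applied, partial (50% efficacy), applied exhaustively


def enumerate_schedules(policies, months=3, levels=LEVELS):
    """Yield every schedule as {policy: (level_month_1, ..., level_month_n)}."""
    per_policy = list(product(levels, repeat=months))
    for combo in product(per_policy, repeat=len(policies)):
        yield dict(zip(policies, combo))


def best_schedule(policies, loss_fn, months=3):
    """Return the schedule with the smallest loss under `loss_fn`."""
    return min(enumerate_schedules(policies, months), key=loss_fn)


def toy_loss(schedule):
    """Placeholder loss for demonstration only; not the paper's cost model."""
    return sum(1.0 - level for level in schedule["a"]) + sum(schedule["b"])


best = best_schedule(["a", "b"], toy_loss)
```

Each policy has 27 possible three-month schedules, so four policies give 27^4 = 531,441 combinations, which remains tractable for a fast generative simulator.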

FIGURE 14
A lockdown delays the virus spread, but cannot prevent it. In this simulation, the lockdown policy was imposed from day 30 to day 60 (1 month in total). The gray graph represents the baseline situation with no policy applied; the green graph indicates the case when a lockdown is applied. The blue and red lines mark the start and end of the lockdown, respectively. We can see that although the lockdown sharply decreases the number of exposed and infected cases, it cannot prevent the virus from spreading after the lockdown is lifted on day 60. The number of exposed and infected cases rises again. Given the cost of the lockdown, it is impossible to maintain it for long periods of time, making it less preferable than less costly policies. Thus, the conclusion is in alignment with those of prior works ( , ). (A) Susceptible cases, (B) Exposed cases, (C) Infected cases, (D) Recovered cases, (E) Death cases.

. . . Results
The results after three months for some important policy combinations are shown in Table 3. The full loss trajectories of important policies are shown in Figure 15.
The best policy identified so far is contact tracing with social distancing, with a loss of around 2 billion dollars. Without intervention, the loss in the imaginary nation is $197.9 billion.
Scaling up to the US population, the simulated economic loss reaches $65 trillion, which is comparable to the world GDP of $85 trillion, implying that intervention must be enforced in the initial stage of the pandemic.
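The scaling step is a simple proportion. The sketch below reproduces it under the assumption of a US population of roughly 330 million, a figure not stated in the text; the no-intervention loss of $197.9 billion and the simulated population of 1 million are taken from the text, and the names are illustrative.

```python
SIMULATED_LOSS_USD = 197.9e9        # no-intervention loss in the imaginary country
SIMULATED_POPULATION = 1_000_000    # population of the imaginary country
US_POPULATION = 330_000_000         # assumption: approximate US population


def scale_loss(loss, from_population, to_population):
    """Scale a loss linearly with population size."""
    return loss * to_population / from_population


us_loss_usd = scale_loss(SIMULATED_LOSS_USD, SIMULATED_POPULATION, US_POPULATION)
us_loss_trillions = us_loss_usd / 1e12
```

Linear scaling ignores differences in GDP per capita and hospital capacity, so the result is an order-of-magnitude indication only.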
Generally, social distancing coupled with contact tracing incurs less loss than lockdown or social distancing alone. They are all strong interventions compared to masks and hygiene mandates, and they all avoid a tremendous loss compared to doing nothing or imposing only the masks and hygiene mandates. The human cost of a mild intervention seems to be significantly higher than the economic cost of a strong intervention. Nonetheless, the masks and hygiene mandates still halved the loss suffered when doing nothing. Contact tracing coupled with social distancing reduces the economic and human capital loss by 98% compared to doing nothing. Although it is as efficient as lockdown (Section 5), its economic and human capital costs are at least 8% lower over a 3-month period. The optimal policy in our setting is contact tracing and social distancing for three months with additional hygiene and masks mandates for the first month. Hygiene and masks mandates play some role in minimizing the loss, although the improvement is marginal.

FIGURE 15
The (daily) accumulated loss incurred in each intervention. Social distancing coupled with contact tracing incurred the least loss, followed by lockdown, social distancing, and masks and hygiene mandates, with no policy incurring the biggest loss.
Therefore, we can conclude that social distancing and contact tracing are the most efficient policies in our setting. Indeed, countries that enjoyed initial success in controlling the virus cases, e.g., South Korea, Vietnam, and Australia, applied social distancing and contact tracing as their primary policies.

. . Limitations and implications
Due to the challenge of separating the policy effects, our study has some limitations within which our findings need to be interpreted carefully. First, we have investigated major policies applied in relatively wealthy countries. For economic loss estimations, we utilized policies' costs reported from various developed countries using different sources. Depending on the parameters adjusted for a particular country, the results might be different. Second, the estimated efficiency of each policy in Section 5 is measured on the most successful cases. Although it provides a meaningful upper bound for each policy's efficiency, we cannot simply assume that every country can achieve this maximum efficiency. Therefore, the "best" initial response in Section 6.4 should be understood under the context that every policy can be feasibly applied to its fullest extent. Third, future work can utilize our model and extend it by considering the confounding effects of other interventions or changes in the susceptible population size. Such a model can be used to estimate the efficiencies of the policies applied in the later stages of the pandemic.
Given its unique socioeconomic state, each country has its own feasible stringency and price tag for each measure. Many factors contribute to this variability, such as public acceptance, political climate, and government priority. To cope with that, we provide a flexible end-to-end pipeline, which can be tailored to each country's specific needs. Decision-makers can adjust the corresponding parameters or apply their country's cost estimation to adapt to their own situation. They can also exclude policy settings that are infeasible in their country when running the simulation. The flexibility to control the model's parameters allows countries to see how the pandemic will play out under different scenarios and build their own strategies based on the model's output.

. Conclusion
Recent research on COVID-19 propagation analysis has provided a deeper understanding of the transmission processes occurring during the past 1.5 years. Epidemiological models point out the key factors that affect the spread of the virus, including the basic reproduction number, virus incubation period, and daily infection number. In the present study, we have moved one step further to gauge the efficacy of early-stage policies in responding to the pandemic, taking into account the economic cost of each policy and its benefit in slowing down the virus. Detailed analysis from 10 countries suggests that social distancing, coupled with contact tracing, is the most efficient policy among major initial interventions. From the data of Asian countries, we derive the meaningful result that close contact tracing could protect citizens from the pandemic to a degree comparable to lockdowns, without inducing as much cost. Going one step further, we carefully designed a simulated country and gauged the efficacy of each policy combination. Our testbed allows end-users to control various parameters to suit their country's situation. Through the process of overcoming COVID-19, we are gaining a clearer understanding of the trade-off between virus prevention and economic loss. As we have seen in many countries, it is crucial to identify each policy's efficiency and cost and to estimate the best time and intensity to impose it before it is too late. We hope that our research will assist every nation in responding to possible future pandemics.

Data availability statement
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.