Evidence That Higher Temperatures Are Associated With a Marginally Lower Incidence of COVID-19 Cases

Seasonal variations in COVID-19 incidence have been suggested as a potentially important factor in the future trajectory of the pandemic. Using global line-list data on COVID-19 cases reported until 17th of March 2020 and global gridded weather data, we assessed the effects of air temperature and relative humidity on the daily incidence of confirmed COVID-19 local cases at the subnational level (first-level administrative divisions). After adjusting for surveillance capacity and time since first imported case, average temperature had a statistically significant, negative association with COVID-19 incidence for temperatures of −15°C and above. However, temperature only explained a relatively modest amount of the total variation in COVID-19 cases. The effect of relative humidity was not statistically significant. These results suggest that warmer weather may modestly reduce the rate of spread of COVID-19, but anticipation of a substantial decline in transmission due to temperature alone with onset of summer in the northern hemisphere, or in tropical regions, is not warranted by these findings.


INTRODUCTION
Pandemic COVID-19, caused by a beta-coronavirus named SARS-CoV-2 first identified in Wuhan, China (1), has spread rapidly. This spread was pronounced in temperate regions of the northern hemisphere, coinciding with winter (2). The number of cases reported in countries in tropical regions is lower (2), with most low- and middle-income countries having weaker detection and response capacity (3). To date, spread of COVID-19 has been minimal in high-income southern hemisphere countries such as Australia and New Zealand, which were in their summer season when the first cases were reported at the end of January and February, respectively (4,5). There has been much speculation about whether warmer temperatures are associated with decreased COVID-19 transmission, similar to what is observed for many viral respiratory infections (6). Higher temperatures were shown to have a protective effect against transmission of severe acute respiratory syndrome (SARS) in 2002-2003 (7), possibly due to the decreased survival of the SARS-CoV on surfaces at higher temperatures (8). Decreased aerosol spread at higher temperatures is another possible mechanism, as observed for human influenza viruses (9), though the role of aerosols in SARS-CoV-2 transmission remains unclear (10)(11)(12)(13).
Several studies have investigated the association between weather variation (principally temperature and humidity) and COVID-19 spread (14)(15)(16)(17)(18). However, studies published to date have several important limitations. Firstly, existing studies have not distinguished between imported and locally acquired infections. This is potentially a significant source of bias, as imported infections are not related to weather conditions at the location at which they are detected. For example, 62.5% of COVID-19 cases in Australia (as of May 10th 2020) were acquired overseas (19), and the proportion was even higher earlier in the pandemic. Secondly, most studies have not taken variation in capacity to detect emerging infections into account; this is particularly relevant for interpreting data on the spread of COVID-19 in the first few weeks of the pandemic. Finally, no studies have conducted a global analysis using COVID-19 data consistently aggregated at subnational level, which reflects limitations of current COVID-19 reporting. For example, a recent global analysis (17) had COVID-19 data available at a mixture of city, province and country level. Country-level COVID-19 data were matched to weather data for the capital city, which masks the significant weather variation that can occur within countries.
At present, consistent global datasets on COVID-19 cases, or the public health interventions implemented in response to COVID-19, are not available at subnational level. This significantly limits efforts to disentangle effects of weather variation from effects of public health interventions since widespread "lockdown" and other substantial control measures were initiated. However, detailed COVID-19 data from the first few weeks of the pandemic, prior to widespread implementation of interventions following the declaration of a pandemic, could be informative for understanding the association between COVID-19 and weather variation. A partially complete global open line list of all COVID-19 cases reported since the start of the pandemic, including detailed location and epidemiological information for each case, presents an opportunity for detailed analysis of COVID-19 and weather at subnational level (20). Therefore, this study aimed to analyze seasonal variation in COVID-19 at subnational level, taking limitations of existing studies into account.

Study Design
This population-based open cohort study investigated the effect of weather-related variables (air temperature and relative humidity) on daily COVID-19 case counts at the beginning of the pandemic. The daily case count was modeled at the level of the first-level administrative division (ADM1) in which the cases occurred, by constructing a daily time series of COVID-19 cases for each ADM1 based on the date of case confirmation.

Setting and Participants
An open-source line list of confirmed COVID-19 cases was downloaded on March 18th 2020 (20). The line list included data on laboratory-confirmed cases from December 29th 2019 up to March 17th 2020 for all countries, including China. Cases included patients who had been admitted for treatment in hospitals and patients who did not require hospital admission. At that stage, the COVID-19 outbreak had just been declared a global pandemic by the World Health Organization (on March 11th 2020). Over 179,000 cases had been confirmed in 100 countries (21). Although community transmission was already confirmed in many countries of the Western Pacific and European regions, most countries only announced stringent national measures ("lockdowns") the week following the pandemic declaration, or later.
All ADM1 associated with at least one confirmed case of COVID-19 in the source dataset were included in the analysis, excluding Hubei province in China. Within these ADM1, all cases for which either a date of case confirmation or a date of onset of symptoms was available were included in the analysis.
Hubei province was excluded from the analysis as case reports of unusual pneumonia-like illness precede confirmation of the first confirmed COVID-19 case in Hubei province by several weeks (i.e., the observation period is incomplete), and case confirmation was likely substantially delayed or missed for many early cases. Further, it remains unknown whether a single or multiple spillover event(s) initiated transmission in Hubei. Ongoing animal to human transmission alongside human to human transmission may have occurred early in the outbreak, and it is unclear what the impact of weather conditions would have been on these spillover events. Last, widespread implementation of interventions started substantially earlier in Hubei than in the rest of China and the world.

Outcome Variable
We modeled the daily count of COVID-19 cases classified as local cases in each ADM1, from the date of first case report in the ADM1 to March 17th. Confirmed cases from the line list were classified as imported when travel history was reported in the associated data or as local otherwise.
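The construction of the outcome series can be sketched as follows. This is a minimal illustration in Python, not the authors' code. The record keys (`adm1`, `confirmed`, `travel_history`) are hypothetical, and the sketch assumes that the first case report in an ADM1 may be either local or imported and that days without a local case enter the series as zeros.

```python
from collections import Counter
from datetime import date, timedelta


def classify_case(travel_history):
    """Return 'imported' when any travel history is recorded, else 'local'."""
    if travel_history and travel_history.strip():
        return "imported"
    return "local"


def daily_local_counts(cases, end_date):
    """Build, for each ADM1, a daily series of local case counts.

    cases: iterable of dicts with hypothetical keys 'adm1', 'confirmed'
    (a datetime.date) and 'travel_history' (str or None).
    Each series runs from the first case report in the ADM1 (assumed here
    to be any case, local or imported) to end_date, zeros included.
    """
    first_report = {}
    counts = Counter()
    for case in cases:
        adm1, day = case["adm1"], case["confirmed"]
        if adm1 not in first_report or day < first_report[adm1]:
            first_report[adm1] = day
        if classify_case(case.get("travel_history")) == "local":
            counts[(adm1, day)] += 1

    series = {}
    for adm1, start in first_report.items():
        n_days = (end_date - start).days + 1
        days = [start + timedelta(days=i) for i in range(n_days)]
        # Counter returns 0 for days with no local case.
        series[adm1] = [(day, counts[(adm1, day)]) for day in days]
    return series
```

Classifying every case without a recorded travel history as local follows the rule stated above; it will tend to label imported cases with missing travel information as local.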

Exposure Variables
We assumed that the weather variables would influence the transmission of SARS-CoV-2 at the time of infection. The dates of case confirmation were available, while the dates of infection were estimated as follows. The minimum time from infection to confirmation was estimated to be 3 days (22). The maximum time from infection to confirmation was estimated at 20 days, comprising an incubation period of up to 14 days (22) and the time to seek medical diagnosis and obtain a laboratory confirmation, which was estimated at up to 6 days (value estimated from the data). This value is close to the median of 7 days between the onset of symptoms and hospital admission reported in Wuhan (1). Therefore, the primary exposure variables were the mean air temperature and mean relative humidity at the ADM1 centroid between 3 and 20 days before the date of case confirmation. The temperature variable was included both as simple and squared terms to allow for nonlinear associations with the outcome. Due to model convergence issues, the humidity variable was only included as a simple term.
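The exposure window can be illustrated with a short Python sketch. It is not the authors' code. It assumes that both ends of the 3 to 20 day window are inclusive and that days missing from the weather series are skipped; the function and argument names are hypothetical.

```python
from datetime import date, timedelta

MIN_LAG_DAYS = 3   # minimum time from infection to confirmation
MAX_LAG_DAYS = 20  # 14 days incubation + 6 days to diagnosis and confirmation


def window_mean(daily_values, confirmed,
                min_lag=MIN_LAG_DAYS, max_lag=MAX_LAG_DAYS):
    """Mean of a daily weather series over the exposure window.

    daily_values: dict mapping datetime.date -> float (e.g. temperature at
    the ADM1 centroid). The window covers confirmed - max_lag to
    confirmed - min_lag, both inclusive (an assumption of this sketch).
    Returns None when no value is available in the window.
    """
    values = []
    for lag in range(min_lag, max_lag + 1):
        day = confirmed - timedelta(days=lag)
        if day in daily_values:
            values.append(daily_values[day])
    if not values:
        return None
    return sum(values) / len(values)
```

With inclusive bounds the window spans 18 days, so each daily observation is matched to the average conditions over roughly two and a half weeks rather than to the weather on the day of confirmation.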

Potential Confounders
Four variables were included in the model as potential confounders: the time since the first reported case in the ADM1 (to account for right-censoring), the median age of the national population (United Nations database, https://ourworldindata.org/age-structure, to account for the higher incidence of severe cases in older people, which may be more readily detected), the population density in the ADM1 (Socioeconomic Data and Application Center, https://sedac.ciesin.columbia.edu) and the capacity of the country to detect an emerging infectious disease. The Global Health Security Index (GHSI) (https://www.ghsindex.org/) publishes a country-level score (out of 100) for capacity for "early detection and reporting for epidemics of potential concern." This indicator is a weighted average of indicators related to laboratory systems, real-time surveillance and reporting, epidemiology workforce, and data integration between human, animal, and environmental health sectors.

Spatial Data Sources and Processing
Spatial data on ADM1 were obtained from the Global Administrative Areas dataset (https://gadm.org/, accessed March 4th 2020). This corresponds to the first-level administrative unit within each country, usually described as a state or province. The reported coordinates of each confirmed case (variably a point location, city centroid, or different subnational administrative levels) were used to determine the ADM1 in which the case occurred.
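Assigning a case to an ADM1 from its coordinates is a point-in-polygon problem. The sketch below shows one standard approach (ray casting) in plain Python. It is an illustration only: it treats each ADM1 as a single simple ring of (longitude, latitude) vertices, ignores holes, multi-part units and map projection, and uses hypothetical names.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: True if (lon, lat) lies inside the ring.

    ring: list of (lon, lat) vertices; the last vertex is implicitly
    joined back to the first.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_adm1(lon, lat, adm1_rings):
    """Return the name of the first ADM1 whose ring contains the point.

    adm1_rings: dict mapping a (hypothetical) ADM1 name to its ring.
    Returns None when the point falls outside every ring.
    """
    for name, ring in adm1_rings.items():
        if point_in_polygon(lon, lat, ring):
            return name
    return None
```

Because reported coordinates varied in precision (point location, city centroid, or administrative centroid), a match of this kind only identifies the ADM1 containing the reported coordinate, not necessarily the place of infection.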

Weather Data Sources and Processing
Daily gridded temperature data at 0.5-degree spatial resolution were obtained from the Climate Prediction Centre (NOAA/OAR/ESRL PSD, Boulder, Colorado, USA, https://www.esrl.noaa.gov/psd/data/gridded/data.cpc.globaltemp.html, accessed March 18th 2020). The daily temperature at the ADM1 centroid was calculated by taking the average of the maximum and minimum temperatures at the centroid coordinates for each day in the time series. Missing values for a given 0.5-degree cell and day were imputed from, in order of preference: the temperature in the neighboring spatial cells (Moore neighborhood) on the same day; the temperature for the previous or next day in the same cell; or the relevant temperature from another dataset from the same source, the NCEP Daily Global Analyses. This dataset contains analyzed gridded temperature data at 2.5-degree spatial resolution (https://www.esrl.noaa.gov/psd/data/gridded/data.ncep.html, accessed March 18th 2020). Last, daily surface-level relative humidity data at 2.5-degree spatial resolution were obtained from the same dataset (NCEP Daily Global Analyses). The daily relative humidity at the ADM1 centroid was extracted for each day in the time series. The processing of weather data was performed in the R environment (23) using packages ncdf4 (24) and rgdal (25).
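The daily mean and the imputation order described above can be sketched as follows. This is an illustrative Python sketch rather than the R processing used in the study. It assumes that available neighbors (or adjacent days) are combined by a simple average, which the text does not specify, and it ignores the wrap-around of the grid in longitude. The data layout and names are hypothetical.

```python
def daily_mean_temperature(tmax, tmin):
    """Daily temperature as the average of the daily maximum and minimum."""
    if tmax is None or tmin is None:
        return None
    return (tmax + tmin) / 2.0


def impute_cell(grid, day, row, col, fallback=None):
    """Fill a missing grid value using the order of preference in the text.

    grid: dict mapping an integer day index -> 2D list of floats or None.
    1. average of available Moore neighbors (8 surrounding cells), same day;
    2. average of the previous and/or next day in the same cell;
    3. fallback value taken from a coarser dataset (supplied by the caller).
    Averaging in steps 1 and 2 is an assumption of this sketch.
    """
    value = grid[day][row][col]
    if value is not None:
        return value

    field = grid[day]
    neighbors = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            r, c = row + dr, col + dc
            if 0 <= r < len(field) and 0 <= c < len(field[0]):
                if field[r][c] is not None:
                    neighbors.append(field[r][c])
    if neighbors:
        return sum(neighbors) / len(neighbors)

    adjacent = [grid[d][row][col] for d in (day - 1, day + 1)
                if d in grid and grid[d][row][col] is not None]
    if adjacent:
        return sum(adjacent) / len(adjacent)

    return fallback
```

Spatial neighbors are tried first because temperature on a given day is usually more similar between adjacent 0.5-degree cells than between consecutive days at the same location.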

Statistical Methods
The statistical model was based on the generalized linear mixed effect regression framework, using a negative binomial distribution, implemented through the glmmTMB package (26). A zero-inflation component, with no predictor variables, was added to account for the large proportion of zeros observed in the daily time series. Continuous variables were centered and scaled. The ADM1 location was included as a random effect (27). Initial data exploration indicated the presence of autocorrelation in the model results, which was adequately controlled for by adding an autoregression term of order 2. Diagnostic plots as well as model validation tests were obtained using the DHARMa package (28), to assess the distribution of predicted values, the presence of outliers, as well as residual dispersion and zero inflation. The bias-adjusted Akaike information criterion (AICc) was used to compare related models: the null model (no fixed effects, random effect only), a full model with all the variables described above, and three nested models obtained by removing the temperature and humidity variables, one at a time and together. The dataset and R script used for statistical modeling are provided as Supplementary Materials.
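Two of the steps above are simple enough to write out: centering and scaling a continuous predictor, and the small-sample correction that turns AIC into AICc. The Python sketch below is an illustration under stated assumptions, not the authors' R script. It scales by the sample standard deviation (n − 1 denominator) and takes the log-likelihood, number of estimated parameters, and number of observations as inputs supplied by the fitted model.

```python
import math


def standardize(values):
    """Center a predictor to mean 0 and scale it to unit standard deviation.

    Uses the sample standard deviation (n - 1 denominator); this choice is
    an assumption of the sketch.
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def aicc(log_likelihood, n_params, n_obs):
    """Bias-adjusted Akaike information criterion.

    AIC  = -2 * logLik + 2k
    AICc = AIC + 2k(k + 1) / (n - k - 1)
    where k is the number of estimated parameters and n the number of
    observations. Lower values indicate a better trade-off between fit
    and complexity.
    """
    aic = -2.0 * log_likelihood + 2.0 * n_params
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
```

When comparing nested models fitted to the same data, only differences in AICc matter; the correction term shrinks toward zero as the number of observations grows relative to the number of parameters.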

RESULTS
As of March 18th 2020, the line list contained detailed data on 26,032 cases, of which 25,861 cases had a valid confirmation date entry and were used for the analysis. One additional case only had the date of onset of symptoms, and its case confirmation was estimated to have occurred 6 days later, based on the mean delay observed in the data. A total of 407 ADM1 units worldwide reported at least one case and were included in the model. This included 30 provinces in China as well as 377 ADM1-level reports in 99 other countries (Figure 1). There were 2,322 daily, ADM1-level observations with at least one reported case (Table 1).
Model comparison showed that the full model and the model including only the temperature variables and confounding variables provided a similar fit to the data (Table 2). Excluding the relative humidity variable did not significantly modify the AICc. However, excluding the temperature variables led to a substantial increase in AICc. The marginal pseudo R-squared was 21% for the full model, decreasing to 13% after removing the temperature effect.
The confounding variables corresponding to the population characteristics (median age and population density) were not significant predictors of the daily COVID-19 incidence (Table 3). The early detection capacity of the country had a statistically significant, positive association with the outcome. The time since the first case confirmation in the ADM1 had a statistically significant, negative association with the outcome. Air temperature had a statistically significant quadratic association with the case incidence: an increase in air temperature was associated with a decreasing incidence for temperatures above −15°C (Figure 2). The relative humidity had a negative association with the case incidence which was not statistically significant.
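One way to read a quadratic temperature effect is through its turning point. On the log scale of the expected count, a linear plus squared term in standardized temperature z gives a parabola with its vertex at z* = −b1 / (2·b2); when b2 is negative, predicted incidence falls as temperature rises above the vertex. The Python sketch below back-transforms that vertex to degrees Celsius. It assumes the squared term is the square of the standardized temperature, which the text does not state, and all names are hypothetical. The numbers in the usage note are illustrative only and are not the fitted coefficients from Table 3.

```python
def turning_point_celsius(beta_linear, beta_squared, temp_mean, temp_sd):
    """Temperature (°C) at the vertex of a quadratic effect.

    The linear predictor is assumed to contain
        beta_linear * z + beta_squared * z**2,
    with z = (temperature - temp_mean) / temp_sd.
    With beta_squared < 0 the curve is concave, so predicted incidence
    decreases with temperature above the returned value.
    """
    if beta_squared == 0:
        raise ValueError("no squared term: the association is monotonic")
    z_star = -beta_linear / (2.0 * beta_squared)
    return temp_mean + temp_sd * z_star
```

For example, with hypothetical values `beta_linear = -0.3`, `beta_squared = -0.1`, a mean of 8°C and a standard deviation of 10°C, the vertex lies 1.5 standard deviations below the mean temperature.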

DISCUSSION
This study provides new evidence for the impact of weather-related parameters on the incidence of COVID-19 cases. There was a statistically significant effect of the average air temperature during the 3 preceding weeks on the COVID-19 case incidence in our study. However, the effect size was quite small, as shown by the pseudo R-squared estimates and changes in predicted values. The COVID-19 case incidence was negatively correlated with the air temperature for temperatures above −15°C. Notably, the effect of relative humidity was not statistically significant. This study provides evidence that there may be seasonal variability in transmission of SARS-CoV-2, but this analysis does not imply that temperature alone is a primary driver of COVID-19 transmission. The observed association may not be due directly to temperature, but to correlated factors such as human behaviors during cold weather. Countries with higher early detection capacity had a higher reported case incidence. We suggest that this association is due to a detection bias, where countries with better disease detection capacity simply detect more cases. Current reports of the pandemic show that almost all countries across the globe have been affected by SARS-CoV-2, despite the large variance in their capacity to prevent, detect, and respond to disease outbreaks (29). However, we expected the opposite association, where countries with higher early detection capacity would have fewer cases due to their ability to implement control measures earlier. Surprisingly, the association of the time since the first case confirmation in the ADM1 with the outcome was negative. We believe this is linked to considerable underreporting of cases in the global data source used for this study. Manual assessment of the time series showed that as the time since the first case increased, the number of cases reported for each ADM1 did not follow the expected exponential pattern. We suggest that this is due to the overwhelming number of cases confirmed as the outbreak becomes more severe, resulting in limited availability of information on individual cases after the initial stages. The two issues discussed here are common in epidemiological analyses based on reported cases.
These results complement those of several recently published studies investigating the weather effect in China (14)(15)(16), Brazil (30), Spain (31), and at a global level (17,18). There are also many related studies not yet peer-reviewed and available as pre-prints. The published studies for China and Brazil as well as one of the global studies showed a negative association between the air temperature and COVID-19 case or mortality incidence, using different lag periods (14-16, 18, 30). The two other studies did not find evidence of a relationship between COVID-19 cases and air temperature (17,31). Four studies showed a negative association between relative or absolute humidity and COVID-19 incidence (14, 16-18) while a fifth showed that an increase in relative humidity was associated with an increase in number of COVID-19 cases (15).
There are several strengths to this analysis, which add to the evidence base for an association between weather variation and COVID-19. Most importantly, this study made use of detailed line list data, which enabled the first global analysis of COVID-19 cases at province or state level, and the categorization of COVID-19 cases as local or imported. The relevance of this potential bias is evident when considering countries such as Australia, where over 60% of COVID-19 cases to date were acquired overseas. Nonetheless, there are several important limitations to our analysis. The line list data used for this analysis were incomplete, compared to globally reported cases. Furthermore, despite using detailed case data, there was no consistent data available on many characteristics that affect the rate of spread within a region, especially the interventions initiated in response to the detection of imported or locally transmitted cases. Including data on implemented interventions to contain or mitigate COVID-19 in further analysis would provide additional insights into the effect of weather-related parameters.
FIGURE 2 | Predicted daily number of local cases of COVID-19 by 1st-level administrative unit according to average air temperature (upper panel) and relative humidity (lower panel) from 3 to 20 days before case confirmation. The gray area represents the 95% prediction interval.
Temperature and humidity have also been considered as factors influencing the spread of pandemic influenza and other respiratory tract viruses. Human pandemic influenza tends to show few seasonal trends upon emergence, while seasonal patterns appear during subsequent waves (32). These patterns have been linked with a more efficient transmission in cold and dry weather, in particular via aerosols (9,33). However, numerous other factors linked to the host, virus and environment are likely to play a role (34). Aerosol experiments on the 2009 H1N1 virus showed that the virus had a similar sensitivity to temperature and humidity as known seasonal influenza viruses (35). The authors suggested that the unusual timing of the H1N1 pandemic, with a high incidence in summer and autumn, may have been due to the lack of population immunity, which played a larger role in disease spread than temperature and humidity related factors. Our results regarding the effect of temperature on COVID-19 incidence are consistent with some of these characteristics. The possibility of similar recurrence and seasonality has been suggested for SARS-CoV-2 (36), though caution is warranted before extrapolating characteristics observed for pandemic influenza to pandemic COVID-19.

CONCLUSION
This study provides evidence of a modest association between warmer temperatures and lower COVID-19 incidence, for cases reported globally until March 17th 2020. Therefore, warmer weather may modestly reduce the rate of spread of COVID-19, but anticipation of a substantial decline in transmission due to temperature alone with onset of summer in the northern hemisphere, or in tropical regions, is not warranted by these findings.

ETHICS STATEMENT
The analysis made use of publicly available, anonymised data only. Therefore, no approval to conduct the study from an Ethical Review Board was sought. Nonetheless, the study was conducted in accordance with the Declaration of Helsinki, as revised in 2013.

AUTHOR CONTRIBUTIONS
AM, AC, CF, and MB-T contributed to the conception and design of the study. AM, AC, and CF processed the data. AM and RS performed the statistical analysis. AM and MB-T wrote the first draft of the manuscript. All authors contributed to the manuscript revision and read and approved the submitted version.