Examining Social Vulnerability and the Association With COVID-19 Incidence in Harris County, Texas

Studies have investigated the association between social vulnerability and SARS-CoV-2 incidence. However, few studies have examined small geographic units such as census tracts, examined geographic regions with large numbers of Hispanic and Black populations, controlled for testing rates, and incorporated stay-at-home measures into their analyses. Understanding the relationship between social vulnerability and SARS-CoV-2 incidence is critical to understanding the interplay between social determinants and implementing risk mitigation guidelines to curtail the spread of infectious diseases. The objective of this study was to examine the relationship between CDC's Social Vulnerability Index (SVI) and SARS-CoV-2 incidence while controlling for testing rates and the proportion of those who stayed completely at home among 783 Harris County, Texas census tracts. SARS-CoV-2 incidence data were collected between May 15 and October 1, 2020. The SVI and its themes were the primary exposures. Median percent time at home was used as a covariate to measure the effect of staying at home on the association between social vulnerability and SARS-CoV-2 incidence. Data were analyzed using Kruskal Wallis and negative binomial regressions (NBR) controlling for testing rates and staying at home. Results showed that a unit increase in the SVI score and the SVI themes were associated with significant increases in SARS-CoV-2 incidence. The incidence risk ratio (IRR) was 1.090 (95% CI, 1.082, 1.098) for the overall SVI; 1.107 (95% CI, 1.098, 1.115) for minority status/language; 1.090 (95% CI, 1.083, 1.098) for socioeconomic; 1.060 (95% CI, 1.050, 1.071) for household composition/disability, and 1.057 (95% CI, 1.047, 1.066) for housing type/transportation. When controlling for stay-at-home, the association between SVI themes and SARS-CoV-2 incidence remained significant. In the NBR model that included all four SVI themes, only the socioeconomic and minority status/language themes remained significantly associated with SARS-CoV-2 incidence. Community-level infections were not explained by a communities' inability to stay at home. These findings suggest that community-level social vulnerability, such as socioeconomic status, language barriers, use of public transportation, and housing density may play a role in the risk of SARS-CoV-2 infection regardless of the ability of some communities to stay at home because of the need to work or other reasons.


INTRODUCTION
The United States (US) has been severely affected by SARS-CoV-2, but cases in the US are not evenly distributed across the population (1). Emerging studies suggest that social vulnerability and inability to stay at home play a major role in increased cases of SARS-CoV-2 within a community. However, few studies have assessed these factors simultaneously (2)(3)(4)(5)(6)(7)(8)(9). Social vulnerability is the degree to which a community exhibits social conditions that may affect their ability to prevent serious injury, illness, or loss in the event of a disaster (10). CDC created the Social Vulnerability Index (SVI) for local public health organizations to assess and prioritize census tracts that may be particularly vulnerable to disasters such as a pandemic (10). Though CDC's SVI was not constructed with a pandemic in mind, the SVI can be very useful in identifying vulnerable populations, poverty, and living conditions which may make populations more susceptible to communicable diseases such as SARS-CoV-2 (10). Sheltering in place has been a primary prevention strategy to mitigate the spread of SARS-CoV-2. Research shows that staying at home is one of the most effective strategies in controlling the pandemic (11,12). However, disadvantaged communities may be less likely to be able to stay at home because of financial constraints; the inability to work from home compared to higher wage-earners; being a member of the essential workforce; having fewer savings, and having to work or risk losing income (11,12).
A growing number of studies have investigated the association between social vulnerability and SARS-CoV-2 incidence (13)(14)(15)(16). However, these studies have four important limitations: (1) few studies examine smaller geographic units such as census tracts; (2) few studies examine geographic regions with large numbers of Hispanic/Latino, Black, and Asian populations; (3) few studies control for testing rates; and, (4) even fewer incorporate stay-athome measures into the analysis. Controlling for stay-at-home measures can provide a more precise picture of how social vulnerability and staying at home impact SARS-CoV-2 incidence (4,9). Our study objective is to overcome these limitations and extend this work by examining the relationship between social vulnerability, proportion of the population staying at home, and SARS-CoV-2 infection rates in Harris County, Texas, by census tract.

Setting
This study analyzed SARS-CoV-2 incidence data reported in 783 census tracts in Harris County, Texas. Harris County has over 4.7 million people and is one of the most diverse regions in the country, with 44% of the population identifying as Hispanic or Latino, 29% identifying as White, 20% identifying as Black, and 7% identifying as Asian (17). The County has a median annual income of $61,705 and a poverty level of 15%, higher than the national average of 13% (17).

Outcome: SARS-CoV-2 Incidence
For this ecological study, census tract-level SARS-CoV-2 incidence reported to Harris County Public Health Department and the City of Houston Public Health Department from May 15 to October 1, 2020, was analyzed. The date of diagnosis of a SARS-CoV-2 case was defined as the first positive test result. The dependent variable, SARS-CoV-2 incidence, was calculated as both a case count and a rate per 1,00,000 using the census tract population estimated by the US Census 2019 American Community Survey 5-year estimates. SARS-CoV-2 incidence was calculated over the entire time period.

Exposure: Social Vulnerability Index
Independent variables consisted of an overall social vulnerability index and four social vulnerability themes. The social vulnerability index used in this study was acquired through publicly-available data from the CDC's 2018 Social Vulnerability Index (SVI) (10). The CDC uses US Census data to determine the social vulnerability of every census tract. The SVI ranks each census tract on 15 social factors, including poverty, lack of vehicle access, and crowded housing, and groups them into four indices (themes): socioeconomic status, household composition/disability, minority status/language, and housing type/transportation. CDC offers the SVI for the United States and for each individual state. For state data, census tracts are ranked against other tracts within the state. Texas rankings were used for these analyses (10). CDC calculates the SVI by summing scores and converting these scores into a percentile rank ranging from 0 to 1, with higher scores denoting greater vulnerability (10). SVI measures were initially described by quintiles and then analyzed as a continuous variable in the negative binomial regression models. The SVI index was rescaled to an integer value by multiplying the SVI percentage by 10 so that the index ranged from 0 to 10 for ease of interpretation. A 1-unit change in the results refers to a decile (0.1) increase on the original scale.

Covariates
The percent of individuals fully staying-at-home was acquired from SafeGraph, a company that aggregates anonymized smartphone location data in the United States. SafeGraph compiles individuals' cell phone data into aggregate measures on the census tract-level over time and then calculates the proportion of smartphone users who spent all day at home for each date based on inferring the user's overnight location during the previous six weeks (18). Proportion staying-at-home was calculated by cumulatively averaging the measures over the study time by census tract. The percent time at home measure was initially described and mapped by quintiles and analyzed as continuous variables in the models (19). These data correlates well with other smartphone location data, with the Gallup survey data on staying-at-home measures, and workplace visits reported by Google COVID-19 Community Mobility Reports (20).
Population estimates were acquired from the US Census 2019 American Community Survey 5-year estimates. The testing data was acquired from the Harris County Public Health Department and the City of Houston Public Health Department. All verified tests were used regardless of type (molecular or antigen). Testing rates were calculated per census tract by dividing the number of tests from May 15 to October 1, 2020 by the population within a census tract. The distribution of the testing rate was examined and subsequently transformed by taking the log.

Analysis
This analysis included 783 census tracts in Harris County. Descriptive statistics of all variables were calculated. A bivariate map of SARS-CoV-2 incidence and the overall SVI by census tract was created to visualize the association between these two variables. A Kruskal-Wallis test was conducted to examine differences in incidence rates among the overall SVI quintiles, SVI theme quintiles, and the percent time at home measure quintiles. Correlations were performed for the association between all variables. Testing rates by census tract were examined and the association between testing rates, SVI, and SARS-CoV-2 incidence were analyzed. A negative binomial regression (NBR) model was used to analyze variations in SARS-CoV-2 incidence counts across census tracts. This model was used to handle the overdispersion of count-based data containing only nonnegative integer values and consider the independent explanatory neighborhood-level characteristics (15,21). Since case count data was used instead of incident rates, all models controlled for the population size. The log of population size and testing rate were used. Multivariable negative binomial regressions were used to examine the relationship between each individual SVI theme and SARS-CoV-2 incidence counts. The analysis plan was approached in steps. First, NBR models were conducted for each SVI theme controlling for the log of population and log of testing rates. Next, NBR models were performed for each SVI theme independently while controlling for the log of population, log of testing rates, and median percent time fully at home. The final NBR model included all four SVI themes controlling for the log of population, log of testing rates, and median percent time fully at home. The incidence risk ratio was calculated by exponentiating the regression coefficient and is interpreted as an increase or decrease in the risk of SARS-CoV-2 incidence associated with a one unit change in the independent variable (21). All models examined SARS-CoV-2 incidence between May 15 and October 1, 2020; this time period represents the second wave of the pandemic in Harris County. Significance testing is at p < 0.05. Data were analyzed using STATA 15.0 statistical software (StataCorp LLC) and ArcGISPro 2.8 (Environmental Systems Research Institute, Inc., California, USA). This study was reviewed by The University of Texas Health Science Center at Houston Committee for the Protection of Human Subjects.

RESULTS
Between May 15 and October 1, 2020, there were 140,853 incident cases of SARS-CoV-2 within the Harris County Public Health Department and City of Houston Health Department jurisdictions. Testing rates by census tract ranged from 1,535.433 to 26,105.210 per 100,000 population in the studied census tracts ( Table 1). The incidence of SARS-CoV-2 ranged from 433 to 9,791 per 100,000 population in the studied census tracts. The overall SVI, the socioeconomic status theme, the minority status/language theme, and the housing type/transportation theme were significantly associated with the percent-at-home measure. However, the SVI measures explained only a small amount of variance in the percent staying-at-home measure. Figure 1 displays a bivariate map of SARS-CoV-2 incidence and the overall SVI by census tract. A non-random pattern of increased SARS-CoV-2 incidence was observed in Harris County census tracts. Table 2 displays the distribution of SARS-CoV-2 incidence rates among SVI quintiles, and the median percent time completely at home quintiles. There were significant differences in the incidence rates of SARS-CoV-2 by quintiles for the overall SVI and its themes, with higher incidence rates for more vulnerable census tracts. The initial NBR models of the social vulnerability measures indicated that all SVI themes were significantly associated with cumulative SARS-CoV-2 incidence after controlling for population and testing rates ( Table 3). For every unit increase in the overall SVI index, the incidence of SARS-CoV-2 increased 1.090 times (95% CI, 1.082, 1.098). The socioeconomic status theme was also associated with a 1.090 times (95% CI, 1.083, 1.098) increase in the incidence of SARS-CoV-2 for every unit increase. Similarly, the minority status/language theme was associated with a 1.107 (95% CI, 1.098, 1.115) increase in the incidence of SARS-CoV-2 for every unit increase. The household composition/disability theme, and the housing type/transportation theme were associated with ∼1.060 times and 1.057 times increase, respectively, in the incidence of SARS-CoV-2 for every unit increase. After adding the staying-at-home measure to each model, the association between the SVI themes and SARS-CoV-2 incidence did not significantly change.
In the multivariable model ( Table 4) after controlling for testing rate, the staying-at-home measure, and entering all SVI themes, the socioeconomic theme and the minority status/language theme remained significantly associated with SARS-CoV-2 incidence. The minority status/language theme was strongly associated with SARS-CoV-2 incidence with a 1.078 (95% CI, 1.063, 1.093) increase in the incidence of SARS-CoV-2 for every unit increase. The household composition/disability theme and the housing type/transportation theme were no longer significantly associated with SARS-CoV-2 incidence after taking into account the other SVI themes, testing rates, and the stayingat-home measure.

DISCUSSION
In this ecological study of Harris County census tracts between May 15 and October 1, 2020, we found that CDC's social vulnerability index and all four SVI themes were positively associated with SARS-CoV-2 incidence, even after controlling for testing rates and the stay-at-home measure. The magnitude of the associations of all four themes was substantial and remained positively associated with SARS-CoV-2 incidence, even after controlling for testing rates and the stay-at-home measure. After controlling for testing rates, the staying-athome measure, and all SVI domains simultaneously, the socioeconomic theme and the minority status/language theme remained strongly associated with SARS-CoV-2 incidence, while the household composition/disability theme and the housing type/transportation theme did not.
This study adds to the growing body of literature seeking to understand community-level social determinants associated with SARS CoV-2 incidence. Other studies using CDC's SVI at both the census tract and county-levels had similar findings. Studies of US counties have found significant associations between SVI and SARS-CoV-2 incidence rates (14,16). In a study that analyzed census tracts in Louisiana, Biggs and colleagues found a 52% increase (IRR, 1.52; 95% CI, 1.40-1.65) in SARS CoV-2 incidence per 0.01 increase in the overall SVI and a significant association for each of the SVI domains (13).
Socioeconomic inequalities have historically been strongly associated with disease incidence during pandemics, such as the Spanish influenza pandemic of 1918-1919 (22). The association between the socioeconomic status and SARS CoV-2 incidence in the present study is consistent with previous studies (13,14,16). For example, Karmakar and colleagues found an 11% increase in SARS CoV-2 incidence for every 0.1 increase in incidence for every 0.01 increase in the socioeconomic status theme among Louisiana census tracts (13). These findings are consistent with the present study.
Our study found significant associations between housing type/transportation and SARS CoV-2 incidence. Studies using data on both the county and census tract levels have found significant yet small associations between housing type/transportation and SARS CoV-2 incidence (13,14,16). These findings may suggest that housing type and transportation are important factors that increase one's risk of SARS CoV-2 exposure, but that other SVI factors are more influential to one's actual risk.
The SVI theme that was most strongly associated with SARS CoV-2 incidence in this study was minority status/language. This study's findings on the strong association between the minority status/language theme and SARS CoV-2 incidence are consistent with other studies. A study of US counties found that for every 0.1 increase in the minority status/language domain was associated with a 21.7% increase in SARS-CoV-2 incidence rate (IRR, 1.22; 95% CI, 1.20-1.23; P < 0.001) (14). Similarly, another study found that those in the most vulnerable counties for the minority status/language domain had a 4.94-fold greater risk (IRR, 4.94; 95% CI, 3.91-6.24) of SARS CoV-2 incidence (16). The strong association between the minority status/language domain and SARS CoV-2 incidence could possibly be explained through multiple routes but needs further exploration (23). Racial/ethnic disparities are often explained by socioeconomic, educational, systemic racism, structural inequalities, and housing differences. Some researchers have suggested that racial/ethnic disparities in SARS CoV-2 incidence may be related to the racial distribution of work in essential industries and the inability for these workers to social distance or stay at home (24). Though findings suggest that racial/ethnic minority populations have a greater risk for SARS CoV-2 infection, this is not due to some inherent biological or genetic predisposition. Research has shown that these constructs are not biologically based and are primarily social constructs (25). This study suggests that racial and ethnic minority populations may have greater risk for SARS CoV-2 infection even after controlling for other social vulnerabilities, testing rates, and the stay-at-home measure. More research is needed to elucidate how minority status and language affects SARS CoV-2 incidence and what other factors are driving these associations.

Strengths and Limitations
This study has several strengths. The study used secondary data of SARS-CoV-2 incidence, the stay-at-home measure, and social vulnerability. This research contributes to the growing body of evidence on the usefulness of the Social Vulnerability Index and infectious disease outbreaks. Moreover, using anonymized cell phone data to inform public health officials may be a novel approach for emergency response. Few studies have assessed disparities and SARS-CoV-2 in populations with cultural and economic diversity, such as Harris County, and few studies have assessed these relationships at the census tract level.
This study also has several limitations. The SARS-CoV-2 data is from positive cases captured through testing data and does not represent all positive cases in Harris County. The study  design is ecological, cannot determine causality or temporality, and does not have a comparison group. Ecological studies such as this study can be prone to ecological fallacy because individuals will be aggregated to the group level and risk factors cannot be linked directly to the outcome. The stay-at-home data may not represent the general population because smartphone use may vary across socio-demographic groups. The stay-at-home data is approximated from smartphone location data and is most likely missing not at random. Though missingness may not be random, the data provider corrects for this (18). Additionally, there may be limitations in the accuracy and precision of the measurements used.

Public Health Implications
This study found that the CDC's SVI to be significantly associated with SARS-CoV-2 disease incidence even after controlling for testing rate and the stay-at-home measure. By further developing our understanding of the community-level factors that affect infectious disease transmission, public health planners throughout the US can be better prepared for future infectious disease response efforts. This study adds to the growing literature by demonstrating the importance of community-level social vulnerabilities such as minority composition and language barriers associated with poor health outcomes. Understanding why these community-level determinants consistently play such a large role in an individual's health outcomes is key. These findings confirm the usefulness of the SVI in preparing for an infectious disease outbreak. Because the SVI is easily accessible from CDC by census tract, state and local officials can use the index before, during, and after outbreaks occur. The SVI may be used for targeting communities for testing, treatment, health education, and in increasing resources, such as protective equipment, food, and housing assistance. These results also suggest the need to develop culturally and linguistically appropriate mitigation strategies and health education for communities with limited English language proficiency and may face other barriers such as trusting authorities, food insecurity, child care, and housing challenges. Developing partnerships with communitybased organizations and community influencers may help with these challenges. Future studies are needed to confirm placebased influences on infectious diseases and how communitybased interventions can address disparities in health. Disparities in the current COVID-19 pandemic and any future epidemics or pandemics and their root causes must be addressed through bold policy action and societal investment.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary files, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
GT conceived the study, analyzed the data, and wrote the manuscript. MO helped conceive the analysis plan and helped write the manuscript. RR produced the map and participated in writing. J-MY helped conceive the analyses, interpret the results, and participated in writing. AR and MB participated in data curation and writing. MP helped to conceive the research plan and participated in writing. EB oversaw the research, participated in writing, and helped in interpretation of the results. All authors contributed to the present manuscript.