Social Determinants Predict Outcomes in Data From a Multi-Ethnic Cohort of 20,899 Patients Investigated for COVID-19

Importance: The COVID-19 pandemic exploits existing inequalities in social determinants of health (SDOH) in disease burden and access to healthcare. Few studies have examined these emerging disparities using indicators of SDOH.
Objective: To evaluate predictors of COVID-19 test positivity, morbidity, and mortality and their implications for inequalities in SDOH and for future policies and health care improvements.
Design, Setting, and Participants: A cross-sectional analysis was performed on all patients tested for COVID-19 across the Mount Sinai Health System (MSHS) up until April 26, 2020. Patients were tested on the basis of symptoms together with either a history of travel to at-risk regions or close contact with a confirmed case.
Main Outcomes and Measures: The primary outcome was death from COVID-19. Secondary outcomes were test positivity and morbidity (e.g., hospitalization and intubation caused by COVID-19).
Results: Of 20,899 tested patients, 8,928 tested positive, 1,701 were hospitalized, 684 were intubated, and 1,179 died from COVID-19. Age, sex, race/ethnicity, New York City borough (derived from the first 3 digits of zip code), and English as preferred language were significant predictors of test positivity, hospitalization, intubation, and COVID-19 mortality in multivariable logistic regression analyses.
Conclusions and Relevance: People residing in poorer boroughs were more likely to be burdened by and die from COVID-19. Our results highlight the importance of integrating comprehensive SDOH data into healthcare efforts with at-risk patient populations.


INTRODUCTION
The novel coronavirus disease 2019 (COVID-19), caused by the SARS-CoV-2 virus, has triggered twin crises in public health and healthcare, resulting in over 18.8 million confirmed cases and 350,000 deaths globally. COVID-19 was first confirmed in the United States on January 31, 2020 in Washington State. Outbreaks of COVID-19 were subsequently reported in California and New York.
Published data characterize the resulting COVID-19 disease as one exploiting existing inequities in social determinants of health (SDOH) (1). The World Health Organization (WHO) defines SDOH as the conditions in and under which individuals are born, grow, work, and live, and the broader set of forces and systems (e.g., political, social, and economic policies and systems, social norms, and societal institutions) that shape the conditions and quality of daily life (2). In the U.S., Black/African American and Hispanic/Latinx individuals are more likely to be diagnosed with COVID-19 and to experience COVID-19-related morbidity and mortality, especially those living in poor and crowded housing conditions or having preexisting comorbidities, low or limited incomes, or "essential" occupations (3). Data also reveal rapidly increasing inequities in disease burden among other minorities, including Native American/American Indians (NA/AI) (3). Yet few studies have examined these emerging inequities using indicators of SDOH (4). This study examines associations between COVID-19 test positivity, morbidity, and mortality and potential indicators of SDOH, including patient- and neighborhood-level variables, and discusses the implications of these drivers of health inequities for policies, research, and healthcare improvements.

MATERIALS AND METHODS
All patients who underwent testing for SARS-CoV-2 across the Mount Sinai Health System (MSHS) from March 28, 2020 to April 26, 2020 were included in this cross-sectional analysis (N = 20,913). Those with incomplete data were excluded (14 without a documented sex), or categorized as "Other/Unknown" (1,745 with unknown race/ethnicity, three for whom zip code/NYC borough was absent), or "Not asked" (669 were documented as "not asked" for smoking status). SARS-CoV-2 testing was performed in MSHS on respiratory specimens evaluated by real-time reverse transcription polymerase chain reaction (RT-PCR) methods. Testing was performed on patients who had fever or signs/symptoms suggestive of respiratory illness and either a history of travel to affected areas (China, Japan, Italy, South Korea, and Iran) or close contact with a confirmed case of COVID-19 infection in the prior 2 weeks.
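The cohort accounting in the paragraph above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the variable names are illustrative and are not taken from the study's code.

```python
# Counts reported in the text above; variable names are illustrative.
tested_total = 20_913      # all patients tested in the study window
excluded_no_sex = 14       # excluded: no documented sex

analytic_n = tested_total - excluded_no_sex

# Records that were retained but recoded rather than excluded.
recoded = {
    "race_ethnicity_other_unknown": 1_745,
    "borough_other_unknown": 3,
    "smoking_not_asked": 669,
}

print(f"Analytic cohort: {analytic_n:,}")
for label, count in recoded.items():
    print(f"{label}: {count:,} ({count / analytic_n:.1%} of cohort)")
```

The resulting analytic cohort matches the 20,899 patients reported in the Results.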
The MSHS Ethics Committee approved a waiver of documentation of informed consent; de-identified patient data were obtained from the MSHS Data Warehouse (https://msdw.mountsinai.org/). Data included demographic, behavioral, and clinical variables. In our analysis, SDOH were defined in line with the 2017 WHO definition (2). Similar to Gottlieb et al., we selected specific sociodemographic characteristics within the database as proxies for SDOH, which have well-known significance within the social/behavioral science literature, to "translate" patient-level data into population-level data. Specifically, we used a consensus-building process involving our team of health disparities researchers, biostatisticians, and medical experts to develop these proxies of SDOH and health disparities in the existing dataset (e.g., age, gender, race/ethnicity, borough derived from the first three digits of the zip code of the patient's place of residence, smoking status, and English as a preferred language; each of which is discussed further below). As suggested by Bazemore et al., we also utilized census information from the American Community Survey (ACS) to introduce neighborhood-level information such as median income by NYC borough (5,6). The resulting SDOH proxies and markers of health disparity were included in the models used in the current study.
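As an illustration of deriving borough from the first three digits of a zip code, the sketch below maps a zip prefix to the residence categories used in this study. The prefix table is an assumption based on general postal prefix areas; the exact mapping used for the study dataset is not stated, and the names `ZIP3_TO_BOROUGH` and `borough_from_zip` are hypothetical.

```python
# ASSUMED prefix table for illustration only; not the study's actual mapping.
ZIP3_TO_BOROUGH = {
    "100": "Manhattan", "101": "Manhattan", "102": "Manhattan",
    "103": "Staten Island",
    "104": "The Bronx",
    "112": "Brooklyn",
    "111": "Queens", "113": "Queens", "114": "Queens", "116": "Queens",
    # Prefix 110 straddles the Queens/Nassau border; assigned to Long Island
    # here purely as an assumption.
    "110": "Long Island", "115": "Long Island", "117": "Long Island",
    "118": "Long Island", "119": "Long Island",
}


def borough_from_zip(zip_code):
    """Return a residence category from the first three digits of a zip code.

    Missing or unrecognized zip codes fall into "Other".
    """
    if not zip_code:
        return "Other"
    prefix = str(zip_code).strip()[:3]
    return ZIP3_TO_BOROUGH.get(prefix, "Other")


for z in ["10029", "11201", "11373", "07030", None]:
    print(z, "->", borough_from_zip(z))
```

A three-digit prefix is coarser than a full zip code, which helps preserve de-identification but means that areas on a borough boundary cannot be assigned unambiguously.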
We compared characteristics of patients according to the following outcomes: SARS-CoV-2 RT-PCR result (tested positive) for all patients who presented for testing, and, among those who tested positive, hospitalization, intubation, and mortality. Data were summarized as medians (interquartile range) for continuous variables and frequency (%) for categorical variables, where appropriate. We tested for bivariate associations between demographic, behavioral, and clinical variables using Chi-square (χ²) tests for categorical variables and Wilcoxon Rank Sum tests for the continuous variable "Age." Patient characteristics are summarized in Table 1.
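To make the bivariate testing step concrete, the following is a minimal sketch of a Pearson chi-square test for a single 2x2 table, written in plain Python rather than the R software used for the study. It applies no continuity correction, so its result may differ from a default R call. The counts in `example` are hypothetical and are not study data.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table: [[a, b], [c, d]] of observed counts.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# HYPOTHETICAL counts, not study data: rows = sex, columns = test result.
example = [[60, 40],   # male: positive, negative
           [45, 55]]   # female: positive, negative
stat, p = chi_square_2x2(example)
print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

The closed-form p-value only holds for one degree of freedom; variables with more than two categories, such as borough, would need the general chi-square distribution.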
Multivariable logistic regression was performed for each of the outcomes listed above. All of the listed variables were included to control for potential confounding, as each covariate was considered to be of clinical and/or social relevance. Statistical analyses were performed using R Statistical Software with a type I error rate of 0.05 (7).
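As a rough illustration of how adjusted odds ratios arise from a multivariable logistic model, the sketch below fits a two-covariate model to grouped binary data by plain gradient ascent. This is not the study's code: the analyses were run in R, whose fitting routine differs, and the cell counts, covariate choice, and function names here are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_grouped_logistic(cells, steps=20000, lr=0.5):
    """Fit logistic regression on grouped data by batch gradient ascent.

    cells: list of (features, events, total), one entry per covariate pattern.
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    p = len(cells[0][0])
    n = sum(total for _, _, total in cells)
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for features, events, total in cells:
            x = [1.0] + list(features)
            prob = sigmoid(sum(b * v for b, v in zip(beta, x)))
            resid = events - total * prob   # observed minus expected events
            for j in range(p + 1):
                grad[j] += resid * x[j]
        # Step along the gradient of the mean log-likelihood.
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# HYPOTHETICAL cells, not study data.
# features = (male, non_english_preferred); then events and total per cell.
cells = [
    ((0, 0), 10, 100),
    ((1, 0), 20, 100),
    ((0, 1), 20, 100),
    ((1, 1), 35, 100),
]
beta = fit_grouped_logistic(cells)
odds_ratios = [math.exp(b) for b in beta[1:]]
print("Adjusted odds ratios (male, non-English preferred):",
      [round(v, 2) for v in odds_ratios])
```

Each exponentiated coefficient is the odds ratio for that covariate with the other held fixed, which is the sense in which the estimates in Table 2 are "adjusted."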

RESULTS
The study population consisted of 20,899 patients who underwent testing for SARS-CoV-2, of whom 8,928 tested positive, 1,701 were hospitalized, 684 were intubated, and 1,179 died from COVID-19. Median age in the overall population was 54 years and 50.2% were male. Patients' place of residence was distributed as follows: 44.3% Manhattan, 21.8% Brooklyn, 16.6% Queens, 7.2% The Bronx, 2.3% Long Island, 1.0% Staten Island, and 6.7% "Other." In univariate analysis, nearly all evaluated predictors were significantly associated with the four COVID-19 outcomes. Positive test status was significantly associated with being older, male, a racial/ethnic minority, a current smoker, or a non-primary English speaker, and with having comorbidities (e.g., asthma, hypertension, and diabetes; see Table 1). Patients living in Queens had significantly higher test positivity than those from all other boroughs in the MSHS catchment area. Hospitalization results mirrored those of test positivity, with notable exceptions. COVID-19-positive patients with chronic obstructive pulmonary disease (COPD) and those living in the Bronx and Manhattan were significantly more likely to be hospitalized; non-primary English speakers were not (Table 1). Once hospitalized, the COVID-19-positive patients who were significantly more likely to be intubated were older; male; non-primary English speakers; residents of Queens; and had COPD, hypertension, diabetes, or chronic kidney disease. Those COVID-19-positive patients who died were significantly more likely to be older; male; White; former smokers; residents of Brooklyn and Queens; and comorbid with COPD, hypertension, diabetes, and chronic kidney disease (Table 1).
Table 1 notes: Categorical variables are presented as frequencies (%), and age is presented as mean with standard deviation. Borough of residence is presented as percentage of column for the overall population and percentage of row for each of the outcomes (test positivity, hospitalization, intubation, and COVID-19 mortality). We tested for bivariate associations between demographic, behavioral, and clinical variables using Chi-square (χ²) tests for categorical variables and Kruskal-Wallis tests for continuous variables.
Multivariable regression analyses are presented in Table 2, which shows the adjusted odds of each of the outcomes. Increasing age and male gender are significantly associated with higher odds (Table 2). While residents of Brooklyn and Long Island had a significantly higher risk of testing positive and of being intubated than residents of Manhattan, residents of Queens did not have a higher risk of testing positive. After adjustment for each of the other social determinants and comorbidities listed in Table 2, those who speak English as their preferred first language have a significantly lower risk of testing positive, being intubated, and dying from COVID-19 than those who do not. Comorbidities such as hypertension, obesity, and chronic kidney disease were significantly associated with a higher risk of mortality from COVID-19 in adjusted analyses, while COPD, diabetes, and HIV were not. Those with a diagnosis of asthma had a significantly lower risk of a positive test and of mortality from COVID-19, while those with a cancer diagnosis had a significantly lower risk of each outcome (Table 2).
There is an inverse relationship between mortality rate and the median household income of the borough in which patients reside (Figure 1): lower median income corresponds with a higher mortality rate.
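One simple way to quantify an inverse relationship of this kind is a rank correlation across areas. The sketch below computes Spearman's rho in plain Python; it is offered as an illustration only, since the study presents this relationship graphically and does not report a correlation coefficient. The income and mortality values are hypothetical placeholders, not the values plotted in Figure 1.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation for paired samples without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# HYPOTHETICAL placeholder values for five areas, not study data.
median_income = [40_000, 55_000, 62_000, 70_000, 85_000]
mortality_pct = [16.0, 14.5, 15.0, 12.0, 10.5]

rho = spearman_rho(median_income, mortality_pct)
print(f"Spearman rho = {rho:.2f}")
```

A value near -1 indicates that areas with lower median income tend to rank higher on mortality. With only a handful of boroughs, any such coefficient would be descriptive rather than inferential.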

DISCUSSION
COVID-19 continues to transform daily life for patients and those involved in patient care, highlighting health disparities and further confirming the link between SDOH and health outcomes. Our study is the first to demonstrate an impact of multiple indicators of SDOH and health disparities on COVID-19 test positivity, morbidity, and mortality outcomes in a patient population. We describe demographic, clinical, socioeconomic, and behavioral predictors of COVID-19 test positivity, morbidity, and mortality, at a time when more than 30% of all US cases of COVID-19 were in New York (4). Our results demonstrate that each of the six SDOH and health disparities indicators examined in this study contributed to statistically significantly worse outcomes for each of the study endpoints. This work highlights the importance of considering SDOH when caring for patients during the COVID-19 pandemic.
Our novel finding supporting the growing role of SDOH in the SARS-CoV-2 pandemic is that zip code (place of residence) and English as preferred first language are predictive of COVID-19 outcomes. Borough of residence has previously been described as showing strong associations with health literacy and income (8). Preferred first language is indicative of acculturation and is also a marker of health practices (9,10). Prior research on infectious disease pandemics has demonstrated that SDOH-related inequities create conditions for disease transmission and unequal burdens of disease morbidity and mortality (11). The significance of neighborhood environments and their relationship to health may be due to clustering of poverty with other forms of disadvantage. Figure 1 demonstrates that lower median income corresponds with a higher mortality rate. Although the relationship between a social determinant of health such as zip code and median household income is relatively easy to examine and conceptualize, the underlying relationship is more complex. There are multiple relevant constituents to consider for each individual social determinant we examined in this study, some of which may overlap. Age, sex, and race/ethnicity have been addressed in the context of COVID-19 outcomes, where older, male, African-American, and Hispanic/Latinx patients have been documented to have worse outcomes (4, 12-16). Furthermore, there is a known association between smoking prevalence and location of residence in New York City (17). Smoking status itself is a health behavior that has been shown to magnify health disparities in groups with low incomes and without employment (18). An example of the potential overlap of this indicator of SDOH (i.e., zip code) and medical comorbidities is the presence of respiratory conditions such as asthma, which is exacerbated by pollution and allergens that are more common in disadvantaged neighborhoods.
Other potentially relevant factors that could contribute to this finding relating to zip code/place of residence include food insecurity, commute times, educational attainment, housing density, number of persons per household, race/ethnicity, proximity to healthcare, occupation, and healthcare literacy, all of which are relevant components affecting how place of residence may impact health outcomes (5, 19-22). There are complex relationships between social factors and health, and so the causal roles of some social factors are not without controversy (23).
In line with prior studies, our findings provide insight into the intersectional, and unequal, effects of SDOH on COVID-19 testing, morbidity, and mortality, and highlight a need for including SDOH in COVID-19 studies as a means of contextualizing and interpreting such data (24). The COVID-19 pandemic is highlighting that one's zip code matters more than one's genetic code (25). In light of our findings, an index combining all potential patient- and community-level COVID-19 vulnerability indicators would be a valuable tool. Such a tool could guide healthcare and public health efforts to identify clinically and community-modifiable targets for intervention. To facilitate its development, public health and healthcare researchers and practitioners will need to work in tandem to integrate SDOH into electronic medical records (EMR) for use in such analyses, while also quickly implementing policies and procedures to enhance standardized collection of these data (26). While there is much literature describing the overlap between SDOH and comorbidities, some advocate that SDOH are themselves comorbidities; considering them as such could facilitate their broader documentation and collection (27).
This study has limitations, including a study population drawn from one metropolitan area in the Northeast U.S. and de-identified data collected from an EMR database. First, generalizability may be low since trends in data from New York City and state may not mirror those in other affected areas (e.g., New Orleans, Chicago). Additionally, proximity to the MSHS facilities may have created a selection bias that influenced surveillance and outcomes data (e.g., positive tests were unequally represented across boroughs based on socioeconomic status). However, a strength of this study is that it is drawn from a region with uniform criteria for testing and with uniform restrictions and regulations imposed to reduce the spread of COVID-19. Further, EMR are not designed with SDOH data collection as a primary objective, which makes such collection a challenge. We could not include a precise, comprehensive set of SDOH (e.g., personal income) in the analyses because of the need to balance privacy against analytic detail in de-identified datasets. Complex SDOH indicators such as health literacy or neighborhood-level factors may have ICD-10 Z codes or require combinations of several Z codes. However, healthcare organizations may lack standardized measures for collecting these data, and codes may be imprecise in representing specific SDOH or inconsistently utilized in healthcare encounters. Thus, SDOH (e.g., neighborhood-level social and environmental factors) are underrepresented in EMR, and existing ICD-10 Z codes for complex SDOH are not uniformly utilized.
Future studies linking SDOH and clinical characteristics that reflect patient populations' unique social and economic experiences (e.g., community norms and functioning, housing density, and food deserts) are needed to draw more stringent associations between SDOH and potential COVID-19 patient and subgroup outcomes among medically and socially vulnerable groups (e.g., smoking history and mortality).

CONCLUSION
To our knowledge, this cross-sectional analysis represents the first large-scale analysis of multiple SDOH and indicators of health disparity in COVID-19 surveillance and clinical outcomes among patients in the New York City metropolitan area. Our results demonstrate differences in outcome based not just on race/ethnicity, age, gender, median household income, and borough of residence, but also on preference for English as a first language.
This study highlights the importance of integrating comprehensive SDOH data into public health and healthcare efforts with at-risk patient populations and communities to improve the quality of COVID-19 prevention, surveillance, management, and policies.

DATA AVAILABILITY STATEMENT
The data analyzed in this study is subject to the following licenses/restrictions: Dataset remains the property of Mount Sinai Data Warehouse: msdw.mountsinai.org. Requests to access these datasets should be directed to msdw.mountsinai.org.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Mount Sinai Hospital IRB. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements. The MSHS Ethics Committee approved a waiver of documentation of informed consent as de-identified patient data was utilized in this study.

AUTHOR CONTRIBUTIONS
DL, HG, NM, and AT: study conception and design. DL, BK, and AT: acquisition of data. HG, AL, NM, and DL: drafting the manuscript. BK and AT: revising the manuscript critically for important intellectual content. All authors: analysis and interpretation of data. All persons who meet authorship criteria are listed as authors and all authors certify that they have participated sufficiently in the work to take public responsibility for the content, including participation in the concept, design, analysis, writing, or revision of the manuscript.

ACKNOWLEDGMENTS
We would like to acknowledge the contributions of the following, without whom this work would not have been possible: the staff of Mount Sinai Hospitals for their dedication and care in documenting the care of and caring for patients; the Department of Scientific Computing at the Icahn School of Medicine at Mount Sinai for the collation of this data.