Multiethnic Investigation of Risk and Immune Determinants of COVID-19 Outcomes

Background: Disparate COVID-19 outcomes have been observed between Hispanic, non-Hispanic Black, and White patients. The underlying causes for these disparities are not fully understood.
Methods: This was a retrospective study utilizing electronic medical record data from five hospitals within a single academic health system based in New York City. Multivariable logistic regression models were used to identify demographic, clinical, and lab values associated with in-hospital mortality.
Results: A total of 3,086 adult patients with self-reported race/ethnicity information presenting to the emergency department and hospitalized with COVID-19 up to April 13, 2020, were included in this study. While older age (multivariable odds ratio (OR) 1.06, 95% CI 1.05–1.07) and baseline hypoxia (multivariable OR 2.71, 95% CI 2.17–3.36) were associated with increased mortality overall and across all races/ethnicities, non-Hispanic Black (median age 67, interquartile range (IQR) 58–76) and Hispanic (median age 63, IQR 50–74) patients were younger and had different comorbidity profiles as compared to non-Hispanic White patients (median age 73, IQR 62–84; p < 0.05 for both comparisons). Among inflammatory markers associated with COVID-19 mortality, there was a significant interaction between the non-Hispanic Black population and interleukin-1-beta (interaction p-value 0.04).
Conclusions: This analysis of a multiethnic cohort highlights the need for inclusion and consideration of diverse populations in ongoing COVID-19 trials targeting inflammatory cytokines.


BACKGROUND
Reports from the United States, the United Kingdom, and Brazil have highlighted racial disparities in the COVID-19 pandemic (Baqui et al., 2020;Price-Haywood et al., 2020;Oppel et al., 2020;Williamson et al., 2020). National studies from the United Kingdom and Brazil have found race to be an independent predictor of death (Baqui et al., 2020;Williamson et al., 2020). In the United States, Black and Hispanic individuals have disproportionately high rates of infection, hospitalization, and mortality (Hsu et al., 2020;Holtgrave et al., 2020;Price-Haywood et al., 2020;Oppel et al., 2020). These disparities have been attributed to greater representation of Black and Hispanic persons in essential services and a higher burden of comorbidities in minority communities, among others.
While Black and Hispanic individuals in the United States have been disproportionately affected by the pandemic, the majority of published studies investigating COVID-19 mortality risk factors have been in cohorts of individuals with predominantly European or Asian ancestry (Docherty et al., 2020;Gupta et al., 2020;Grasselli et al., 2020;Petrilli et al., 2020;Zhou et al., 2020). Few US studies have directly examined mortality risk factors and their effect sizes in Black or Hispanic as compared to White individuals (Gu et al., 2020;Price-Haywood et al., 2020). Rigorous analysis to establish risk factors and molecular predictors for each population is urgently needed.
We sought to identify race/ethnic-specific clinical and immune factors of mortality using a diverse cohort of White, Black, and Hispanic COVID-19 patients admitted to a single health system in New York. In addition to baseline characterization, we conducted stratified and interaction term analyses to identify risk and immune factors that may affect the outcomes of each patient population. The systematic analyses revealed population-specific effects of multiple risk factors that were previously unknown, highlighting the importance of including diverse patient populations and tailored consideration in precision medicine for COVID-19.

Study Setting
The study was conducted within the Mount Sinai Health System, which is an academic healthcare system comprising 8 hospitals and more than 410 ambulatory practice locations in the New York metropolitan area. This analysis involves patients who presented to five hospitals: The Mount Sinai Hospital (MSH) (1,134 beds), Mount Sinai West (514 beds), and Mount Sinai Morningside (495 beds) in Manhattan; Mount Sinai Brooklyn (212 beds); and Mount Sinai Queens (235 beds).

Data Sources
Data were captured by the Epic electronic health record (Epic Systems, Verona, WI, USA) and directly extracted from Epic's Clarity and Caboodle servers. This de-identified dataset was developed and released by the Mount Sinai Data Warehouse (MSDW) team, with the goal of encompassing all COVID-19-related patient encounters within the Mount Sinai system, accompanied by selected demographics, comorbidities, vital signs, medications, and lab values. As part of de-identification, all patients over the age of 89 had their age set to 90.
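The age top-coding rule used in de-identification can be illustrated with a short Python sketch. This is not the MSDW implementation, which is not described in the text; the function name and default arguments are hypothetical.

```python
def top_code_age(age, cutoff=89, cap=90):
    """Return `cap` for any age above `cutoff`; otherwise return the age unchanged.

    Hypothetical helper mirroring the rule stated in the text:
    patients over the age of 89 have their age set to 90.
    """
    return cap if age > cutoff else age


# Illustrative ages only (not study data).
ages = [45, 89, 90, 97]
deidentified = [top_code_age(a) for a in ages]
```

One consequence of this rule is that age behaves as a censored variable at 90, so a per-year odds ratio for age may be slightly distorted among the oldest patients.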
This study utilized de-identified data extracted from the electronic health record and as such was considered nonhuman subject research. Therefore, this study was granted an exemption from the Mount Sinai institutional review board (IRB) review and approval process.

Patient Population and Definitions
The MSDW dataset captured any patient encounters at a Mount Sinai facility with any of the following: a COVID-19-related encounter diagnosis, a COVID-19-related visit type, a SARS-CoV-2 lab order, a SARS-CoV-2 lab result, or a SARS-CoV-2 lab test result from the New York State Department of Health's Wadsworth laboratory. For this study, patients with COVID-19-related visits to the emergency department (ED) on or before April 13, 2020, were identified, and patients who were admitted were selected. Their hospitalization outcomes through June 2, 2020, were observed.
Our analysis was limited to adults over 18 years old who were hospitalized for COVID-19 through a Mount Sinai ED. Self-reported race and ethnicity were classified into 3 mutually exclusive categories: non-Hispanic (NH) White (White), NH Black (Black), and Hispanic (Supplementary Table 1). COVID-19 positivity was determined by a positive or presumptive positive result from a nucleic acid-based test for SARS-CoV-2 in nasopharyngeal or oropharyngeal swab specimens. Baseline vital signs were the first documented vital signs for the encounter. Hypoxia was defined as oxygen saturation of less than 92%. Baseline labs were defined as the first lab value within 24 h of the start of the encounter.
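The baseline definitions above can be sketched in code. The following Python snippet is a minimal illustration, not the authors' pipeline (the study used R). The function names, the (timestamp, value) layout of lab results, the inclusive 24 h window, and the decision to ignore results timestamped before the encounter start are all assumptions.

```python
from datetime import datetime, timedelta

HYPOXIA_THRESHOLD = 92  # percent oxygen saturation, per the definition in the text


def is_hypoxic(spo2_percent):
    """Hypoxia is an oxygen saturation strictly below 92%."""
    return spo2_percent < HYPOXIA_THRESHOLD


def baseline_lab(encounter_start, results, window_hours=24):
    """Return the earliest lab value within `window_hours` of the encounter start.

    `results` is a list of (timestamp, value) pairs (assumed layout).
    Returns None when no result falls inside the window, i.e. the lab is missing.
    """
    window_end = encounter_start + timedelta(hours=window_hours)
    in_window = [(t, v) for t, v in results if encounter_start <= t <= window_end]
    if not in_window:
        return None
    return min(in_window, key=lambda pair: pair[0])[1]


# Illustrative values only (not study data).
start = datetime(2020, 4, 1, 8, 0)
crp_results = [
    (datetime(2020, 4, 1, 15, 30), 112.0),
    (datetime(2020, 4, 1, 9, 10), 87.5),
    (datetime(2020, 4, 3, 7, 0), 60.2),  # outside the 24 h window
]
baseline_crp = baseline_lab(start, crp_results)
```

Returning None for labs with no value in the window makes missingness explicit, which matters for the per-site missingness threshold applied later in the laboratory value analysis.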

University of Pennsylvania Cohort
Patients in the University of Pennsylvania cohort were identified based on a positive SARS-CoV-2 PCR test. Patients were screened and gave informed consent within 72 h of hospitalization. Clinical data were collected from electronic medical records into standardized case reports. Healthy donors (HDs) had no prior diagnosis or symptoms consistent with COVID-19. Recovered donors (RDs) were adults with a self-reported positive COVID-19 PCR test who recovered as defined by the Centers for Disease Control and Prevention. Cytokine levels were measured from peripheral blood plasma using a custom human cytokine 31-plex panel (EMD Millipore Corporation, Burlington, MA, USA; SPRCUS707), as described in Mathew et al. (2020).

Logistic Regression
The primary outcome was in-hospital mortality. Univariable and multivariable logistic regression analyses were used to identify factors associated with death. Race/ethnicity-specific risk factors were identified by 1) constructing stratified models for each racial category and 2) constructing models including interaction terms between race/ethnicity and other covariates. Separate interaction models were created to test the interactions of either Hispanic ethnicity or Black race with other covariates. Interactions were compared against the White race as the reference group.
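For concreteness, one common way to write an interaction model of this kind is shown below. The exact specification is not given in the text, so the notation is an assumption: G_i is an indicator for the group being compared against the White reference (Black or Hispanic), x_i is the covariate of interest, and z_i collects the remaining covariates.

```latex
\operatorname{logit} \Pr(\text{death}_i = 1)
  = \beta_0 + \beta_1 G_i + \beta_2 x_i + \beta_3 \,(G_i \times x_i) + \gamma^{\top} z_i
```

Under this form, exp(\beta_2) is the odds ratio for x in the White reference group, exp(\beta_2 + \beta_3) is the odds ratio in the comparison group, and the interaction p-value tests \beta_3 = 0. The stratified models instead fit the same covariates separately within each group, without the G_i terms.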
Demographic factors, comorbidities, initial vital signs, baseline lab values, and treatment facility site (Manhattan vs. Brooklyn/Queens) were analyzed as covariates. There was minimal clustering of outcomes by treatment site (ICC(r) = 0.026), and this was modeled as a fixed effect. Covariates were chosen a priori based on prior reports. The odds ratios (ORs) derived from the coefficients of each model were reported, along with the Wald-type confidence interval and p-values.
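The conversion from a logistic regression coefficient to an odds ratio with a Wald-type confidence interval can be sketched as follows. This is a generic illustration in Python rather than the R code used in the study; the function name is hypothetical, and the coefficient and standard error in the example are illustrative values, not model output from this cohort.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_wald_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error.

    The Wald interval is computed on the log-odds scale and then exponentiated:
    exp(beta - z * se), exp(beta + z * se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Illustrative coefficient and standard error (assumed values).
or_hat, ci_low, ci_high = odds_ratio_with_wald_ci(beta=0.997, se=0.1115)
```

Because the interval is symmetric on the log scale, it is asymmetric around the odds ratio itself, which is why reported intervals extend further above the point estimate than below it.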

Laboratory Value Analysis
Markers of inflammation, such as C-reactive protein (CRP), ferritin, and D-dimer, have been proposed as being correlated with COVID-19 severity. However, the missingness of these lab values varied across sites. Given the possibility of confounding by indication (if providers ordered these labs in more acutely ill patients), the analyses involving lab tests were limited to those obtained at the largest site (MSH) and those that had less than 15% missing values at that site. The cytokines interleukin-1-beta (IL-1b), interleukin-6 (IL-6), interleukin-8 (IL-8), and tumor necrosis factor-alpha (TNF-a) were exempted from this threshold because they were obtained on a subset of COVID-19 patients in the context of a study with broad inclusion criteria (Charney et al., 2020;Del Valle et al., 2020).
To test the associations of these lab values with mortality, each lab value was tested in its own race/ethnicity-stratified multivariable logistic regression model adjusting for age, sex, and hypoxia. The number of covariates in the model was limited due to the reduced sample size. Labs were standardized to a mean of 0 and an SD of 1 prior to regression analysis.
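The standardization step can be illustrated with a short Python sketch. It is not the authors' code: the function name is hypothetical, and the use of the sample standard deviation and the handling of missing values as None are assumptions.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and SD 1, leaving missing entries (None) in place.

    Uses the sample standard deviation (n - 1 denominator); this is an assumption,
    since the text does not say which SD estimator was used.
    """
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    sd = statistics.stdev(present)
    return [None if v is None else (v - mean) / sd for v in values]


# Illustrative lab values only (not study data).
z_scores = standardize([10.0, None, 20.0, 30.0])
```

After standardization, the odds ratio for a lab value is interpreted per one-SD increase, which makes effect sizes comparable across labs measured in different units.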

Statistical Analysis
Patient characteristics and baseline vitals and labs were described using medians and ranges for continuous variables and proportions for categorical variables. Continuous variables were compared using the Wilcoxon rank-sum test, and categorical variables were compared using Fisher's exact test. All statistical analyses and data visualizations were carried out using R 4.0.0 (The R Foundation, Vienna, Austria), along with the tidyverse, ggpubr, forestplot, and Hmisc packages (Wickham, 2017;Kassambara, 2020;Gordon and Lumley, 2021;Harrell et al., 2022). Statistical significance was defined as p < 0.05.

RESULTS
The hospitalized cohort included 3,086 adult patients. We excluded 78 patients with missing race or ethnicity data, 144 Asian patients, and 458 patients with other or unspecified race/ethnicity from the comparative/stratified analyses based on power considerations (Figure 1). [...] Hispanic) and atrial fibrillation (12.3% vs. 5% and 4.5%, p < 0.001 for both) than NH Black or Hispanic patients.
These results demonstrate differences in the distributions of demographic and clinical COVID-19 mortality risk factors by race/ethnicity.
We first evaluated the association of demographic, clinical, and laboratory variables with in-hospital mortality (Table 2, Supplementary Tables 4, 5). In a univariate analysis, NH Black (OR 0.73, 95% CI 0.59-0.91) and Hispanic (OR 0.61, 95% CI 0.49-0.76) populations were associated with lower mortality as compared to NH White. However, race/ethnicity was not an independent predictor of mortality after adjusting for age, sex, comorbidities, and baseline hypoxia (oxygen saturation <92% at the first measurement) (Figure 2A). Our finding is consistent with several cohort studies in the United States (Gold et al., 2020;Price-Haywood et al., 2020;Suleyman et al., 2020), although UK and Brazil studies have reported race as an independent predictor of mortality (Baqui et al., 2020;Williamson et al., 2020), possibly due to population differences.
Previous patient cohort analyses rarely considered race/ethnic-specific risk factors, which require stratified modeling within each population cohort. We conducted stratified analyses within ethnic groups to determine the population-specific effect sizes of clinical factors and comorbidity. Diabetes was associated with an OR of 1.67 (95% CI 1.09-2.58) in the stratified NH Black population and an OR of 0.99 (95% CI 0.61-1.63) in the NH White population (Figure 2B). Obesity was associated with an OR of 1.33 (95% CI 0.69-2.58) in NH Black, compared to an OR of 0.72 (95% CI 0.33-1.60) in NH White (Figure 2B). Increased age and baseline hypoxia were consistently associated with increased mortality across all three populations (Supplementary Table 6). Altogether, these results highlight the shared and specific clinical risk factors of COVID-19 mortality across populations.

Baseline Laboratory Values
We analyzed baseline lab values among patients admitted to the largest site in our dataset, MSH. This site had the most complete records for routine and inflammatory lab values. We defined baseline labs as the first lab value within 24 h of the start of the encounter.
Among common lab values, Hispanic patients had higher baseline alanine aminotransferase (ALT) values (median 33 vs. 28 U/L, p = 0.03) than NH White patients (Table 3), consistent with the increased prevalence of chronic liver disease among Hispanic patients.
Among inflammatory lab markers, NH Black patients had higher initial levels of procalcitonin (0.29 vs. 0.13 ng/ml, p < 0.001) and more abnormal ferritin levels (3.16 vs. 2.24 times the upper limit of normal, p = 0.03) as compared to NH White patients (Table 3). There were no significant differences in baseline D-dimer, lactate dehydrogenase (LDH), CRP, IL-1b, IL-6, IL-8, or TNF-a levels. Thus, any population-specific associations identified for these immune factors would reflect differences between patients with different outcomes within a population rather than baseline differences across populations.
To identify immune markers showing population specificity in predicting COVID-19 outcomes, we applied the multivariable regression model in each population-stratified cohort. Elevated levels of IL-1b were associated with a higher risk of mortality in Black (OR 2.35, 95% CI 1.13-4.86) compared to White patients (OR 0.78, 95% CI 0.41-1.51) (Figure 3, Supplementary Table 7). To validate the population specificity of these COVID-19 mortality-associated immune markers, we further utilized a multivariable model including interaction terms, finding a significant interaction between NH Black and IL-1b (p = 0.04), and suggestive but nonsignificant interactions between Hispanic and procalcitonin (p = 0.07) and IL-8 (p = 0.09) as compared to NH White (Supplementary Table 6).
Next, we sought to validate the immune marker findings of the MSH cohort in an independent dataset. We utilized immunoprofiling data from the University of Pennsylvania cohort (Mathew et al., 2020) to compare levels of serum cytokines and immunologic markers between diverse patient populations vs. HDs and RDs. Among the COVID-19 patients with available race/ethnicity data, eight were NH Black, three were NH White, and four were Asian. Additionally, there were ten HDs and twelve RDs with no available race/ethnicity data. We used the non-parametric Mood's median test to detect potential population differences in the median values of 13 measured immune markers against the combined cohort of HDs and RDs (HDs/RDs, Supplementary Table 8). NH Black patients had significantly higher median IL-6 (37.1 vs. 2.59, p = 0.003) and IP10 (227 vs. 55.6, p = 0.006) and lower median IL12p70 (1.98 vs. 3.41, p = 0.004) levels than HDs/RDs. IP10 was also found to be significant when comparing NH White or Asian patients with HDs/RDs (p < 0.05). Median IL-1b was 4.13 in NH Black patients compared to 2.66 in HDs/RDs (p = 0.115), a difference in the same direction as the MSH finding but not statistically significant.
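For readers unfamiliar with Mood's median test, a minimal two-group version is sketched below in Python. It is a generic illustration, not the authors' R analysis: the function name is hypothetical, values equal to the pooled median are counted as "not above", and no continuity correction is applied. Both choices are assumptions and can change the p-value in small samples such as those described here.

```python
import math
import statistics


def moods_median_test(group_a, group_b):
    """Two-group Mood's median test; returns (chi2, p_value).

    Each value is classified as above or not above the pooled median, and the
    resulting 2x2 table is tested with Pearson's chi-square (1 degree of freedom,
    no continuity correction).
    """
    grand_median = statistics.median(list(group_a) + list(group_b))
    a_above = sum(x > grand_median for x in group_a)
    b_above = sum(x > grand_median for x in group_b)
    table = [
        [a_above, len(group_a) - a_above],
        [b_above, len(group_b) - b_above],
    ]
    n = len(group_a) + len(group_b)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    if 0 in col_totals:
        # No value lies above the pooled median, so there is no evidence of a difference.
        return 0.0, 1.0
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical marker values (not study data).
patients = [5.0, 6.0, 7.0, 8.0]
controls = [1.0, 2.0, 3.0, 4.0]
chi2, p_value = moods_median_test(patients, controls)
```

The test compares medians only and ignores how far values lie from the pooled median, so it is robust to outliers but has low power with group sizes as small as those in this validation cohort.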

DISCUSSION
Racial disparities in COVID-19 infections and outcomes have become apparent in both the United States and elsewhere (Baqui et al., 2020;Price-Haywood et al., 2020;Oppel et al., 2020;Williamson et al., 2020). The causes of these disparities are complex and multifactorial and must be considered in the context of the social determinants of health (Williamson et al., 2020;Nguyen et al., 2020). In this study, set in New York City during the height of the initial COVID-19 surge, we describe the characteristics and outcomes of a diverse cohort including substantial numbers of NH White, NH Black, and Hispanic patients. The three groups differed significantly in demographic and clinical factors. White patients were older and showed higher rates of cardiovascular disease such as coronary artery disease and atrial fibrillation. NH Black and Hispanic patients were younger and had different comorbidity profiles, e.g., hypertension, diabetes, chronic kidney disease, and chronic liver disease.
FIGURE 3 | Forest plot of multivariable logistic regression results predicting in-hospital mortality using laboratory values, stratified by race/ethnicity. Models were adjusted for age, sex, and baseline hypoxia.
Unadjusted in-hospital mortality was the highest in NH White patients, but multivariable analysis showed that race/ethnicity was not an independent predictor of mortality in this cohort. It remains unclear whether race and ethnicity are independent risk factors for COVID-19 mortality after adjusting for confounding factors. Large national-level studies in the United Kingdom and Brazil have reported race as an independent predictor of mortality (Baqui et al., 2020;Williamson et al., 2020), whereas smaller studies in the United States have not (Gold et al., 2020;Price-Haywood et al., 2020;Suleyman et al., 2020), possibly due to statistical power or population differences. Changes in clinical management and outcomes of COVID-19 over the course of the pandemic may also complicate comparisons of results from different time periods (Horwitz et al., 2020). In this New York City patient population, race/ethnicity was not an independent predictor for mortality.
In addition to describing this cohort, we aimed to test established COVID-19 risk factors for race/ethnicity-specific effects. Although we recapitulated several known risk factors, such as age, male sex, and hypoxia, we found only suggestive but nonsignificant interactions of Black race with diabetes and with obesity; both comorbidities tended to increase the mortality risk of Black patients to a greater degree than that of White patients. Notably, when analyzing inflammatory markers for their association with mortality, we found a significant interaction between the NH Black population and the inflammatory cytokine IL-1b.
Excessive inflammation has emerged as an important aspect of COVID-19 pathophysiology, and the anti-inflammatory steroid dexamethasone has been shown to improve outcomes among those with severe disease (The RECOVERY Collaborative Group, 2020). The interaction between the NH Black population and IL-1b raises the possibility that differences in immunity may contribute to worse outcomes in some patients. Black Americans are at higher risk of autoimmune conditions such as systemic lupus erythematosus and lupus nephritis as compared to White Americans, with differences that can be linked in some cases to specific polymorphisms, which are more common in African Americans (Ness et al., 2004;Clatworthy et al., 2007;Richman et al., 2012;Freedman et al., 2014).
IL-1b is a pro-inflammatory cytokine that plays a role in both physiologic and pathologic inflammation. IL-1 inhibitors such as anakinra and canakinumab have been developed to target IL-1 in autoimmune diseases such as rheumatoid arthritis and Still's disease. These agents are also being actively investigated as COVID-19 treatments. Thus far, two randomized studies have found no clinical benefit from IL-1 inhibitors in COVID-19 [Novartis Provides Update on CAN-COVID Trial in Hospitalized Patients With COVID-19 Pneumonia and Cytokine Release Syndrome (CRS) (Novartis); Tharaux et al., 2021]. However, these studies have not reported analyses taking race/ethnicity into account.
The strengths of our database include its size and the inclusion of 37.1% Hispanic patients, a vulnerable population in this pandemic, which has been underrepresented in the literature to date. Additionally, our near-complete follow-up of the cohort's hospital outcomes (99.3%) strengthens the validity of our findings.
Our study has limitations that warrant specific mention. The cytokine analysis was limited to only a subset of the population and should be considered exploratory. We were not able to control for other comorbidities, which may have influenced cytokine levels (e.g., diabetes and IL-1b) (Dinarello et al., 2010). The dataset was derived from the electronic health record database without manual review, which may limit the completeness of comorbidity labels. Race and ethnicity were self-reported and were missing or unspecified in 17.4% of the initial cohort. The subset of patients with cytokine data was small, which limited the models that could be used to test interactions. Causal mechanisms underlying the correlations identified in this retrospective analysis remain to be elucidated.
In conclusion, our analysis of a diverse cohort drawn from the New York metropolitan area highlights both similarities and important differences across racial/ethnic groups in risk factors for death among hospitalized COVID-19 patients. The findings identified across populations call for conscious inclusion in future cohort studies and clinical trials to ensure the efficacy of potential diagnostics and treatments across diverse individuals.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
KH, EW, and TJ conceived and designed the study. DM collected and generated the data from the University of Pennsylvania cohort. SN and PK collected and organized the clinical data from Mount Sinai. TJ, NS, and H-HH analyzed the data. TJ and K-lH drafted the manuscript. All authors read and approved the final manuscript.

FUNDING
This work was supported by NIGMS R35GM138113 to K-lH.