Identification of inflammatory clusters in long-COVID through analysis of plasma biomarker levels

Mechanisms underlying long COVID remain poorly understood. Patterns of immunological responses in individuals with long COVID may provide insight into clinical phenotypes. Here we aimed to identify these immunological patterns and study the inflammatory processes ongoing in individuals with long COVID. We applied an unsupervised hierarchical clustering approach to analyze plasma levels of 42 biomarkers measured in individuals with long COVID. Logistic regression models were used to explore associations between biomarker clusters, clinical variables, and symptom phenotypes. In 101 individuals, we identified three inflammatory clusters: a limited immune activation cluster, an innate immune activation cluster, and a systemic immune activation cluster. Membership in these inflammatory clusters did not correlate with individual symptoms or symptom phenotypes, but was associated with clinical variables including age, BMI, and vaccination status. Differences in serologic responses between clusters were also observed. Our results indicate that clinical variables of individuals with long COVID are associated with their inflammatory profiles and can provide insight into the ongoing immune responses.


Introduction
A large proportion of individuals infected by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have one or more symptoms persisting for several months (1,2). These symptoms can be wide-ranging and most commonly include fatigue, pulmonary abnormalities, neurological impairments, and reduced mobility (3). These post-acute sequelae of SARS-CoV-2, also known as "long COVID", have risk factors including female sex, older age, minority racial group, and higher body mass index (BMI) (4,5). Long COVID is less likely to develop in vaccinated individuals (6,7), and vaccination may improve long COVID symptoms (8-11). Presently, mechanisms underlying long COVID remain poorly understood. Hypothesized mechanisms include immune dysregulation, an inadequate anti-SARS-CoV-2 serologic response, the persistence of a viral reservoir, and the development of autoantibodies (1).
Immune dysregulation has been studied extensively in acute cases of severe COVID-19, with a range of inflammatory markers associated with increased risk of progression to more severe disease (12-14). In the context of long COVID, a variety of immune changes have been demonstrated, including higher levels of IL-4 and IL-6 producing CD4+ T cells, and higher serum levels of IL-1b, IL-6, TNFa, and IP-10 suggesting chronic immune activation (15-17), but no consistent pattern of immune abnormalities has been identified. An increasing body of research has focused on defining clinical phenotypes of long COVID given the marked heterogeneity in clinical presentation, but whether different inflammatory profiles underlie these clinical phenotypes is less well studied (18,19).
To better understand the unique immune patterns in individuals with long COVID, we used an unsupervised hierarchical clustering approach to derive inflammatory clusters in a cohort of individuals with long COVID and explored the association of these profiles with symptoms and clinical variables.

Participants and sample collection
This study was approved by the Rush University Institutional Review Board, Office of Research Affairs #20032309. Patients were enrolled for this study at Rush Post COVID clinic, where samples were collected. All enrolled participants were included in this study. Samples were then centrifuged to isolate plasma, which was stored at -80 degrees C. Participant electronic medical records (EMRs) were reviewed and demographic data including age, BMI, gender, vaccination status, vaccination type, days since last vaccine dose, race, WHO initial disease severity, diabetes status, and days since symptom onset were collected. We used EMR review to assess the presence of 12 symptoms: fatigue, shortness of breath, palpitations, chest pain, brain fog, joint pain, myalgia, headaches, gastrointestinal (GI) symptoms, dizziness, cough, and anosmia. We selected these symptoms as they have been used to define long COVID symptom clusters previously (18,20).

Measurement of binding antibody
Binding IgG against the Spike and Nucleocapsid proteins of SARS-CoV-2 was measured using the V-PLEX SARS-CoV-2 Panel 25 IgG Kit (Catalog No. K15583U-2) and V-PLEX COVID-19 Coronavirus Panel 2 IgG Kit (Catalog No. K15369U-2) from Meso Scale Discovery (Rockville, Maryland). Assays were performed according to manufacturer instructions. Results were analyzed using MSD Discovery Workbench 4.0.12 and reported in arbitrary units (AU)/mL. Levels of 1000 AU/mL for anti-spike and 5000 AU/mL for anti-nucleocapsid were used as cut-offs for positivity based on historical controls collected at our institution prior to the pandemic.
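The positivity rule can be written as a simple threshold lookup. The sketch below uses the two cut-offs stated above; the function and dictionary names are illustrative, and treating a value exactly at the cut-off as positive is an assumption, since the text does not say whether the comparison is inclusive.

```python
# Cut-offs for positivity taken from the text, in AU/mL.
CUTOFFS_AU_PER_ML = {"anti_spike": 1000.0, "anti_nucleocapsid": 5000.0}


def is_seropositive(antigen: str, level_au_per_ml: float) -> bool:
    """Return True when the measured level meets the cut-off.

    The inclusive comparison (>=) is an assumption, not stated in the text.
    """
    return level_au_per_ml >= CUTOFFS_AU_PER_ML[antigen]
```

With these thresholds, a hypothetical reading of 1500 AU/mL would count as anti-spike positive but anti-nucleocapsid negative.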

Statistical analysis
We used principal component analysis (PCA) to reduce the dimensionality of the 42 biomarker levels. PCs representing 80% of variance were then hierarchically clustered using Ward's method with a Euclidean distance measure to construct biomarker-derived clusters. To derive symptom clusters, we adopted methodology which has been described in detail elsewhere (18,20). Briefly, we used multiple correspondence analysis (MCA) to reduce the dimensionality of the dataset, and performed hierarchical clustering on the results of the MCA using squared Euclidean distance and Ward's minimum variance linkage.
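The biomarker clustering steps can be sketched in Python with NumPy and SciPy. This is not the analysis code used in the study, which was run in R. Standardizing each biomarker before PCA, the function name, and the synthetic demonstration data are all assumptions made for illustration; the choice of three clusters mirrors the number reported in the results.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_biomarkers(levels, variance_kept=0.80, n_clusters=3):
    """PCA followed by Ward clustering; returns one cluster label per row."""
    x = np.asarray(levels, dtype=float)
    # Standardize each biomarker (assumption: the text does not state the scaling).
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    # PCA via singular value decomposition of the standardized matrix.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # Smallest number of leading PCs whose cumulative variance reaches the target.
    n_pcs = int(np.searchsorted(np.cumsum(explained), variance_kept)) + 1
    n_pcs = min(n_pcs, len(s))
    scores = u[:, :n_pcs] * s[:n_pcs]
    # Ward linkage on Euclidean distances between the retained PC scores.
    tree = linkage(scores, method="ward")
    # Cut the dendrogram into at most n_clusters groups.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic demonstration data (not study data): three well-separated groups
# of 10 participants, each with 5 made-up biomarkers.
rng = np.random.default_rng(0)
demo = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(10, 5)) for c in (0.0, 5.0, 10.0)]
)
labels = cluster_biomarkers(demo)
```

On real data the number of clusters would usually be chosen by inspecting the dendrogram rather than fixed in advance.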
We used univariable (unadjusted) and multivariable (adjusted) multinomial logistic regression models to explore associations with biomarker cluster membership, considering variables that have been associated with risk of long COVID including age, sex, ethnicity, BMI, vaccination status, days since symptom onset, initial disease severity, and diabetes. Categorical variables including symptom prevalence, gender, initial disease severity, vaccination status, ethnicity, and diabetes status were analyzed using the chi-square test. Continuous variables including age, BMI, days since last vaccination dose, and days since symptom onset were analyzed using the Kruskal-Wallis test. Age, sex, and ethnicity were included in multivariate models a priori, in addition to demographic variables that were found to be significantly different amongst clusters. All analysis was carried out using R version 4.1.2 and GraphPad Prism version 9.5.0.
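For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes its H statistic with the usual tie correction using only the Python standard library. It is an illustration rather than the R routine used in the study. The function names and example ages are made up, and the closed-form p-value relies on the chi-square approximation, which for three groups (two degrees of freedom) reduces to exp(-H/2).

```python
import math
from itertools import chain


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with tie correction."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    ranks = {}      # value -> average rank (1-based; ties share the mean rank)
    tie_term = 0    # sum of t**3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_part = sum(sum(ranks[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_part - 3 * (n + 1)
    return h / (1 - tie_term / (n**3 - n))


def p_value_three_groups(h):
    """Chi-square approximation with 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)


# Made-up ages for three hypothetical clusters (not study data).
h_stat = kruskal_wallis_h([31, 35, 39], [44, 47, 50], [52, 55, 58])
p_approx = p_value_three_groups(h_stat)
```

The approximation is asymptotic, so with groups as small as in this toy example an exact or permutation p-value would be preferable.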

Study cohort
A total of 101 individuals were recruited for this study from January 2021 to August 2021. Ninety-seven (96%) individuals were confirmed positive for COVID-19 infection by a PCR test, point-of-care rapid test, or antigen test, and the remainder were confirmed to have been previously infected by the presence of a positive nucleocapsid antibody on serologic testing. Median duration of symptoms was 160 (IQR: 104-251.50) days preceding sample collection. None of the individuals had been vaccinated at the time of infection. A total of 62 (61.39%) individuals received at least one vaccine dose before sample collection. Thirty-one (30.69%) individuals were male and the cohort had a median age of 47 (IQR: 36-55) years.

Plasma biomarkers reveal three distinct inflammatory clusters
Principal component analysis and hierarchical clustering revealed three distinct biomarker clusters (Supplementary Figure 1). The first cluster was the largest of the three, with 40 (39.60%) individuals. Overall, this cluster was characterized by low levels of inflammation and innate immune activation, and thus was termed the limited immune activation cluster. Compared to the other two clusters, this cluster had lower levels of immune markers including CD163, CRP, sICAM-1, E Selectin, IFNb, LBP, b-D-Glucan, D-Dimer, and TPO, and chemokines MIP-3b and CCL-1 (all p < 0.05, Figure 1A). This cluster was the youngest, with a median age of 39.00 (IQR: 29.25-52.00) years, and had the lowest BMI (median 25.55 (IQR: 21.77-30.38)) of the three clusters (Table 1).
The next cluster consisted of 33 (32.67%) individuals and presented with higher levels of markers of innate immune activation compared to the limited immune activation cluster. Levels of G-CSF, MIP-1b, MMP-9, and TIMP-1 were elevated in this cluster compared to both other clusters, and IL-28B/IFNl3, VEGF, and IL-17A were elevated in this cluster compared to the limited immune activation cluster (all p < 0.05) (Figure 1B). This cluster was termed the innate immune activation cluster as G-CSF and MIP-1b are key innate immune cytokines, and MMP-9 leads to release of innate immune cytokines in its role in the degradation of extracellular matrix (21). This cluster was older (median age 47.00 (IQR: 39.50-56.00) years) and had a higher BMI (median 32.24 (IQR: 28.34-37.24)) than the limited immune activation cluster. It also had the lowest proportion of individuals who had been vaccinated prior to sample collection (24.24%) and the shortest duration of symptoms at the time of sample collection (median 127.00 (IQR: 86.00-171.50) days from symptom onset) (Table 1).
The last cluster, which we termed the systemic immune activation cluster, comprised the remaining 28 (27.72%) individuals and showed high levels of most markers, including systemic inflammation markers IFNg, TNFa, and IL-6, pro-inflammatory chemokines IL-8, IP-10, ITAC, MCP-1, and MIP-1a, growth factor GM-CSF, and the innate inflammatory marker IL-1a. The anti-inflammatory marker IL-10 was also upregulated (Figure 1C). However, PTX3, another innate immune cytokine, was lower than in the innate immune activation cluster. This cluster was the oldest (median age 51.50 (IQR: 41.75-58.00) years), although age was not significantly different from that of the innate immune activation cluster (p > 0.05). Similarly, BMI (median 31.21 (IQR: 25.11-35.96)) was significantly higher than in the limited immune activation cluster, but not significantly different from that of the innate immune activation cluster. This cluster had the highest proportion of individuals who had been vaccinated prior to sample collection (92.86%) and was sampled longest after symptom onset of the three clusters (median 202.00 (IQR: 128.00-290.00) days) (Table 1).

Inflammatory profile does not correlate with long COVID symptoms
We next investigated the correlation between the prevalence of symptoms and inflammatory clusters. Shortness of breath and fatigue were the most prevalent symptoms among the cohort, occurring in 72.23% and 48.51% of individuals, respectively. GI symptoms and joint pain, which were present in 7.92% and 8.91%, respectively, were the least prevalent. MCA and hierarchical clustering revealed three symptom clusters, similar to those demonstrated elsewhere (Supplementary Figure 2, Supplementary Table 1) (18,20). The first cluster [n=17 (16.83%)], a musculoskeletal (MSK) or pain symptom cluster, was characterized by high levels of headache (64.71%), brain fog (82.35%), joint pain (41.18%), myalgia (47.06%), and fatigue (82.35%) compared to the other two clusters, and had the highest number of symptoms, with a median of 6 (IQR: 5-6) symptoms per participant. The second cluster [n=47 (46.53%)], a cardiorespiratory cluster, had high proportions of shortness of breath (100%), chest pain (59.57%), and palpitations (42.55%) compared to the other two clusters, and had a median of 3 (IQR: 2-4.5) symptoms per participant. The last cluster, a less symptomatic cluster [n=37 (36.63%)], had a median of 1 (IQR: 1-2) symptom per participant, and cough (29.72%) was the only symptom more prevalent in this cluster than in the other two clusters.
However, when we examined the relationship with inflammatory clusters, we found no significant difference in the distribution of symptom clusters between inflammatory clusters, nor significant differences in the prevalence of any of the 12 symptoms between inflammatory clusters (all p > 0.05, Table 2).

Association between clinical variables and inflammatory clusters
To further investigate the association between clinical variables and cluster membership, we constructed multinomial logistic regression models considering variables that may affect host inflammatory response (Table 3).
In univariate analysis, older age and higher BMI were associated with increased odds of being in the innate immune activation cluster (OR for age: 1.05, 95% CI: 1.01-1.09, p = 0.027; OR for BMI: 1.11, 95% CI: 1.04-1.20, p = 0.004), while vaccination and increasing time from symptom onset were associated with reduced odds of being in the innate immune activation cluster when compared to the limited immune activation cluster (vaccination OR: 0.14, 95% CI: 0.05-0.39, p < 0.001; time from symptom onset OR: 0.99, 95% CI: 0.99-1.00, p = 0.015). After multivariate adjustment, vaccination remained significantly associated with reduced odds of being in the innate immune activation cluster (OR: 0.09, 95% CI: 0.02-0.38, p = 0.001), while BMI remained associated with higher odds of belonging to this cluster (OR: 1.15, 95% CI: 1.04-1.27, p = 0.006).
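Odds ratios for continuous variables are easier to interpret when rescaled to a clinically meaningful difference. The sketch below assumes that each reported OR refers to a one-unit increase in the predictor (for example, one BMI unit or one day) and that the effect is linear on the log-odds scale; neither assumption is stated explicitly in the text, and the function name is illustrative.

```python
import math


def scale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Odds ratio for a difference of `units`, given the OR per one unit.

    Assumes a log-linear effect, as in a standard logistic regression model.
    """
    return math.exp(math.log(or_per_unit) * units)


# Adjusted BMI odds ratio from the text (1.15), rescaled to a 5-unit difference.
or_bmi_5_units = scale_odds_ratio(1.15, 5)  # about 2.0

# Univariate odds ratio for time from symptom onset (0.99), rescaled to 30 units.
or_onset_30_units = scale_odds_ratio(0.99, 30)
```

Under these assumptions, a difference of five BMI units roughly doubles the odds of belonging to the innate immune activation cluster rather than the limited immune activation cluster.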

Differences in serologic response between clusters
We next measured anti-spike and anti-nucleocapsid levels in available samples (n=99). Anti-spike antibodies were detectable in 98 (98.99%) individuals: 38 out of 39 (97.44%) in the limited immune activation cluster (median 195,000 AU, IQR: 28,900 AU-448,000 AU), 32 out of 32 (100.00%) in the innate immune activation cluster (median 123,000 AU, IQR: 27,400 AU-298,000 AU), and 28 out of 28 (100.00%) in the systemic immune activation cluster (median 364,000 AU, IQR: 19,400 AU-650,000 AU). Anti-spike antibody levels were significantly greater in the systemic immune activation cluster when compared to both other clusters (Figure 2A), as expected given the higher vaccination rate.

Discussion
In this study, we identify three distinct inflammatory profiles in individuals with long COVID. These profiles featured contrasting levels of inflammation and had unique clinical and demographic associations. The first cluster featured low levels of inflammation, was the youngest, and had the lowest BMI. The second cluster, an "innate immune activation" cluster, was independently associated with higher BMI, and the last cluster, a "systemic immune activation" cluster, was independently associated with older age. However, we did not observe differences in symptom profile across inflammatory clusters, emphasizing the need to control for clinical variables when investigating the pathophysiology of long COVID.
Previous studies have proposed systemic inflammation as a cause of long COVID, and have implicated cytokines that were measured in this study. Elevation in serum levels of IL-1b, IL-6, and TNFa has been suggested as a hallmark of long COVID (15), but that study did not present the demographic differences between those with and without long COVID. Another study, which included age- and BMI-matched SARS-CoV-2 convalescent controls, demonstrated higher IP-10 and TNFa in early recovery (a median of 52 days post symptom onset) and higher IL-6 in late recovery (a median of 124 days post symptom onset) in long COVID (17). At the early time point, IL-6, IFNg, IL-10, and TNFa were higher in those with the greatest number of symptoms. Despite higher levels of these cytokines in the systemic immune activation cluster, symptoms were not more frequent in our analysis. More recently, a study demonstrated an inflammatory subcluster of long COVID, with upregulated IL-8, IL-6, and IL-1b, similar to the systemic immune activation cluster observed in our study, but once again this cluster was older and a clear correlation with increased symptoms was not demonstrated (22). Overall, the role of inflammatory cytokines in long COVID symptoms remains uncertain.
We observed higher levels of nucleocapsid antibodies in the innate immune activation and systemic immune activation clusters compared to the limited immune activation cluster. Nucleocapsid antibodies correlate with initial disease severity but have also been used as a proxy measurement for the existence of a persistent reservoir of viral antigen, which could drive some of the inflammation seen in these two more inflamed clusters (23,24). Interestingly, these clusters had different rates of vaccination, even when controlling for potential confounders. Studies have shown that vaccination against SARS-CoV-2 may reduce symptoms of long COVID (6,7), with clearance of a viral reservoir proposed as a mechanism for this improvement (25). Here, the highest vaccination level was seen in the most inflamed group. Whether this could reflect an inflammatory response that would lead to clearance of a viral reservoir requires further study. Reassuringly, there was no increase in symptoms seen in this more vaccinated group.
Unsupervised clustering of symptoms demonstrated a pain-predominant cluster, a cardiorespiratory cluster, and a less symptomatic cluster. Previous studies in a long COVID cohort based in Ireland have found similar symptom phenotypes and demonstrated greater functional impact associated with the pain and cardiorespiratory phenotypes (18). While the replication of these phenotypes across independent populations supports the potential for their clinical relevance, we did not identify differences in inflammatory profile between them. A number of other mechanisms have been hypothesized to underlie long COVID, including changes in levels of B cells and T cells, presence of autoantibodies, reactivated viruses, and inadequate antibody production (1). Further research is needed to determine whether these factors are associated with clinical phenotype.
Our study, while providing valuable insights, had certain limitations that warrant consideration. Firstly, we acknowledge that our analysis was constrained by the small sample size, particularly in the systemic immune activation cluster, where only two unvaccinated individuals were included. This sparse representation may have limited our ability to model the effect of vaccination on cluster membership. Moreover, symptoms that have been previously observed to be commonly associated with long COVID showed a relatively low prevalence in this cohort, with fatigue being present in only 48.51% of individuals and brain fog in 33.66%. This may indicate that individuals in the cohort had a mild presentation of long COVID compared to other published cohorts, and may have limited the ability to detect differences in symptoms between inflammatory clusters. Finally, this study is cross-sectional, preventing analysis of the effect of inflammatory phenotype on the trajectory of long COVID symptoms.
In summary, this study demonstrates inflammatory profiles in a long COVID cohort and their association with clinical variables. Further research is needed to determine the pathophysiologic changes that underlie different long COVID presentations.

FIGURE 2
Differences in anti-spike and anti-nucleocapsid response between clusters. (A) Anti-Spike and (B) Anti-Nucleocapsid levels observed in the cohort, showing differences between the three inflammatory clusters. Cluster 1: limited immune activation cluster, cluster 2: innate immune activation cluster, and cluster 3: systemic immune activation cluster.

TABLE 1
Demographic differences between clusters.

TABLE 2
Prevalence of symptoms.

TABLE 3
Logistic regression model results.