Proteomic aptamer analysis reveals serum biomarkers associated with disease mechanisms and phenotypes of systemic sclerosis

Background Systemic sclerosis (SSc) is an autoimmune connective tissue disease that affects multiple organs, leading to elevated morbidity and mortality with limited treatment options. The early detection of organ involvement is challenging as there is currently no serum marker available to predict the progression of SSc. Aptamer-based proteomic analysis can detect serum proteins down to femtomolar concentrations and therefore holds the potential to correlate them with SSc manifestations. Methods This is a two-tier study of serum samples from women with SSc (including patients with interstitial lung disease, ILD, on high-resolution CT scan) and age-matched healthy controls (HC) that were first analyzed with aptamer-based proteomic analysis for over 1300 proteins. Proposed associated proteins were validated by ELISA first in an independent cohort of patients with SSc and HC, and selected proteins were subjected to further validation in two additional cohorts. Results The preliminary aptamer-based proteomic analysis identified 33 proteins with significantly different concentrations in SSc compared to HC sera and 9 associated with SSc-ILD, including proteins involved in extracellular matrix formation and cell-cell adhesion, angiogenesis, leukocyte recruitment, activation, and signaling. Further validations in independent cohorts ultimately confirmed the association of specific proteins with early SSc onset, specific organ involvement, and serum autoantibodies. Conclusions Our multi-tier proteomic analysis identified serum proteins discriminating patients with SSc and HC or associated with different SSc subsets, disease duration, and manifestations, including ILD, skin involvement, esophageal disease, and autoantibodies.


Introduction
Systemic sclerosis (SSc) is a connective tissue disease characterized by a pathogenetic triad composed of microangiopathy, immune system activation, and fibrosis. SSc is associated with significant morbidity and mortality largely dependent on the early detection of organ involvement including skin, lung, gastrointestinal tract, and heart (1)(2)(3).
Serum autoantibodies are the only available biomarkers for precision medicine in SSc: anti-centromere antibodies (ACA) are commonly observed in the limited form (lcSSc) and associated with an increased risk of pulmonary arterial hypertension (PAH), anti-RNA polymerase III antibodies are linked to scleroderma renal crisis, and anti-topoisomerase I (anti-Scl-70) antibodies are observed in the diffuse form (dcSSc) with a higher incidence of interstitial lung disease (ILD) (4,5). However, autoantibodies are of limited use to predict the onset of specific complications (6)(7)(8). Other biomarkers, such as uric acid and NT-proBNP, are instrumental in assessing the risk of PAH (9), KL-6 may predict the progression of SSc-ILD (10), and the combination of serum NT-proBNP and troponins may be suggestive of myocardial involvement (11,12).
In recent years, proteomic analysis has been broadly used to identify biomarkers, as illustrated by the transcriptional profile changes associated with specific disease manifestations (13-19). Aptamers are short single-stranded oligonucleotides capable of folding into various structures and binding to proteins, peptides, and small molecules at concentrations ranging from femtomolar to micromolar, with high reproducibility and low variability (20). Aptamer-based analysis holds promise in SSc to elucidate the molecular mechanisms of disease pathogenesis and internal organ involvement (21). We took advantage of proteomic aptamer analysis and different validation cohorts of SSc patients with and without ILD to identify biomarkers for SSc phenotypes.

Methods

Patients
Six patients who met the 2013 European League Against Rheumatism/American College of Rheumatology classification criteria for SSc (22) were enrolled from the Scleroderma Unit, Rheumatology and Clinical Immunology, IRCCS Humanitas Research Hospital in Rozzano, Italy. Clinical data were recorded, including organ involvement based on clinical features, laboratory results, and findings from diagnostic imaging (radiological imaging, echocardiography, lung function tests, and other relevant examinations). SSc-ILD was diagnosed based on high resolution computed tomography (HRCT) scans, following the guidelines set by the Fleischner Society (23). PAH was defined using right-heart catheterization in the presence of suspected/suggestive signs, symptoms, echocardiographic abnormalities, or pulmonary function abnormalities (24). Myocardial involvement was diagnosed based on compatible findings at cardiac magnetic resonance (25) in patients with suspected/suggestive signs or symptoms, electrocardiographic and 24-hour Holter ECG alterations, or elevated serum myocardial enzymes (11,12). Serum autoantibodies were assessed using commercially available kits in routine laboratory analysis. Early SSc was defined as a disease duration of less than 3 years from the onset of the first non-Raynaud manifestation (26). Patients were naïve to any immunosuppressive or vasoactive treatment. Seven healthy controls (HC) were enrolled (2 females, 5 males, median age 54 years, interquartile range 46-67 years).
We also used two separate and independent cohorts of SSc patients and HC to validate our findings, as described below.
In all cases and controls, serum samples were collected using serum separator tubes, allowed to clot, aliquoted, and stored at -80°C.
This study was conducted in accordance with the Declaration of Helsinki, and the research protocol was approved by the local ethics committee (study number 831); all subjects provided their informed consent.

Aptamer proteomic analysis
Serum aliquots of 150 µL were prepared from all collected samples and subjected to proteome profiling using a high-throughput multiplexing aptamer-based SOMAscan® assay, targeting 1310 serum proteins (SomaLogic Inc., Boulder, CO), as previously described (27). The technique utilizes a panel of protein-specific Slow Off-rate Modified DNA aptamers (SOMAmers), which are constructed from chemically modified nucleotides and bind their target proteins with high specificity and affinity. The SOMAmer reagents are labeled with a 5' fluorophore and a biotin, immobilized on streptavidin-coated beads, and incubated with serum samples. Complexes consisting of SOMAmer reagents and target proteins are formed on the beads. The SOMAmer reagents are then quantified through fluorescence using microarrays containing specific sequences. The relative intensity of fluorescence correlates with the amount of protein in the original sample.

Validation with ELISA tests
Proteins that showed significantly different concentrations at the proteomic aptamer analysis (SomaSuite, vide infra) were selected for further validation and clinical correlations based on their pathogenic significance (as supported by previous literature) or if their relative fluorescence intensity was notably elevated. For validation, solid-phase enzyme immunoassays (ELISA) were employed to quantitatively determine the levels of the most relevant proteins in separate cohorts of SSc patients and HC. The following kits were used for ELISA testing: Human CD177, Eotaxin1, Leptin, Angiopoietin 2 (Ang2), Kininogen HMW, TPSB2, MMP12, IL-22BP (Raybiotech, USA), and Human Calpain 1, Aldolase A, BAFFR, Fractalkine, and Calgranulin B (S100 A9) (MyBioSource, Vancouver, Canada), following the manufacturer's instructions. All candidate proteins were first validated in an independent cohort of 30 patients with SSc and 10 HC (8 women, median age 71 years, interquartile range 62-74 years). Three proteins (Ang2, IL-22BP, and TPSB2) that exhibited statistical significance for multiple variables and displayed critical and innovative correlations with disease pathogenesis were further validated in an expanded cohort consisting of 89 patients with SSc and 43 HC (35 females, 8 males, median age 65 years, interquartile range 50-70 years).

Statistical analysis
Data analysis of the SomaLogic results was conducted using SomaSuite software (SomaLogic, Boulder, CO, USA). Statistical analysis of the ELISA data was performed using Stata 16 software (StataCorp. 2019. Stata Statistical Software: Release 16. College Station, TX: StataCorp LLC). Non-parametric data were analyzed using the Wilcoxon rank-sum (Mann-Whitney) test for individual comparisons. To account for multiple comparisons, the Kruskal-Wallis correction was applied. A p value < 0.05 was considered statistically significant. Variables that reached statistical significance in the univariate analysis were entered into a logistic regression analysis with both forward and backward stepwise selection procedures to identify independent risk factors for selected variables.
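As an illustration of the Wilcoxon rank-sum (Mann-Whitney) comparison mentioned above, the following is a minimal standard-library Python sketch, not the Stata procedure used in the study. It assumes the normal approximation with midranks for ties, a tie-corrected variance, and no continuity correction; the function name and the serum values are hypothetical and chosen only for demonstration. With groups as small as those of the aptamer cohort, an exact test may be preferable to the approximation shown here.

```python
import math
from itertools import groupby


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, assumed here).

    Returns (U statistic for sample x, two-sided p-value). Tied values
    receive midranks and the variance includes the usual tie correction.
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                    key=lambda t: t[0])
    rank_sum_x = 0.0
    tie_term = 0.0
    position = 0  # number of pooled values already ranked
    for _, group in groupby(pooled, key=lambda t: t[0]):
        members = list(group)
        t = len(members)
        # Average of the ranks position+1 .. position+t.
        midrank = position + (t + 1) / 2.0
        rank_sum_x += midrank * sum(1 for _, label in members if label == 0)
        tie_term += t ** 3 - t
        position += t
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_two_sided


if __name__ == "__main__":
    # Hypothetical serum concentrations (arbitrary units), not study data.
    ssc = [5.1, 6.3, 7.8, 8.2, 9.0, 9.4]
    hc = [3.2, 4.0, 4.4, 5.0, 5.6, 6.1, 6.9]
    u, p = mann_whitney_u(ssc, hc)
    print(f"U = {u}, two-sided p = {p:.4f}")
```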
Pathway analysis was performed with the Reactome peer-reviewed pathway database (https://reactome.org) through overrepresentation analysis: a statistical test (hypergeometric distribution) that determines whether certain Reactome pathways are over-represented (enriched) in the submitted data. The probability score was corrected for false discovery rate using the Benjamini-Hochberg method.
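The overrepresentation test and the false discovery rate correction described above can be sketched in a few lines of standard-library Python. This is an illustrative reimplementation under stated assumptions, not the Reactome analysis service itself: the function names, the background size, and the pathway counts in the usage example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(hits, submitted, pathway_size, background):
    """Upper-tail hypergeometric probability P(X >= hits).

    `submitted` proteins are drawn without replacement from `background`
    proteins, of which `pathway_size` are annotated to the pathway.
    """
    upper = min(submitted, pathway_size)
    total = comb(background, submitted)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, submitted - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical example: 33 submitted proteins against a background of
    # 1310 measured proteins; pathway sizes and hit counts are invented.
    pathways = {"pathway_A": (6, 40), "pathway_B": (2, 120), "pathway_C": (4, 25)}
    names = list(pathways)
    raw = [hypergeom_enrichment_p(h, 33, size, 1310)
           for h, size in pathways.values()]
    for name, p, q in zip(names, raw, benjamini_hochberg(raw)):
        print(f"{name}: p = {p:.3g}, FDR-adjusted p = {q:.3g}")
```

Using the set of measured proteins as the background is one possible design choice; a different background (for example, all annotated proteins in the database) changes the resulting p-values.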

Patients
The characteristics of the three cohorts of patients, namely the SOMAscan, first ELISA validation, and extended ELISA validation groups, are illustrated in Table 1, with slight differences in terms of rarer organ involvements. All six patients studied with the SOMAscan proteomic aptamer analysis were female, with a median age of 64.5 years (interquartile range, IQR, 41-73). One (17%) had dcSSc, 3 (50%) were ACA-positive, and 2 (33%) were anti-Scl-70 positive. Three patients (50%) had ILD and 5 (83%) had gastrointestinal involvement, while no patient had PAH.

Proteomics of SSc and SSc-ILD using aptamers
Our first-tier proteomic analysis using SOMAscan technology revealed significant differences in serum levels of 33 proteins between patients with SSc and HC (Table 2). Patients with SSc exhibited altered expression of proteins involved in various biological processes, including extracellular matrix formation and cell-to-cell adhesion (elevated Calpain, EphA5, IDS, MATN2, MMP-12, TNR4, and reduced levels of desmoglein-1, SNP25), angiogenesis (increased levels of anti-angiogenic factors such as Ang2 and high molecular weight kininogen), lymphocyte recruitment, activation, and signaling (elevated levels of CXCL-1, LAG3 and decreased levels of SH21A), and overall inhibition of neutrophil function (decreased levels of G-CSF-R, CD177, calgranulin B; Figure 1). Nine proteins differentiated patients with SSc-ILD and without ILD or HC (Table 3). SSc-ILD patients showed elevated serum levels of proteins involved in intracellular signaling and cell cycle regulation (FCRL3, PDE11, Stratifin), as well as increased levels of MCP-3, a monocyte chemoattractant, and sICAM-5, the ligand for leukocyte adhesion protein LFA-1, compared to patients without ILD. Patients with SSc-ILD exhibited higher levels of IL-22BP, the decoy receptor for IL-22, and lower levels of BAFF.

Validation of protein biomarkers and association with disease manifestations
In the second-tier analysis, we selected 13 proteins from the first analysis to be validated in the independent cohort (Tables 4-7).

Discussion
The complex pathogenesis of SSc encompasses aberrant inflammation, dysregulated fibrosis, and microvascular disease (2). Comprehensive proteomic analysis is an ideal approach for identifying relevant molecules involved in various disease mechanisms, including potential prognostic factors, predictive molecules, and therapeutic targets. Proteomic analyses have already been performed and associated with clinical phenotype in SSc (29)(30)(31)(32), as shown by the report of higher CXCL4 expression in peripheral blood and skin plasmacytoid dendritic cells in association with the incidence and progression of ILD and PAH in SSc (33). In addition, altered serum levels of collagen IV, endostatin, IGFBP-2, IGFBP-7, MMP-2, neuropilin-1, NT-proBNP, and RAGE have been described in SSc-PAH patients (34). Further, patients with a prominent signature based on CD40 ligand, CXCL4, and anti-PM/Scl-100 antibodies have shown a preferential positive response to treatment with the tyrosine kinase inhibitor imatinib (35).
Only a few studies have been conducted using aptamer tools for proteomics in SSc (36-39). Among the studies investigating the preclinical phase of SSc (36, 39), one identified three proteins involved in dysregulated angiogenesis and fibrosis that were differentially expressed in patients with preclinical SSc at risk of evolving into overt disease, thus confirming the importance of microvascular disease in the earliest phases of SSc pathogenesis (36). Piera-Velazquez and colleagues elegantly demonstrated that the proteomic profile of serum exosomes differs between patients with primary Raynaud's phenomenon and patients with Raynaud's phenomenon at risk of evolving into SSc (39). In the other two studies available (37, 38), the aptamer analysis was applied to describe longitudinal changes in patients with established SSc and organ damage. It was indeed demonstrated that serum levels of ST2 and spondin-1 predicted the changes in mRSS, also providing evidence of a peculiar cytokine signature (i.e., TNF, IFN-γ, TGF-β, and IL-13) (37). Additionally, chemerin was identified as a potential biomarker with pathogenic significance for increased pulmonary vascular resistance in patients with SSc-PAH (38). In another study, 82 proteins were found to be differentially expressed in sera from SSc-PAH patients compared to SSc patients without lung vascular involvement, including an IFN-γ signature and two other proteins of interest, Midkine (implicated in the pathogenesis of arterial hypertension, renal disease, and lung fibrosis) and Follistatin-like 3 (FSTL3, regulated by TGF-β), in association with SSc-PAH (40). A composite three-biomarker index (including Ca15-3, surfactant protein D, and ICAM-1) has recently been described to predict ILD in patients with SSc, and is associated with disease severity (41).
Several reasons could explain the differences found in the proteomic profile between previous studies and our results. First, different types of samples, such as whole blood, serum, and exosomes, are expected to provide different analytical outcomes. Second, the relationship between disease duration and the timing of sample collection might have played a role, not only in the case of early vs. longstanding SSc but also in patients with isolated Raynaud's phenomenon at risk of evolving towards established disease. Third, such results ultimately reflect the heterogeneity of SSc, with its various internal organ involvements, as well as multiple nuances of disease severity and rates of disease progression.
Our multi-tier study, including an aptamer-based analysis and a validation on independent cohorts, supports the view that alterations in different biomarkers reflect abnormal extracellular matrix formation, angiogenesis, vascular remodeling, and immune cell recruitment and function, which recapitulate the fundamental pathogenic aspects of SSc. Some of the proposed biomarkers (Ang2, IL-22BP, and TPSB2) require a detailed discussion.
Angiopoietin-2 (Ang2) is a vascular growth factor secreted by endothelial cells that induces their own activation and promotes leukocyte chemotaxis. In the presence of a pro-inflammatory cytokine environment, this signaling pathway leads to vascular instability and endothelial inflammation (42). By stimulating the release of IL-6 and IL-8 from monocytes, Ang2 enhances the inflammatory-driven fibrogenic process, a hallmark of SSc (43). Our analysis revealed significantly increased serum levels of Ang2 in SSc patients compared to HC, which is consistent with previous literature findings (44). Furthermore, we observed significantly higher levels of serum Ang2 in early SSc and SSc-ILD patients compared to HC. Previous studies have shown that serum Ang2 levels decrease after treatment with intravenous cyclophosphamide in SSc-ILD patients, and this reduction correlates with concentrations of KL-6, an established biomarker of lung involvement (45). Overall, these results support the crucial role of aberrant angiogenesis across the pathogenic processes underlying SSc, from the earliest phases through the development of established disease, and provide further support for the correlation between serum Ang2 concentrations and SSc-ILD.
IL-22 is an inflammatory cytokine produced by CD4+ T cells and innate T cells, including NKT, γδ T cells, and innate lymphoid cells (ILC) (46); IL-22BP is a soluble decoy receptor that acts as an IL-22 inhibitor (46). Ambivalent proinflammatory and modulating functions have been attributed to both IL-22 and IL-22BP in a tissue- and disease-dependent manner (47)(48)(49). A protective anti-inflammatory effect of IL-22 has been demonstrated in the case of pulmonary inflammation, with lower levels being detectable in the bronchoalveolar lavage fluid (BALF) of patients with acute respiratory distress syndrome and sarcoidosis (50). Moreover, IL-22 is essential to allow alveolar repair following influenza pneumonia (51), while elevated IL-22BP expression increases the risk of severe pulmonary infections (52). On the other hand, a prominent IL-22-based inflammatory signature has been described in patients with SSc (53), with increased circulating Th22 cells (54) and serum IL-22 levels being associated with SSc-ILD (55). Expression of IL-22 in scleroderma skin is linked to both the inflammatory (56) and fibrotic responses that are responsible for disease progression (57). In our study, serum levels of IL-22BP were significantly increased in patients with SSc-ILD compared with HC. This points towards a role for reduced IL-22 function in the pathogenesis of SSc-ILD. A mouse model study reported that bleomycin-induced lung fibrosis leads to a decrease in IL-22, and that administering exogenous IL-22 can inhibit the inflammatory and fibrotic process (58). We speculate that the protective role of IL-22 may be compromised in patients with SSc-ILD and that modulating the IL-22/IL-22BP system could be a promising therapeutic strategy. Further studies are needed to clarify the role of cells producing IL-22 in the pathogenesis of scleroderma lung disease, with a particular focus on innate-like lymphocytes (59), which represent an intriguing crossroad between the environment, innate immunity, and adaptive immunity.
Conflicting evidence has been reported regarding matrix metalloproteinase (MMP)-12, an enzyme with critical functions in extracellular matrix remodeling in animal models of lung fibrosis (60, 61). In vitro studies have shown that dermal fibroblasts from SSc patients overexpress MMP-12, thus affecting angiogenic homeostasis (62). Increased levels of MMP-12 in serum and tissue have been reported in SSc patients, and these levels are associated with longer disease duration and more severe skin and lung involvement (63). Furthermore, the rs2276109 polymorphism of the MMP-12 gene has been linked to SSc susceptibility in a large Italian cohort (64). Our results partially contrast with previous evidence. While we initially observed higher MMP-12 levels in SSc patients during SOMAscan analysis, we found significantly lower values during ELISA validation, especially in SSc-ILD, early SSc, and patients positive for anti-Scl-70 or anti-RNA polymerase III antibodies, suggesting a potential correlation between serum MMP-12 levels and milder forms of the disease. Such observations could suggest the presence of an ineffective extracellular matrix turnover, as reflected by lower levels of MMP-12, in those patients with a higher burden of fibroinflammatory lesions. This is notably the case of individuals with SSc-ILD, as well as rapidly progressive cutaneous fibrosis associated with anti-RNA polymerase III antibodies. However, given the discrepancy with previous evidence, further research is warranted to clarify these associations.
Dysfunction of the myeloid cell compartment has been implicated in both the inflammatory and fibrotic phases of SSc pathogenesis (65). In support of this view, we found significant differences in serum concentrations of myeloid-derived proteins in SSc patients compared to HC, including calgranulin B and CD177. Calgranulin B is a calcium-binding protein expressed in neutrophils, monocytes, and macrophages, and it is overexpressed in the lungs of patients with idiopathic pulmonary fibrosis and nonspecific interstitial pneumonia (66). Our analysis showed that SSc-ILD patients and dcSSc patients have higher circulating levels of calgranulin B. CD177 is a neutrophil membrane molecule involved in the regulation of diapedesis (67), and CD177+ neutrophils produce large amounts of IL-22 (68). As mentioned earlier, the ambivalent role of IL-22 may help explain the mild reduction in soluble CD177 that we observed in patients with early SSc. Further research is required to elucidate the role of the myeloid compartment in the pathogenesis of different subsets of SSc.
Two proteins associated with gastrointestinal involvement in patients with SSc are eotaxin and fractalkine. Eotaxin's role in recruiting eosinophils and mast cells has been extensively studied in asthma (69), and its pro-fibrotic effects have been demonstrated in both animal models (70) and human conditions (71,72). Recently, Piera-Velazquez and colleagues demonstrated that patients with early SSc have higher levels of eotaxin in circulating exosomes compared to subjects with primary Raynaud's phenomenon (39). We are the first to report that increased serum levels of eotaxin and lower serum concentrations of fractalkine are significantly associated with esophageal involvement in patients with SSc.
Leptin also warrants discussion: it is a metabolic hormone produced by adipose tissue cells with potential, albeit conflicting, roles in autoimmune inflammation (73) and fibrosis (74). Its effects on the fibrotic process appear to be tissue- or organ-dependent (74). Contradictory data have been reported on serum leptin concentrations in patients with SSc (75-78), and it has been suggested that these variations may reflect heterogeneity in disease duration, activity, and different phenotypes and endotypes. We are the first to report low levels of leptin in patients with SSc sine scleroderma compared to subjects with cutaneous involvement (both lcSSc and dcSSc).
To our knowledge, this is the first study to investigate serum proteins using proteomic aptamer analysis in a deeply phenotyped and endotyped cohort of SSc patients to assess their potential pathogenetic roles, spanning extracellular matrix formation, angiogenesis, and immune cell homing and function. The strength of our study is that patient sera were obtained at the time of diagnosis, prior to the initiation of any immunosuppressive or vasoactive treatment; our findings are therefore expected to accurately represent the serum proteome of treatment-naïve individuals with SSc. Functional assays, including gene expression and epigenetic studies, could serve as powerful tools to test and enhance the pathogenic validity of our observations. For instance, by studying the modulation of angiogenic pathways (mainly represented by Ang2 in our dataset), extracellular matrix remodeling (such as MMP-12), or myeloid cell function, we could gain a deeper understanding of whether these processes play a "disease-modifying" role at various stages of the disease or in different organs. Silencing IL-22 in mouse models of SSc lung disease may prove helpful to understand whether IL-22 has a different role in the lung compared to the skin. Functional analyses could reveal critical and potentially practice-changing information: the checkpoint driving dysregulated fibrosis could be intercepted, or it might emerge that targeting certain pathways (e.g., aberrant angiogenesis, lymphocyte activation) is crucial only in certain disease phases.
Among the limitations of our study, the small sample size used for the aptamer analysis and the validation performed on a larger cohort should be noted, as well as the arbitrary choice of the candidate proteins, based on their presumed pathogenic significance and available data from previous literature. Furthermore, while the composition of the validation cohorts adequately reflects the distribution of different disease subsets and organ manifestations, this is not the case for the SOMAscan cohort due to the low prevalence of dcSSc and the absence of subjects with SSc sine scleroderma, anti-RNA polymerase III antibodies, and PAH.

Conclusions
Serum and tissue proteomics offer valuable tools for characterizing various aspects of the disease, aligning with the principles of precision medicine, although prospective validation of these biomarkers is warranted. The potential biomarkers that distinguish patients with SSc from HC identified in this work play functional roles in extracellular matrix metabolism, angiogenesis, and immune cell function, which are critical checkpoints in the pathogenesis of the disease. Biomarkers related to altered angiogenesis can differentiate patients with early SSc from HC, while other molecules exhibit differential expression in patients with SSc depending on factors such as disease subset, autoantibody profile, extent of skin fibrosis, and internal organ involvement, including ILD.

FIGURE 1
Pathways and interactions (https://reactome.org) of proteins showing significantly different (p < 0.05) serum levels in SSc patients compared with HC, as assessed using the aptamer (SomaLogic) proteomics platform.

TABLE 1
Demographic features of SSc patients analyzed by SomaLogic, the initial and extended validation cohorts.

TABLE 2
Significantly different (p < 0.05) protein serum levels in SSc patients compared with HC as assessed using the aptamer (SomaLogic) proteomics platform.

TABLE 3
Significantly different (p < 0.05) protein serum levels in SSc with ILD compared with SSc without ILD and HC as assessed using SomaLogic proteomics.

TABLE 4
Serum concentration, median (IQR), of validated proteins in HC and SSc, according to the presence or absence of SSc-ILD.
p > 0.05 comparing SSc without ILD vs HC.

TABLE 5
Serum concentration, median (IQR), of validated proteins in SSc patients, according to the cutaneous subset.

TABLE 6
Serum concentration, median (IQR), of validated proteins in SSc patients, according to disease duration and organ involvement.

TABLE 7
Serum concentration, median (IQR), of validated proteins in SSc patients, according to autoantibody profile.