Determination of CSF GFAP, CCN5, and vWF Levels Enhances the Diagnostic Accuracy of Clinically Defined MS From Non-MS Patients With CSF Oligoclonal Bands

Background: Inclusion of cerebrospinal fluid (CSF) oligoclonal IgG bands (OCGB) in the revised McDonald criteria increases the sensitivity of diagnosis when dissemination in time (DIT) cannot be proven. While OCGB negative patients are unlikely to develop clinically definite (CD) MS, OCGB positivity may lead to an erroneous diagnosis in conditions that present similarly, such as neuromyelitis optica spectrum disorders (NMOSD) or neurosarcoidosis.
Objective: To identify specific biomarkers that complement OCGB and improve diagnostic accuracy in OCGB positive patients.
Methods: We analysed the CSF metabolome and proteome of CDMS (n=41) and confirmed non-MS patients (n=64) comprising a range of CNS conditions routinely encountered in neurology clinics.
Results: OCGB discriminated between CDMS and non-MS with high sensitivity (85%), but low specificity (67%), as previously described. Machine learning methods revealed that CCN5 levels provide greater accuracy, sensitivity, and specificity than OCGB (79%, +5%; 90%, +5%; and 72%, +5% respectively), while glial fibrillary acidic protein (GFAP) identified CDMS with 100% specificity (+33%). A multiomics approach improved accuracy further to 90% (+16%).
Conclusion: The measurement of a few additional CSF biomarkers could be used to complement OCGB and improve the specificity of MS diagnosis when clinical and radiological evidence of DIT is absent.


INTRODUCTION
There remains no single pathognomonic clinical feature or diagnostic test for MS. Diagnosis relies on the integration of clinical, imaging, and laboratory findings within the framework of the McDonald criteria. The present criteria recognise the need to demonstrate clinical and/or radiological dissemination in space (DIS) and time (DIT) and to exclude alternative diagnoses. In 2017, revisions to the McDonald criteria introduced the presence of cerebrospinal fluid (CSF)-specific oligoclonal IgG bands (OCGB) as a proxy for DIT to establish MS diagnosis in patients with a typical clinically isolated syndrome (CIS) and with evidence of DIS only. The OCGB appear to arise from CSF-persistent, clonally related B cell populations, which appear to be, at least partially, independent of B cell targeted therapy (1). While these revisions have increased the sensitivity of the McDonald criteria, facilitating earlier treatment, they have reduced the specificity for clinically definite MS (CDMS) (2).
While CSF OCGB are present in a high proportion of individuals with CDMS (3), they are also detectable in the CSF of individuals with other autoimmune and infectious diseases of the CNS including syndromes with clinical and radiographic overlap with MS (4,5), and in other non-inflammatory neurological diseases such as migraine (6). Indeed, migraine, fibromyalgia, psychogenic disorders, and neuromyelitis optica spectrum disorders, have all been highlighted as the true cause of illness in patients misdiagnosed with MS (7)(8)(9). A recent meta-analysis gave a sensitivity of 84% and a specificity of 54% when using OCGB to predict conversion to CDMS, with a corresponding positive predictive value of 0.64 and a NPV of 0.77 (10). Therefore, when DIT cannot be proven clinically or radiologically, other biomarkers would be useful to improve the specificity of MS diagnosis.
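The dependence of predictive values on sensitivity, specificity, and prevalence can be illustrated with a minimal sketch. The sensitivity and specificity are those quoted from the meta-analysis; the 50% prevalence and the 90% specificity in the second call are illustrative assumptions, not values reported by the cited study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Bayes' theorem for a binary test."""
    tp = sensitivity * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    tn = specificity * (1.0 - prevalence)
    fn = (1.0 - sensitivity) * prevalence
    return tp / (tp + fp), tn / (tn + fn)

# Sensitivity and specificity from the meta-analysis cited above;
# the 50% prevalence is an illustrative assumption, not a reported value.
ppv, npv = predictive_values(0.84, 0.54, prevalence=0.50)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")

# Hypothetical: raising specificity to 0.90 with sensitivity unchanged lifts the PPV.
ppv_hi, _ = predictive_values(0.84, 0.90, prevalence=0.50)
print(f"PPV at 90% specificity = {ppv_hi:.2f}")
```

At a fixed prevalence, the PPV is driven mainly by the false positive rate, which is why a more specific complementary marker is the natural target.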
We have previously shown that we can distinguish between individuals with MS and healthy controls with very high accuracy (100%) using NMR-based metabolomics on blood samples (11). However, except for radiologically isolated syndrome (RIS), it is rare that the subject of an investigation for suspected demyelinating disease is healthy on presentation. Thus, the need to distinguish between individuals with MS and the heterogeneous cross-section of non-MS patients encountered in neurology clinics is the normal challenge. A multi-omics approach on CSF samples, combined with cross-platform multivariate pattern recognition methods, affords an opportunity to identify new biomarkers for MS diagnosis.
Here we sought to discover if a CSF-based multivariate diagnostic test combining proteomics, metabolomics, and OCGB could improve the diagnostic accuracy of MS from the heterogeneous mix of neurological diseases encountered in a clinical setting. Using an integrative approach, we looked for diagnostic biomarkers that are independent of OCGB and highly specific for CDMS. Such biomarkers could be added to the 2017 McDonald criteria, alongside highly sensitive OCGB, to improve diagnostic specificity and the positive predictive value in patients where DIT cannot be proven clinically or radiologically. We report a multivariate model which out-performs, not only OCGB status, but all identified metabolite and protein biomarkers when measured in isolation. All models were validated on independent test data using 10-fold cross validation and permutation testing to ensure significance was not a result of the model overfitting the data.

Study Participants
CSF samples from 41 patients with CDMS (Poser criteria (12)) and 64 patients with non-MS diagnoses (with spinal taps performed as part of their diagnostic investigations) were collected at the Department of Neurology, University Hospital Basel to identify biomarkers specific for CDMS that are independent of OCGB status. Those with a non-MS diagnosis were chosen to represent the heterogeneous range of neurological conditions observed in a typical generalist neurology clinic, including epilepsy, functional neurological disorders, primary headache syndromes, inflammatory neurological conditions, and infections amongst others (

CSF Sample Collection
CSF samples were centrifuged at 400 × g for 10 minutes at room temperature, and the cell-free supernatant was stored at -80°C within 2 hours of collection according to a consensus protocol (13). Standard laboratory procedures measured leukocytes [cells/mm³] and total protein concentration [mg/dL]. The CSF/serum albumin ratio was calculated using concurrent serum samples. Detection of OCGB was by isoelectric focusing on agarose gel and subsequent immunoblotting using IgG-specific antibody staining (14). Patterns two or three were considered OCGB positive (15).

NMR Spectroscopy and Data Processing for Metabolomics Analysis
For each sample, 100 µL of CSF was diluted with 450 µL of 75 mM sodium phosphate buffer in D₂O (pH 7.4) containing 1 mM maleic acid as an internal reference standard. Samples were centrifuged at 3,000 × g for 5 minutes before transferring to a 5-mm NMR tube. NMR spectra were acquired at 310 K using a 700-MHz Bruker AVIII spectrometer operating at 16.4 T equipped with a ¹H[¹³C/¹⁵N] TCI cryoprobe (Department of Chemistry, University of Oxford) and processed as previously described (16).
NMR metabolite measures were converted to absolute concentrations using the internal reference standard (1 mM maleic acid) as previously described (16). To validate the quantification of the metabolites by NMR, the glucose and lactate levels in all CSF samples were measured using a Cobas® 8000 modular analyser (Roche Diagnostics, Switzerland) coupled with the Gluc3 and LAC2 assays, respectively.
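The exact quantification procedure is given in reference (16). As a generic illustration only, internal-standard quantification scales the per-proton peak area of a metabolite by that of the reference and then corrects for sample dilution. The sketch below assumes 100 µL of CSF in 450 µL of buffer; the peak areas are hypothetical, and the function and variable names are introduced here for illustration.

```python
def concentration_from_integral(area_x, n_protons_x, area_ref, n_protons_ref,
                                ref_conc_mM):
    """Generic internal-standard ratio: concentration of x in the NMR tube."""
    return ref_conc_mM * (area_x / n_protons_x) / (area_ref / n_protons_ref)

# Assumed volumes (microlitres): CSF diluted in buffer containing the standard.
CSF_UL, BUFFER_UL = 100.0, 450.0
total_ul = CSF_UL + BUFFER_UL
ref_in_tube_mM = 1.0 * BUFFER_UL / total_ul   # 1 mM maleic acid, diluted by the CSF
dilution_factor = total_ul / CSF_UL           # converts tube concentration back to CSF

# Hypothetical peak areas: lactate CH3 doublet (3 protons) vs maleic acid singlet (2 protons).
lactate_tube_mM = concentration_from_integral(
    area_x=0.9, n_protons_x=3, area_ref=1.8, n_protons_ref=2,
    ref_conc_mM=ref_in_tube_mM)
lactate_csf_mM = lactate_tube_mM * dilution_factor
print(f"lactate in CSF ~ {lactate_csf_mM:.2f} mM")
```

A value obtained this way could then be compared with the clinical analyser result for the same sample, which is the role of the glucose and lactate validation described above.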

Statistical Analysis
Multivariate orthogonal partial least squares discriminant analysis (OPLS-DA) was performed in R software (R Foundation for Statistical Computing, Vienna, Austria) (R Development Core Team, 2019) using in-house R scripts and the ropls package (19). All models were validated on independent data using an external 10-fold cross-validation strategy with repetition, coupled with permutation testing, as previously described (20). Thus, all models were tested on data that were excluded from model building, and the training and test cohorts never overlapped. An in-depth description of this analysis approach can be found in our previous publication (21). Variables responsible for the observed class separation were extracted by inspection of the average variable importance in projection (VIP) scores.
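The validation logic (repeated stratified 10-fold cross-validation followed by a label-permutation test) can be sketched as follows. This is not the authors' R pipeline: a nearest-centroid classifier stands in for OPLS-DA, the data are synthetic, and all function names are hypothetical.

```python
import random

def stratified_folds(labels, k, rng):
    """Split sample indices into k folds, preserving class proportions."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def fit_centroids(X, y):
    """Nearest-centroid stand-in for the classifier (NOT OPLS-DA)."""
    model = {}
    for cls in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == cls]
        model[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return model

def predict(model, x):
    """Assign x to the class with the closest centroid (squared Euclidean distance)."""
    return min(model, key=lambda cls: sum((a - b) ** 2 for a, b in zip(x, model[cls])))

def cv_accuracy(X, y, k=10, repeats=5, seed=0):
    """Mean held-out fold accuracy over repeated stratified k-fold CV."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        for fold in stratified_folds(y, k, rng):
            held = set(fold)
            train = [i for i in range(len(y)) if i not in held]
            model = fit_centroids([X[i] for i in train], [y[i] for i in train])
            scores.append(sum(predict(model, X[i]) == y[i] for i in fold) / len(fold))
    return sum(scores) / len(scores)

def permutation_p_value(X, y, n_perm=50, seed=1):
    """Share of label-shuffled runs that match or beat the observed accuracy."""
    rng = random.Random(seed)
    observed = cv_accuracy(X, y)
    count = 0
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)
        if cv_accuracy(X, shuffled) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# Synthetic two-feature data with a class shift; group sizes mirror the cohort (41 vs 64).
rng = random.Random(42)
X = [[rng.gauss(1.5, 1.0), rng.gauss(0.0, 1.0)] for _ in range(41)] + \
    [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(64)]
y = ["CDMS"] * 41 + ["non-MS"] * 64
observed, p = permutation_p_value(X, y)
print(f"held-out accuracy = {observed:.2f}, permutation p = {p:.3f}")
```

Because every prediction is made on a fold that was excluded from fitting, the reported accuracy is an out-of-sample estimate, and the permutation distribution shows what that estimate looks like when the labels carry no information.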
Two-sample t-tests or two-way ANOVA were used for continuous variables and Chi-square tests for categorical variables. A multiple comparisons correction (Bonferroni) was applied throughout. Receiver operating characteristic (ROC) curves, area under the curve (AUC), 95% confidence intervals, optimal thresholds for diagnosis, and p values (relative to a null distribution ROC curve with AUC = 0.5) were calculated for each discriminatory variable using the pROC package (22). Hierarchical clustering was performed on the discriminatory proteins to identify clusters of similarly expressed and correlated proteins using the 'pheatmap' and 'corrplot' packages. Joint-pathway hypergeometric enrichment analysis was performed on the discriminatory proteins and metabolites identified by the multivariate analysis (described above) using MetaboAnalyst 5.0 [http://metaboanalyst.ca, last accessed 05/05/21]. Degree centrality was used as the topology measure along with the combined queries integration method.
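For a single biomarker, the AUC and a diagnostic cut-off can be computed as in the following minimal sketch. The text does not state which optimality criterion was used for the thresholds; maximising Youden's J is assumed here. The biomarker values are hypothetical and the code does not reproduce the pROC package.

```python
def roc_auc(pos, neg):
    """AUC as the probability that a positive scores above a negative (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def best_threshold(pos, neg):
    """Cut-off maximising Youden's J = sensitivity + specificity - 1.

    A score greater than or equal to the cut-off is called positive.
    """
    best = (-1.0, None, None, None)
    for t in sorted(set(pos) | set(neg)):
        sens = sum(p >= t for p in pos) / len(pos)
        spec = sum(n < t for n in neg) / len(neg)
        j = sens + spec - 1.0
        if j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical biomarker values (arbitrary units), higher in the positive class.
cdms = [2.4, 3.0, 3.3, 3.6, 4.0, 4.5]
non_ms = [1.0, 1.4, 1.8, 2.1, 2.3, 3.1]
auc = roc_auc(cdms, non_ms)
cut, sens, spec = best_threshold(cdms, non_ms)
print(f"AUC = {auc:.2f}; cut-off = {cut} (sens {sens:.2f}, spec {spec:.2f})")
```

For a marker that is decreased in the positive class, the same functions apply after negating the values.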

RESULTS
High Sensitivity and Low Specificity of CSF OCGB When Discriminating Between Clinically Definite MS and Other Non-MS Neurological Diseases
CSF samples from 105 patients seen in the neurology clinic were investigated: 41 with a diagnosis of CDMS and 64 with a confirmed non-MS diagnosis. Demographic and clinical chemistry data can be found in Table 1. Thirty-five of 41 (85%) CDMS patients were positive for OCGB; in the non-MS set this was the case for 21 of 64 patients (33%) (Figure 1A). Thus, while the sensitivity of OCGB status alone is high (85%), the specificity in this cohort is only 67%, resulting in an overall accuracy and AUC of 74% and 0.74, respectively.
OPLS-DA of the CSF metabolomics data separated CDMS from non-MS patients independently of OCGB status, age, and non-MS diagnosis (Figure S1). Models identified CDMS and non-MS patients in the test data with accuracy, sensitivity, and specificity of 70 ± 4%, 73 ± 6%, and 70 ± 4% respectively (Figure 1B), and permutation testing confirmed that these values are significantly higher than expected from random chance alone (Figure S2). Inspection of the VIP scores illustrated that leucine, isoleucine, glutamine, citrate, creatine, creatinine, glucose, and myo-inositol are significantly decreased in CDMS (Table 2). Interestingly, all the metabolites identified discriminate between CDMS and the other neurological conditions independently of OCGB status, as confirmed by two-way ANOVA, in which no significant interaction between diagnosis (CDMS/non-MS) and OCGB status was observed for any of the biomarkers. Furthermore, no metabolite reached significance when comparing the levels in OCGB+ve samples to OCGB-ve samples, and the OPLS-DA was able to separate MS from non-MS irrespective of OCGB status (Figure S1B).
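The OCGB performance figures follow directly from the counts reported above, as the short sketch below shows. The PPV and NPV are derived from the same 2×2 table and are not quoted in the text; the function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard 2x2 summary for a binary test."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Counts reported above: 35 of 41 CDMS and 21 of 64 non-MS patients were OCGB positive.
ocgb = diagnostic_metrics(tp=35, fn=41 - 35, tn=64 - 21, fp=21)
for name, value in ocgb.items():
    print(f"{name}: {value:.1%}")
```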

CSF Myo-Inositol, Isoleucine, Leucine, and Glutamine Levels Discriminate Between CDMS and Non-MS Neurological Conditions With Greater Specificity Than OCGB Status Alone
The diagnostic utility of each identified metabolite biomarker was investigated using ROC analysis to identify the optimum metabolite concentration cut-off. Metabolite biomarkers ranked by specificity are shown in Table 2. Four of the metabolite biomarkers identified (myo-inositol, isoleucine, leucine, and glutamine) have higher specificity than OCGB. While CSF myo-inositol concentrations provide the same AUC as OCGB status (0.74), sensitivity is sacrificed (49%, -36% compared to OCGB status) in favour of specificity (89%, +22%), suggesting that this metabolite is useful for discriminating between CDMS and other neurological conditions which are OCGB+ve. Indeed, myo-inositol levels correctly identified 19 (90%) of the non-MS OCGB+ve cohort as non-MS. Figures 1C-J illustrate the improved positive predictive power of myo-inositol, isoleucine, and leucine, particularly in the non-MS OCGB+ve patients.

The CSF Proteomics Profile of CDMS Is Distinct From That of Non-MS Neurological Conditions and Independent of OCGB Positivity
Next, we investigated the CSF proteomics profiles of the CDMS and non-MS patients. OPLS-DA was able to discriminate between CDMS and other neurological conditions with accuracy, sensitivity, and specificity of 75 ± 4%, 75 ± 4%, and 77 ± 5% respectively, independently of OCGB status, age, and non-MS diagnosis (Figure S3). This is an improvement on the specificity of OCGB alone, which is 67%. Once again, the permutation test confirmed these values are significantly higher than expected from random chance alone (Figure S4).

CCN5, vWF, and GFAP CSF Levels Outperform OCGB Status Alone for the Discrimination of CDMS and Non-MS Neurological Conditions
VIP scores identified 40 significantly perturbed proteins driving the separation between CDMS and non-MS patients (Figure 2A), of which 13 discriminated CDMS from non-MS with a greater AUC than OCGB (Table S1). A significant OCGB effect was present in several IgG-associated proteins (Table S1). Immunoglobulin G1 (IGHG1), nephronectin (NPNT), and methyl-CpG-binding domain protein 1 (MBD1) were significantly increased in the OCGB positive patients relative to OCGB negative patients (irrespective of MS diagnosis), while IgG-receptor (FCGR1A) was decreased.

JAK-STAT Pathways Are Upregulated in CDMS Patients While BCAA Degradation and Tyrosine Metabolism Are Down Regulated
Hierarchical clustering reveals four highly correlated groups of proteins within the 40 biomarkers identified which separate the CDMS patients from non-MS independently of OCGB status (Figure 3A). In contrast, fewer significant correlations were observed between the metabolite and protein hits (Figure 3B).
No correlations were observed between any of the metabolite hits and IgG associated proteins, consistent with our earlier observation that the identified discriminatory metabolites are independent of OCGB status.
Integrative metabolomics and proteomics enrichment analysis revealed several potentially perturbed pathways in the CDMS group (Figure 4A). Of note, upregulation of the JAK-STAT and glycolysis pathways is consistent with an increased inflammatory response and perturbed energy metabolism in the CDMS cohort.
The combination of CCN5, vWF, GFAP, and OCGB status provided the greatest overall accuracy when discriminating between CDMS and non-MS CSF samples (Figures S5A-D). Indeed, OPLS-DA analysis using only these 4 variables resulted in an accuracy, sensitivity, and specificity of 91%, 89%, and 92% respectively on independent data, which is significantly higher than the accuracy, sensitivity, and specificity of OCGB alone (+16%, +4%, and +25% respectively). The multivariate model out-performed not only OCGB status but all identified metabolite and protein biomarkers when measured in isolation (Figure 4B). For the OCGB-ve patients, the multiomics model correctly identified one (15%) of the OCGB-ve CDMS patients as MS, while all 43 (100%) OCGB-ve non-MS patients remained correctly identified as non-MS. However, the greatest clinical utility of this approach is in the identification of non-MS patients who test positive for OCGB (improved PPV). Indeed, the model correctly identified 16 (76%) of the OCGB+ve non-MS patients (Figure 4C) and all 35 (100%) of the OCGB+ve CDMS patients. This illustrates that the addition of the identified proteins to complement the use of OCGB status, within the context of the McDonald criteria, could greatly improve specificity and PPV without sacrificing sensitivity or NPV in the instance when DIT cannot be demonstrated radiologically or clinically, although this remains to be confirmed in a cohort of early MS patients. Interestingly, vWF may be replaced by either myo-inositol or TMEM40 in the multi-omics model with no significant drop in accuracy (Table S3).
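The gain in PPV can be made explicit by pooling the per-stratum tallies quoted above, as in the sketch below. The total of 6 OCGB-ve CDMS patients is derived as 41 − 35, and the pooled tallies need not reproduce the cross-validated averages exactly. The variable and function names are hypothetical.

```python
# (correctly classified, total) per stratum, from the tallies quoted above;
# the 6 OCGB-ve CDMS patients are derived as 41 - 35.
strata = {
    ("OCGB+ve", "CDMS"): (35, 35),
    ("OCGB+ve", "non-MS"): (16, 21),
    ("OCGB-ve", "CDMS"): (1, 6),
    ("OCGB-ve", "non-MS"): (43, 43),
}

def tally(strata, diagnosis):
    """Sum correct and total calls over both OCGB strata for one diagnosis."""
    correct = sum(c for (status, dx), (c, n) in strata.items() if dx == diagnosis)
    total = sum(n for (status, dx), (c, n) in strata.items() if dx == diagnosis)
    return correct, total

tp, n_cdms = tally(strata, "CDMS")
tn, n_non_ms = tally(strata, "non-MS")
fp = n_non_ms - tn          # non-MS patients called MS by the model
fn = n_cdms - tp            # CDMS patients missed by the model

sensitivity = tp / n_cdms
specificity = tn / n_non_ms
accuracy = (tp + tn) / (n_cdms + n_non_ms)
ppv_model = tp / (tp + fp)
ppv_ocgb = 35 / (35 + 21)   # OCGB status alone: 35 true and 21 false positives

print(f"sensitivity {sensitivity:.1%}, specificity {specificity:.1%}, accuracy {accuracy:.1%}")
print(f"PPV: model {ppv_model:.1%} vs OCGB alone {ppv_ocgb:.1%}")
```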

DISCUSSION
Here, we have identified a small number of independent biomarkers, using feature selection coupled with a pattern recognition multivariate analysis framework and multiomics data, that may be used to support a diagnosis of MS with an accuracy of 90% (91% in the OCGB+ve patients and 90% in the OCGB-ve patients). The combination of CCN5, vWF, GFAP, and OCGB provides a significant increase in PPV; the model is able to accurately discriminate between OCGB+ve CDMS and OCGB+ve non-MS neurological conditions. While this combination yielded the highest accuracy, it is of note that other combinations could produce similar accuracies, for example, the substitution of myo-inositol for vWF. In each case, the levels of the newly identified biomarkers (CCN5, GFAP, and vWF) were independent of OCGB levels and outperformed OCGB alone.
CCN5 (previously known as WISP-2), which was decreased in the CSF of CDMS patients, is a member of the connective tissue growth factor/cysteine-rich 61/nephroblastoma overexpressed (CCN) family, whose members play important roles in cell growth, adhesion, and migration. However, the function of CCN5 is not well understood. In an EAE model of MS, CCN5 mRNA was found to be significantly upregulated in spinal cord tissue, but as the tissue was collected in end-stage disease, its relevance to early-stage disease in MS is questionable (23). Interestingly, a significant positive correlation has been reported between levels of the related protein CCN3 in matched plasma and CSF of MS patients, which was absent in a comparator group of idiopathic intracranial hypertension patients (24). CCN3 plays various roles in the immune system, and CCN5 and CCN3 have been reported to have antagonistic effects (25). CCN3 is a regulator of cytokine expression in both the periphery and CNS (26), where it can promote astrocyte activation (27), but it is not clear what effect CCN5 has on these processes.
The connection between the biology of multiple sclerosis and vWF, which was decreased in the CSF of CDMS patients compared to non-MS, is less opaque; vWF and/or Weibel-Palade bodies negatively regulate BBB permeability changes in MS-like lesions (28). It is not clear why the levels should be reduced in the CSF of MS patients, but one might speculate that increased release from endothelial cells into the blood might lead to a reduction in the CSF.
For GFAP, which was increased in the CDMS group, the connection with MS pathogenesis is now well established. Here, GFAP provided 100% specificity (+33% relative to OCGB), which suggests that measurement of this protein biomarker could be particularly useful in diagnosing MS in OCGB positive patients. GFAP is highly specific for astrocytic damage, and, as astrogliosis is a central component in MS pathogenesis, it is perhaps unsurprising that this protein should be highlighted in this analysis. However, it is important to note that other conditions have also been shown to be associated with increased GFAP levels in the CSF. For example, CSF GFAP levels are significantly elevated during acute NMO exacerbations (29) and after trauma (30). GFAP levels also increase with age, but, following an adjustment for age, MS patients have been shown to have higher GFAP levels compared with controls, and the adjusted levels correlate with neurological disability and disease progression (31). In studies on early MS, it has been reported that there are no significant differences in CSF GFAP levels between CIS and RRMS, but that GFAP does seem to be a good biomarker for highly active CNS inflammation in patients with CIS and RRMS (32). This would appear to highlight the need for a small panel of biomarkers rather than relying on one or another to aid in a diagnosis.
Myo-inositol is a component of all cell membranes and oligodendrocyte myelin and is involved in intracellular signalling in many CNS cell populations. It has been found to be increased in CSF in RRMS/CIS patients compared to healthy controls (33,34), and in the brain of animals with EAE (35). The reduction in the CSF observed here could, therefore, reflect the sum difference between the anabolic and catabolic processes in MS versus other neurological disease where a relative loss in MS is present.
Interestingly, myo-inositol levels can be used to accurately discriminate between RRMS and antibody-mediated NMOSD (36), where the further reduction in myo-inositol in antibody-mediated NMOSD may reflect demyelination and increased loss of astrocyte membrane.
Hierarchical clustering revealed four highly correlated groups of proteins, but these did not fit with any known disease-associated clusters. Pathway analysis revealed increased JAK-STAT signalling and glycolysis pathways in the MS cohort. The JAK-STAT pathway is activated by many cytokines, and its activation is key in almost all immune responses. The increase in glycolysis would be consistent with increased energy metabolism associated with inflammation.
For patients who are incorrectly diagnosed with MS, 50% have been found to carry the misdiagnosis for at least 3 years, and more than 5% were found to be misdiagnosed for over 20 years (37). This can lead to the administration of inappropriate and potentially harmful disease modifying therapies. Our results have shown how the inclusion of a small number of additional laboratory tests complements the high sensitivity of OCGB by increasing specificity and PPV and, thus, could significantly improve confidence in an MS diagnosis when DIT cannot be confirmed either radiologically or clinically. NMR metabolomics is a cheap and rapid analysis method, requiring minimal sample preparation and with the advantage that a host of additional small molecules can be quantified in the same analysis. Indeed, we have shown that NMR metabolomics is able to identify relapse (38), predict conversion (16), and diagnose progression (39) in MS and so, in future, it may be possible to apply multiple tests to a single sample. Future work will develop a chip-based assay to measure only the top biomarkers identified here in the hope of improving the translatability of this method. However, as CSF is routinely collected as part of the 2017 McDonald criteria, the addition of a small number of biomarker measurements to complement and improve the specificity of the already measured OCGB would be of benefit in a clinical setting. One limitation of this study is that the cohort came from a single site and was limited in size (n=105). A further prospective study in a larger independent cohort of early MS/clinically isolated syndrome patients, collected across multiple centres and focused on the principal metabolite and protein biomarkers identified here, is now warranted. Should the results be conserved in a larger and broader population, the use of these markers in addition to OCGB could provide a valuable new diagnostic test for the presence of MS.

DATA AVAILABILITY STATEMENT
The datasets presented in this article are not readily available because they contain identifiable information from human subjects that cannot be shared in open access repositories for legal reasons. Anonymized data will be shared upon request from any qualified investigator.
Requests should be directed to the corresponding authors.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by University Hospital Basel local ethics committee. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
FP was involved in the design/conception of the study, performed the NMR data acquisition, developed in-house R scripts, performed and interpreted the analysis of the results, and drafted the manuscript. TY was involved in the design/conception of the study, preparation of samples for NMR data acquisition, interpretation of results, and contributed to writing the manuscript. YZ contributed to sample preparation and NMR data acquisition. MS contributed to sample preparation and NMR data acquisition. SA contributed to data analysis design and preparation of the manuscript. JP was involved in the design/conception of the study and interpretation of results. TC was involved in acquisition and interpretation of NMR data. RH acquired the proteomics data. JO was involved in patient recruitment, clinical data acquisition, interpretation of results, and manuscript preparation. JK was involved in the design/conception of the study, was involved in clinical data acquisition, interpretation of results, and manuscript preparation. DL was involved in the design/conception of the study, was involved in clinical data acquisition, interpretation of results, and manuscript preparation. DA was involved in the design/conception of the study, was involved in clinical data acquisition, interpretation of results, and was a major contributor to writing the manuscript. All authors contributed to the article and approved the submitted version.