Identification of crucial inflammaging related risk factors in multiple sclerosis

Background Multiple sclerosis (MS) is an immune-mediated disease characterized by inflammatory demyelinating lesions in the central nervous system. Studies have shown that inflammation is vital to both the onset and progression of MS, and that aging plays a key role in this process. However, the mechanisms by which aging-related inflammation (inflammaging) promotes MS are not fully understood. There is therefore an urgent need to integrate the mechanisms linking inflammaging and MS, for which meaningful prediction models are required. Methods First, aging and disease models were developed separately using machine learning methods. Then, an integrated inflammaging model was used to identify related risk factors by identifying essential “aging-inflammation-disease” triples. Finally, a series of bioinformatics analyses (including network analysis, enrichment analysis, sensitivity analysis, and pan-cancer analysis) were used to explore the potential mechanisms linking inflammaging and MS. Results A series of risk factors were identified, such as protein homeostasis, cellular homeostasis, neurodevelopment and energy metabolism. The inflammaging indices were further validated in different cancer types. Various risk factors were thereby integrated, and the theories of both inflammaging and immunosenescence were further supported. Conclusion Our study systematically investigated the potential relationships between inflammaging and MS through a series of computational approaches, and may offer a new perspective for other aging-related diseases.


Introduction
Multiple sclerosis (MS) is an immune-mediated disease characterized by inflammatory demyelinating lesions in the central nervous system (CNS). MS is one of the major causes of disability (GBD 2016 Neurology Collaborators, 2019), leading to a heavy burden on families and society (Wang et al., 2023). It has been estimated that the number of people with MS increased to 2.8 million globally in 2020, 30% greater than that in 2013 (Walton et al., 2020). Therefore, it is imperative to study the underlying risk factors of MS.
There is substantial epidemiological, pathological and clinical evidence that chronological age is the factor most vital to MS and that the development of MS is closely related to aging (Graves et al., 2023). For example, telomere attrition is associated with disability and brain atrophy in MS patients (Krysko et al., 2019), and reproductive aging might also affect MS progression (Graves et al., 2018). Moreover, aging microglia create a chronic inflammatory microenvironment that impairs remyelination (Neumann et al., 2019), and aging astrocytes impair synaptic plasticity and disturb neuronal metabolic homeostasis (Correale and Farez, 2015; Oost et al., 2018). In short, there is growing evidence that aging promotes the development of MS.
MS is a chronic inflammatory disease closely related to the aging process, and aging-related inflammation is often termed inflammaging (Xia et al., 2016; Cantuti-Castelvetri et al., 2022). The main pathological hallmark of MS is the demyelinating plaque, which is accompanied by chronic inflammation (Howe et al., 2007; Lemus et al., 2018). It has been reported that aging promotes neuroinflammation in MS and diminishes the ability of microglia to respond to axonal deficits (Mestre et al., 2021). Moreover, senescent microglia are characterized by reduced migration and phagocytosis, indicating that they are less efficient at removing myelin debris from damaged neurons in MS (Neumann et al., 2009). Furthermore, ongoing neuroinflammation is associated with neuronal death and thus damages neuronal health (Simkins et al., 2021). During inflammatory CNS episodes, several types of neurotoxic oxidation products are synthesized, leading to increased energy demands (Mahad et al., 2009; Haider et al., 2011). Taken together, aging-related inflammation (inflammaging) can be regarded as one of the major risk factors in MS and needs to be explored more systematically.
Fortunately, with the development of artificial intelligence, many studies on MS have used machine learning (ML) for diagnosis and prognosis on real datasets (Aslam et al., 2022). In addition, ML techniques offer new insights into the diagnosis, characterization and prediction of disease progression (Jasperse and Barkhof, 2023). Several studies have shown that ML can recognize key markers associated with inflammation and aging (Mezzaroba et al., 2020; Zhou et al., 2023). Meanwhile, there is an urgent need to integrate key biomarkers with biological information, e.g., by Mendelian randomization (MR) (Yuan et al., 2021; Li C. et al., 2022). In addition, gene co-expression network analysis can identify highly correlated gene clusters and explore their potential molecular mechanisms in MS (Creanza et al., 2016; Gu et al., 2022). However, despite the large number of studies on risk factors in MS, the mechanisms of MS in the context of inflammaging remain unclear and need to be further explored at the system level.
To further explore the potential mechanisms involved in aging, inflammation and MS, a series of computational methods were integrated in this work (Figure 1): (1) Machine learning was used to identify aging and disease (MS) markers, respectively; (2) An integrated inflammaging model was used to explore the key relationship between inflammaging and MS, by identifying essential "aging-inflammation-disease" triples; (3) Network analysis, sensitivity analysis and enrichment analysis were used to study potential risk factors for multiple sclerosis; (4) Pan-cancer analysis was used to further validate the related biological functions in cancers based on "aging-inflammation-disease" triples. Ultimately, a series of underlying mechanisms of MS (i.e., protein homeostasis, cellular homeostasis, neurodevelopment, and energy metabolism) were integrated, which also provided key indicators for cancer. These results may also offer a new perspective for experimental validation in other aging-related settings.

Results
Modeling prediction models and identifying related biomarkers
Gene expression profiles were obtained from the GEO database, including 445 samples (Supplementary Table S1) and 16,275 genes (Supplementary Table S2). These genes were ranked by the ReliefF algorithm, and the aging predictor and disease predictor were then modeled using the k-nearest neighbors (kNN; k = 9 with the correlation distance) algorithm and optimized by 10-fold cross-validation. The learning curves for the aging and disease models in the training dataset are shown in Figures 2A, B, where the models with the highest accuracy were selected (Table 1), comprising 70 aging markers and 19 disease markers (Supplementary Tables S3, S4). The accuracies of the aging model and the disease model in the test set were 0.8390 and 0.7233, respectively (Table 1). Furthermore, the areas under the curve (AUCs) for the aging and disease models in the test set were 0.73672 and 0.64063, respectively (Figures 2C, D). These AUCs summarize the specificity (the accuracy for the normal old samples in the aging model, or for the MS samples in the disease model) and the sensitivity (the accuracy for the normal young samples in the aging model, or for the control samples in the disease model) across the ReliefF ranking results (i.e., the first gene, the first two genes, the first three genes, ..., the first 100 genes). In addition, the AUCs of the ROC curves based on the aging score and the disease score were 0.74382 and 0.84711, respectively (Figures 2E, F). These results indicated that the predictors achieved acceptable accuracy in both the aging and disease models.
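As an illustration of how an AUC can be obtained from a continuous score such as the aging or disease score, the following minimal sketch computes the rank-based (Mann-Whitney) form of the ROC AUC. It is not the authors' code: the function name `roc_auc` and the toy scores are hypothetical, and the score-based ROC curves in Figures 2E, F may have been produced with different software.

```python
def roc_auc(scores, labels):
    """Rank-based ROC AUC.

    Equals the probability that a randomly chosen positive sample (label 1)
    has a higher score than a randomly chosen negative sample (label 0);
    ties contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): three positive and three negative samples.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

In the toy data, eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. The pairwise loop is quadratic in the sample count, which is acceptable for a few hundred samples.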
Both aging and disease markers pointed to important biological functions. For example, TSPAN6 (tetraspanin 6, ReliefF weight 0.135) was the top aging marker. TSPAN6 has been reported as a novel regulator of APP-CTF protein homeostasis that prevents APP-CTF degradation through impairment of the autolysosomal pathway (Guix et al., 2017). In addition, TSPAN6 was identified as a regulator of synaptic transmission and plasticity mechanisms, playing a key role in synaptic development and AMPAR transport (Salas et al., 2017; Becic et al., 2022). POM121L9P (POM121 transmembrane nucleoporin like 9, a pseudogene, ReliefF weight 0.051) was the top disease marker. FNDC4 (fibronectin type III domain-containing protein 4, ReliefF weight 0.050) was the second top disease marker, and it has been shown to induce AMP-activated protein kinase (AMPK) phosphorylation and heme oxygenase-1 (HO-1) expression in adipocytes, which in turn suppress inflammation and endoplasmic reticulum stress (Lee et al., 2018). In short, these results indicated potential mechanisms (i.e., neurodevelopment and energy metabolism) linking aging and MS.

FIGURE 1
The workflow of our study.

Identifying essential relationships in MS by the integrated inflammaging model
An integrated inflammaging model was developed to explore the important relationships among aging, inflammation and MS (shown in Section 5.3). Table 2 (Chen and D'Mello, 2010; Wan, 2014; Maridas et al., 2017; Jatczak-Pawlik et al., 2020; Tong et al., 2020; Black et al., 2021; Correale, 2021; Fadul et al., 2021; Sehgal et al., 2021; Atiyah et al., 2023) lists the top ten aging, inflammatory and disease markers, respectively. For example, BMP8A (bone morphogenetic protein 8a), the top aging marker, can achieve anti-adiposity by promoting fatty acid oxidation and inhibiting adipocyte differentiation (Zhong et al., 2023). Here, FNDC4 was also identified as the top disease marker in triples. IDO1 (indoleamine 2,3-dioxygenase 1), the top inflammatory marker, is a key enzyme in the metabolism of L-tryptophan (Trp), shifting the process from serotonin production to kynurenine production (Correale, 2021). The roles of the kynurenine pathway include endogenous regulation of neuronal excitability, initiation of immune tolerance and synthesis of nicotinamide adenine dinucleotide (NAD), where NAD+ is a key molecule in a variety of biochemical processes (Mbongue et al., 2015; Zhong et al., 2023). In summary, the integrated inflammaging model revealed important relationships among aging, inflammation and MS.
The top sensitive (with occurring times) aging, inflammatory and disease markers were also shown in


Underlying inflammaging mechanisms by enrichment analysis
To further explore potential underlying mechanisms between inflammaging and MS, each shortest path between inflammatory and disease markers was obtained with the Dijkstra algorithm, and enrichment analysis was then performed based on Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and Biological Process (BP) terms in Gene Ontology (GO). Because the  (Lasky, 1991; Reichardt, 2006; Fujita et al., 2009; O'Callaghan et al., 2017; Conway, 2018; Gao et al., 2018; Kotelnikova et al., 2019; Plantone et al., 2019; Ten Bosch et al., 2021; Bahadoram et al., 2022; Bohmwald et al., 2022; Sen et al., 2022; Wu and Zhou, 2022; Chen et al., 2023; Liu et al., 2023; Touil et al., 2023; Guerra-Espinosa et al., 2024) and Figure 3 and Supplementary Figure S1 (without the starting point). For example, the KEGG pathway enriched in the largest number of shortest paths was "B CELL RECEPTOR SIGNALING PATHWAY" (50 enriched shortest paths, Figure 3A and Supplementary Figure S1A). The B cell receptor (BCR) is critical for B cells to properly elicit an immune response (Tanaka and Baba, 2020). The KEGG pathway with the minimum FDR was "ALDOSTERONE REGULATED SODIUM REABSORPTION" (FDR = 0.0902, Figure 3C and Supplementary Figure S1C), where sodium reabsorption occurs in the kidney (Franken et al., 2021). Sodium accumulation might play a critical role in both inflammatory and neurodegenerative processes in MS patients (Zostawa et al., 2017; Huhn et al., 2019). Furthermore, it has been shown that sodium accumulation leads to the release of calcium, which can exacerbate neurological disorders (Yang et al., 2015).
In summary, the enrichment analysis revealed various risk factors in MS, such as inflammation, neurodevelopment and cellular homeostasis.

Network markers identified potential risk markers
Network markers were identified by calculating the betweenness over the shortest path of each "inflammation-disease" pair, and the top 10 markers are shown in Table 7. For example, the top network marker (with the maximum betweenness and a significant permutation result) was TARDBP (transactive response DNA binding protein), and the related network modules (including all the related shortest paths) are shown in Figure 4. It has been reported that TARDBP encodes the intranuclear protein TDP-43 (transactive response DNA binding protein of 43 kDa), which plays a role in the cellular stress response (Higashi et al., 2013). Stress granules are cytoplasmic foci that respond to cellular stress, and TDP-43 binds to ribosomes in stress granules, temporarily halting translation and promoting cytoprotective protein synthesis (Higashi et al., 2013; Baradaran-Heravi et al., 2020; Meneses et al., 2021). In addition, TARDBP is a risk factor for amyotrophic lateral sclerosis, frontotemporal dementia, and Alzheimer's disease, exacerbating cognitive impairment (Manohar et al., 2009; Meneses et al., 2021). In short, potential crucial risk markers were further identified by network analysis.
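The idea of ranking genes by how often they lie on shortest paths between inflammatory and disease markers can be sketched as follows. This is an illustrative reading of the procedure rather than the authors' implementation: the graph, the edge weights, the node names (`INF1`, `HUB`, `DIS1`, ...) and the choice to count only intermediate nodes on one shortest path per pair are all assumptions, and the permutation test mentioned in the text is not included.

```python
import heapq
from collections import Counter


def add_edge(graph, u, v, w=1.0):
    """Add an undirected weighted edge to an adjacency-dict graph."""
    graph.setdefault(u, {})[v] = w
    graph.setdefault(v, {})[u] = w


def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns one shortest path as a node list, or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def path_betweenness(graph, sources, targets):
    """Count how often each intermediate node appears on source-target shortest paths."""
    counts = Counter()
    for s in sources:
        for t in targets:
            if s == t:
                continue
            path = shortest_path(graph, s, t)
            if path:
                counts.update(path[1:-1])  # exclude the two endpoints
    return counts


# Toy network (hypothetical node names and weights).
toy_graph = {}
for u, v, w in [("INF1", "HUB", 1.0), ("INF2", "HUB", 1.0),
                ("HUB", "DIS1", 1.0), ("HUB", "DIS2", 1.0),
                ("INF1", "X", 1.0), ("X", "DIS1", 3.0)]:
    add_edge(toy_graph, u, v, w)

toy_counts = path_betweenness(toy_graph, ["INF1", "INF2"], ["DIS1", "DIS2"])
```

In this toy network all four inflammation-disease shortest paths pass through `HUB`, so it receives the highest count, while `X` lies on none of them.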
Pan-cancer analysis further validated the mechanism of inflammaging in MS
Pan-cancer analysis was used to further validate the relevant functions of "aging-inflammation-disease" triples. For example, the markers in triples were used to assess the survival indices across different cancer types. Significant results were obtained for 9 of 16 cancer types (BLCA, HNSC, KIRC, KIRP, LIHC, LUAD, LUSC, READ and UCEC; Figure 5). These results suggested that inflammaging markers could also serve as related risk factors for cancer. In addition, based on enrichment analysis, both the commonality and specificity across the 16 cancer types were further investigated. The top 10 common KEGG pathways are shown in Figure 6A, with "CALCIUM SIGNALING PATHWAY" having the highest enrichment score. A growing body of research suggests that calcium homeostasis contributes to well-known cancer-causing signals. Many studies have emphasized that calcium signaling contributes to the progression of several cancer types (e.g., glioma, prostate, and breast) through the activation of STAT3 (Wu et al., 2021). Meanwhile, calcium channels play an important role in the excitation and propagation of neuronal action potentials (Pourtavakoli and Ghafouri-Fard, 2022). The top 10 common BP terms are shown in Figure 6B, with "NERVOUS SYSTEM PROCESS" (GO:0050877) having the highest enrichment score, indicating the key role of the nervous system in cancer (Hanahan and Monje, 2023). Supplementary Table S9 (Peterson et al., 2020; Feng et al., 2021; Glorieux and Buc Calderon, 2021; Hou et al., 2021; Naghshi et al., 2021) and Supplementary Table S10 (Martens and Mithöfer, 2005; Turner and Grose, 2010; Menezes et al., 2018; Arneson and Doles, 2019; Przygodzka et al., 2019; Keough and Monje, 2022; Ohkuni et al., 2022; Libretti and Aeddula, 2023; Lustberg et al., 2023) summarize the specific enrichment results in each cancer, indicating that a series of risk factors (such as neurodevelopment and cellular homeostasis) are also crucial to cancer.
In summary, our findings highlighted a series of key functions associated with inflammaging that could also be used to investigate potential mechanisms in cancer.

Discussion
The inflammatory response plays a crucial role in MS. However, the important relationships among aging, inflammation and MS remain to be explored in depth. In this paper, a series of computational methods were used to explore these relationships and the related mechanisms in MS. First, both aging and disease predictors were modeled to identify the related aging and disease markers, respectively. Then, an integrated inflammaging model was developed to find important "aging-inflammation-disease" triples. Further, the potential mechanisms between inflammation and MS were investigated using network analysis, sensitivity analysis, enrichment analysis and pan-cancer analysis. In short, various risk factors in MS were integrated at the system level.
Our findings emphasized that protein homeostasis is vital to the development of MS. For example, the disease marker FNDC4 can lead to AMP-activated protein kinase (AMPK) phosphorylation (Lee et al., 2018). Moreover, in the MCMC (Table 3), the most sensitive marker of aging was TMPRSS13, which plays a key role in proteolytic activity and phosphorylation (Martin et al., 2021). In the enrichment analysis, one enriched KEGG pathway was "COMPLEMENT AND COAGULATION CASCADES," which is associated with protein catabolism (Conway, 2018). In summary, our results also confirmed that protein homeostasis plays an important role in MS by interacting with the immune system, even accelerating the progression of MS (Negrotto and Correale, 2017).
Cellular homeostasis was also highlighted in this work. For example, the top network marker was TARDBP (Figure 4), which encodes the intranuclear protein TDP-43 that plays a role in the cellular stress response (Higashi et al., 2013). Moreover, in the MCMC (Table 3), the inflammatory marker CUEDC2 was involved in cell cycle regulation (Xiao et al., 2019). According to the enrichment analysis, one of the most enriched KEGG pathways was "MAPK SIGNALING PATHWAY." The MAPK pathway is associated with cell proliferation, differentiation, migration, senescence, and apoptosis (Sun et al., 2015). Furthermore, lytic cell death pathways (such as pyroptosis, necroptosis, ferroptosis, and PANoptosis) are closely related to neuroinflammation and even exacerbate MS (Lee et al., 2023). This demonstrated the key role of cellular homeostasis in MS.
Neurodevelopment played an important role in the development of MS. For example, the top aging marker TSPAN6 is a regulator of synaptic transmission and plasticity mechanisms (Salas et al., 2017). The inflammaging model identified a series of inflammatory markers associated with neuronal formation in MS (Table 2): IDO1 has a role in regulating neuronal excitability (Correale, 2021); SLC18A2 is neuroprotective and PNMA1 promotes neuronal apoptosis (Chen and D'Mello, 2010; Black et al., 2021). In the enrichment analysis, the KEGG pathways "NEUROTROPHIN SIGNALING PATHWAY" and "NEUROACTIVE LIGAND RECEPTOR INTERACTION" also highlighted neurodevelopment (Reichardt, 2006; Bohmwald et al., 2022). MS is a well-known neuroinflammatory disease, in which neuronal damage is vital to the progression of MS lesions (Schirmer et al., 2019). In sum, our results highlighted the role of neurodevelopment in MS.
Energy metabolism was also involved in the development of MS. For example, in the integrated inflammaging model (Table 2), the aging marker BMP8A enables anti-adiposity by promoting fatty acid oxidation and inhibiting adipocyte differentiation (Zhong et al., 2023), and the inflammatory marker IGFBP4 is an important regulator of adipose tissue development (Maridas et al., 2017). In addition, in the MCMC (Table 4), the inflammation marker SHPK catalyzes the phosphorylation of sedoheptulose in the non-oxidative arm of the pentose phosphate pathway (Franceschi et al., 2022). Furthermore, in the enrichment analysis (Supplementary Table S6), the KEGG pathway with the minimum FDR was "NICOTINATE AND NICOTINAMIDE METABOLISM" (FDR = 0.001626), which produces the biologically active coenzyme NAD (and its phosphate analog NADP) (Gasperi et al., 2019). A series of pathogenic processes in MS are also accompanied by energy failure of the CNS (Park and Choi, 2020). These results suggested that energy metabolism is closely related to MS progression.
It is well known that MS is a chronic inflammatory disease that is closely related to the aging process. Herein, various risk factors for MS were explored based on aging-related inflammation (inflammaging). In this work, a series of computational methods were used to investigate potential molecular mechanisms in MS. An inflammaging model was constructed to obtain the "aging-inflammation-disease" triples, and crucial inflammaging characteristics in MS were then identified. In addition, these results could guide further experimental validation. In short, the complex mechanisms in MS could be further studied by exploring key inflammaging indices, where various risk factors were integrated at the system level.
Strikingly, the identified inflammaging characteristics in MS (i.e., the inflammaging markers, enriched KEGG pathways or BP terms, shown in Tables 2-6) have been validated by a series of related experimental results. For example, inflammaging could alter the transport capacity of B cells, making them more sensitive to cytokines and pro-inflammatory molecules, which are overproduced in the elderly (Bulati et al., 2014). Recently, flow cytometry has demonstrated that the combination of pro-inflammatory interleukin-21 (IL-21) and B-cell receptor (BCR) stimulation enables B cells to produce and secrete the active form of the cytotoxic serine protease granzyme B (GrB), which might exacerbate MS progression (Niland et al., 2010; Bulati et al., 2014). Further, the coagulation pathway was also identified in this work and has been confirmed by other experimental results in MS (using animal models, single-cell RNA sequencing, or flow cytometry). It has been reported that the coagulation cascade increases neuroinflammation during the aging process, thus interacting with a series of physiological factors such as neuronal deficits, oxidation, or dysfunction of the endoplasmic reticulum and mitochondria, which in turn contribute to the onset of MS (Conway, 2018; Plantone et al., 2019). In addition, enzyme-linked immunosorbent assay (ELISA) indicated that "bovine serum albumin (BSA)-advanced glycation end product (AGE)" enhanced IL-6 expression through MAPK-ERK action (MAPK-ERK and MyD88 transduced NF-κB signaling pathways), and studies (both in vitro and in vivo) have demonstrated that IL-6 plays a crucial role in regulating the immune response in MS (Janssens et al., 2015; Shen et al., 2019). Epstein-Barr virus (EBV) infection has also been reported to increase the risk of developing MS approximately 32-fold (Bjornevik et al., 2022). EBV infection of B cells and T cells leads to infiltration of the CNS by infected B cells and to T cell exhaustion, where CD8 T cell deficiency contributes to the decreased CD8 T cell response to EBV-infected B cells, together with functional decline in aged MS patients (Pender et al., 2012; Soldan and Lieberman, 2023). Notably, the perpetuation of "forbidden" autoreactive B-cell clones by EBV immortalization has been suggested as a potential mechanism for triggering MS (Pender, 2011). For example, in the context of inflammaging and immunosenescence, a model of EBV-immortalized B lymphocytes has been shown to produce higher levels of IL-6, which is associated with the pathogenesis of MS (Olivieri et al., 2003; Janssens et al., 2015). In short, a series of key risk factors in MS were identified based on inflammaging and could be confirmed by related experiments.
FIGURE 4
The top network marker with the maximum betweenness. (A) Before sensitivity analysis; (B) after sensitivity analysis.

Studies have shown that the risk of cancer is increased in people with MS (Ragonese et al., 2017; Bosco-Lévy et al., 2022).
Additionally, the key roles of inflammaging markers in different cancer types were further confirmed by survival analysis. Inflammation is increasingly recognized as an important factor impairing normal functions in the CNS, which in turn affects both cancer and MS (Deverman and Patterson, 2009; Jiang et al., 2018). In addition, chronic inflammation disrupts cellular homeostasis, which plays an important role in the development of both MS and cancer (Kotas and Medzhitov, 2015). In short, various risk factors associated with inflammaging have also been demonstrated in cancer.
As with other research articles on MS (Denissen et al., 2021; Aslam et al., 2022) and other neurodegenerative diseases [e.g., Alzheimer's disease (Chang et al., 2021; Li J. et al., 2022) and Parkinson's disease (Boutet et al., 2021; Oliveira et al., 2023)], machine learning was used to build high-accuracy models or predictive biomarkers, which were then subjected to enrichment analysis, network analysis and so on. In addition, our study built an integrative model based on machine learning to further explore the underlying mechanisms of MS in the context of inflammaging. As a result, a series of related key risk factors were summarized at the system level and even validated across different cancer types. This supports the reliability and accuracy of our results.
According to the inflammaging theory, chronic inflammation accumulates during the aging process, along with a series of dysregulated pathways (Fang et al., 2018). In addition, immunosenescence is accompanied by a series of molecular dysfunctions in both the innate and adaptive immune systems, and even interacts with aging (Rodrigues et al., 2021; Liu et al., 2023). Both inflammation and aging are well known to affect microglia and astrocytes, which in turn impair normal neurons (Neumann et al., 2019; Kwon and Koh, 2020; Diaz-Castro et al., 2021). Inflammation also affects protein metabolism and cellular homeostasis (Antonangeli et al., 2021; Cibrian et al., 2022). In addition, these risk factors interact with each other to promote the development of MS. For example, dysregulation of cellular homeostasis can interact with protein homeostasis, energy metabolism, etc., which in turn aggravates MS progression (Huang et al., 2022). With the help of the integrated inflammaging model, our study highlighted a series of risk factors closely related to inflammaging in MS, such as protein homeostasis, cellular homeostasis, neurodevelopment and energy metabolism. These results also further supported the theories of both inflammaging and immunosenescence (Figure 7). In short, we integrated the potential mechanisms of MS in the context of inflammaging (Figure 7), and our work offers a new perspective for studying the related molecular mechanisms.

Conclusion
In this study, machine learning was used to construct models for predicting aging and disease (MS) and to identify the related biomarkers. The important relationship between inflammaging and MS was further explored by building the integrated inflammaging model. The related inflammaging characteristics in MS patients were investigated holistically through sensitivity, enrichment, network and pan-cancer analyses. In summary, our study integrated protein homeostasis, cellular homeostasis, neurodevelopment and energy metabolism as risk factors in MS based on inflammaging indices, and offers a new perspective for other aging-related diseases.
The gene expression profiles were processed as follows: (1) Only the samples with both the age and phenotype indices (MS or control) were retained; otherwise, they were excluded. (2) The gene expression matrix for each dataset was integrated by summarizing the probes mapped to the same gene symbol. (3) The total data matrix was integrated, and the missing gene expression values were filled with values of 0. (4) Data processing was performed on the summary matrix to remove genes with ≥30% missing values. As a result, a total of 445 samples were obtained (Supplementary Table S1), including 66 samples of healthy aged people (aged ≥ 50 years; 45 in the training dataset + 21 in the test dataset), 118 samples of healthy young people (aged < 50; 80 + 38), 94 samples of MS aged people (aged ≥ 50; 65 + 29) and 167 samples of MS young people (aged < 50; 115 + 52), containing 16,275 gene symbols (Supplementary Table S2). Further, comparison results for the inter-sample normalization step (i.e., z-score, SVD, and another z-score) are shown in Supplementary Figure S2, including boxplots and scatter plots. These results indicated that the normalization handled profiles from different platforms effectively: the profiles clustered by dataset before the normalization, were comparable after the normalization, and were even distinguishable between phenotypes (i.e., MS or control) when combined with machine learning methods.
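The missing-value handling in steps (3) and (4) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: missing values are represented as `None`, the missing fraction is computed before any filling (the text lists filling first, so the exact order used by the authors is an assumption here), and the gene names and function name `filter_and_fill` are hypothetical.

```python
def filter_and_fill(matrix, max_missing=0.30, fill=0.0):
    """Drop genes with too many missing values, then fill the remaining gaps.

    matrix: dict mapping gene symbol -> list of expression values per sample,
            with None marking a missing value.
    Genes whose missing fraction is >= max_missing are removed; remaining
    missing values are replaced by `fill`.
    """
    kept = {}
    for gene, values in matrix.items():
        missing = sum(v is None for v in values)
        if missing / len(values) >= max_missing:
            continue  # gene removed: at least 30% of samples lack a value
        kept[gene] = [fill if v is None else v for v in values]
    return kept


# Toy matrix (hypothetical genes, four samples).
toy_matrix = {
    "G1": [1.0, 2.0, None, 4.0],   # 25% missing -> kept, gap filled with 0
    "G2": [None, None, 3.0, 1.0],  # 50% missing -> removed
    "G3": [2.0, 2.0, 2.0, 2.0],    # complete -> kept unchanged
}
toy_clean = filter_and_fill(toy_matrix)
```

Computing the missing fraction before filling keeps the 30% threshold meaningful, since after zero-filling a missing value can no longer be told apart from a measured zero.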
We also obtained paired gene expression (RNAseq) profiles ("Batch effects normalized mRNA data") and clinical data from the TCGA database through the Xena platform (https://xenabrowser.net/hub/). Cancer types with ≥10 adjacent normal samples were retained. There were 16 cancer types included in this study: BLCA .

Modeling the aging model and disease model
The ReliefF algorithm was used to select key features, and then the first 100 models were studied to train the predictors.The optimal model was selected by 10-fold cross-validation.To verify the accuracy of the aging predictor, the selected model was verified in the test dataset.
(1) In the aging model, the normal aged group (age ≥ 50) was labeled 1, and the young healthy group (age < 50) was labeled 0; in the disease model, the MS group was labeled 1, and the control group was labeled 0.
(2) The ReliefF algorithm was used to sort 16,275 genes for the aging and disease models.
(3) The predictor of the model was used to select the key markers with the help of the k-nearest neighbors (kNN, k = 9, correlation) algorithm. The model with the highest accuracy was also selected with the help of 10-fold cross-validation, where the identified features were considered aging markers and disease markers.
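Steps (1)-(3) above can be sketched as a search over nested feature sets. The sketch below is an illustrative assumption rather than the authors' code: it takes the feature columns as already ranked (for example by ReliefF, which is not implemented here), uses a plain-Python kNN with correlation distance (one minus the Pearson correlation) and k = 9, and scores each "first n features" model by 10-fold cross-validation. All function names and the toy data are hypothetical, and the search starts at two features because the correlation distance is undefined for a single feature.

```python
import random
from math import sqrt


def correlation_distance(a, b):
    """1 - Pearson correlation between two feature vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [x - mb for x in b]
    num = sum(x * y for x, y in zip(da, db))
    den = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return 1.0 - num / den if den > 0 else 1.0


def knn_predict(train_X, train_y, x, k=9):
    """Majority vote among the k training samples closest to x."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: correlation_distance(train_X[i], x))[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def cv_accuracy(X, y, n_features, k=9, folds=10, seed=0):
    """Cross-validated accuracy using only the first n_features (pre-ranked) columns."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        tX = [X[i][:n_features] for i in train]
        ty = [y[i] for i in train]
        for i in test:
            correct += knn_predict(tX, ty, X[i][:n_features], k) == y[i]
    return correct / len(X)


def select_model(X, y, max_features, **kwargs):
    """Evaluate the first 2..max_features ranked features and return the best size."""
    scores = {n: cv_accuracy(X, y, n, **kwargs) for n in range(2, max_features + 1)}
    best = max(scores, key=scores.get)
    return best, scores


# Toy data (hypothetical): label 1 rows rise across features, label 0 rows fall.
toy_X = ([[1.0 + 0.01 * i, 2.0, 3.0 + 0.02 * i] for i in range(20)]
         + [[3.0 + 0.01 * i, 2.0, 1.0 - 0.02 * i] for i in range(20)])
toy_y = [1] * 20 + [0] * 20
toy_best, toy_scores = select_model(toy_X, toy_y, max_features=3)
```

When several feature counts tie on accuracy, this sketch keeps the smallest one; how ties were broken in the original analysis is not stated in the text.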
As a result, the optimal k-nearest neighbor (kNN, k = 9, correlation) algorithm was used, and a total of 70 aging markers and 19 disease markers were identified. In addition, these markers could be summarized as the aging score and disease score for further analyses (i.e., comparison in Section 5.1 or sensitivity analysis in Section 5.4).

Identifying essential relationships in MS by integrating inflammaging models
An integrated inflammaging model was built to identify the essential relationships among aging, inflammation and MS. The computational pipeline was similar to MR, although it was not as rigorous as MR (Burgess et al., 2020, 2023).
In this model, the aging-related inflammatory markers were considered inflammaging markers, and the candidate aging/disease markers identified in Section 5.2 were further selected according to their relationship with these inflammaging markers. Ultimately, the essential relationships among aging, inflammation, and disease (MS) markers were identified as key "aging-inflammation-disease" triples in MS.
Here, the aging markers were used as the auxiliary variables (similar to the instrumental variables in MR), and the inflammatory markers were used as the candidate risk factors. Then, inflammatory ("inflammaging") markers were identified as the risk factors, and disease markers were used as the outcome variables. That is, the integrated inflammaging model aimed to explore the essential relationships among aging, inflammation and disease markers in MS. The objectives of the "aging-inflammation-disease" triples were as follows:
(1) There was a correlation between the aging marker and the inflammatory marker.
(2) There was a correlation between the inflammatory marker and the disease (MS) marker.
(3) There was a correlation between the aging marker and the disease (MS) marker.
(4) There was a strong correlation between the aging marker and the disease marker when considered through the inflammatory marker.
The methodological steps of the model were as follows: (1) where the phenotype could be defined as 1 (MS) or 0 (control).Furthermore, both a p-value < 0.05 and a Benjamini-Hochberg false discovery rate (FDR) < 0.1 were used to select strongly correlated aging markers.
(3) The correlation (differential co-expression) was used to select inflammatory markers that strongly correlated with disease markers with the help of the Kruskal-Wallis test. Here, the differential co-expression was calculated as follows: where the phenotype could be defined as 1 (MS) or 0 (control). Furthermore, both a p-value < 0.05 and a Benjamini-Hochberg false discovery rate (FDR) < 0.1 were used to select strongly correlated inflammatory markers.
(4) The correlation (differential co-expression) was used to select disease markers that strongly correlated with aging markers with the help of the Kruskal-Wallis test. Here, the differential co-expression was calculated as follows: where the phenotype could be defined as 1 (MS) or 0 (control). Furthermore, both a p-value < 0.05 and a Benjamini-Hochberg false discovery rate (FDR) < 0.1 were used to select strongly correlated disease markers.
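The double threshold used in steps (2)-(4) (p-value < 0.05 and Benjamini-Hochberg FDR < 0.1) can be sketched as follows. This is a minimal illustration, not the authors' code; the function names and the example p-values are hypothetical, and the Kruskal-Wallis p-values are assumed to have been computed beforehand.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_markers(names, p_values, p_cut=0.05, fdr_cut=0.1):
    """Keep markers passing both the raw p-value and the FDR threshold."""
    fdr = benjamini_hochberg(p_values)
    return [n for n, p, q in zip(names, p_values, fdr)
            if p < p_cut and q < fdr_cut]


# Hypothetical p-values for four candidate markers.
names = ["g1", "g2", "g3", "g4"]
p_values = [0.01, 0.04, 0.03, 0.5]
fdr = benjamini_hochberg(p_values)
kept = select_markers(names, p_values)
```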
(5) To filter out the effect of horizontal pleiotropy, the aging-disease relationship was further examined by comparing the correlation between each aging marker and disease marker through the inflammatory marker or otherwise. Here, steps ①-③ were used to calculate the correlations between auxiliary variables and outcome variables without the background of the risk factor, and step ④ was used to calculate the correlations between auxiliary variables and outcome variables with the context of the risk factor.
① The residual of each disease marker ("residual A") was calculated based on the inflammatory marker:
$$\text{residual\_A} = \text{disease\_marker} - b_1 \cdot \text{inflammation\_marker} \quad (6)$$
where b1 was the regression coefficient.
② The residual of each aging marker ("residual B") was calculated based on the inflammatory marker:
$$\text{residual\_B} = \text{aging\_marker} - b_2 \cdot \text{inflammation\_marker} \quad (7)$$
where b2 was the regression coefficient.
③ The abovementioned two residuals were further compared, and the residual of the disease marker was calculated (as "residual C"): where b3 was the regression coefficient.
④ The residual of each disease marker ("residual D") was calculated based on the aging marker.
⑤ The difference (between "residual C" and "residual D") between the MS and control subgroups was tested using the Kruskal-Wallis test (P < 0.05 and FDR < 0.1).
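The residual calculations in step (5) can be sketched as below. This is an illustration under stated assumptions: the regression coefficients are taken as least-squares slopes without an intercept (matching the form of Equation 7), and residual C is assumed to be the residual of residual A regressed on residual B, which the text does not spell out. All function names and the toy vectors are hypothetical; the final Kruskal-Wallis comparison is not shown.

```python
def slope_through_origin(x, y):
    """Least-squares slope b for y ~ b * x (no intercept; an assumption)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def residual(target, predictor):
    """target - b * predictor, with b estimated by least squares."""
    b = slope_through_origin(predictor, target)
    return [t - b * p for t, p in zip(target, predictor)]


def pleiotropy_residuals(aging, inflammation, disease):
    residual_a = residual(disease, inflammation)   # disease given inflammation
    residual_b = residual(aging, inflammation)     # aging given inflammation
    # Assumed form of residual C: residual A regressed on residual B.
    residual_c = residual(residual_a, residual_b)
    residual_d = residual(disease, aging)          # disease given aging only
    return residual_c, residual_d


# Toy example: the disease marker is fully explained by the inflammatory marker.
inflammation = [1.0, 2.0, 3.0, 4.0]
aging = [1.0, 1.0, 2.0, 3.0]
disease = [2.0, 4.0, 6.0, 8.0]
residual_c, residual_d = pleiotropy_residuals(aging, inflammation, disease)
```

In the toy example residual C vanishes while residual D does not, which is the pattern expected when the aging-disease association runs through the inflammatory marker.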
Finally, the essential relationships among aging markers, inflammatory markers and disease markers were determined. In total, 5,599 "aging-inflammation-disease" triples were identified, involving 65 aging markers, 107 inflammatory markers and 19 disease markers. These 107 inflammatory markers were used as inflammaging markers (risk factors), and the 19 disease markers were used to discriminate the MS phenotype.
In addition, the overall differential co-expression pattern among aging, inflammation and disease markers could be calculated based on these triples.
where i and j were the i-th inflammation marker and the j-th aging marker, corr was Pearson's correlation coefficient, and the phenotype could be defined as 1 (MS) or 0 (control). The differential co-expression of an inflammation marker was summarized based on the related aging markers in the triples.
where k and i were the k-th disease marker and the i-th inflammation marker, corr was Pearson's correlation coefficient, and the phenotype could be defined as 1 (MS) or 0 (control). The differential co-expression of a disease marker was summarized based on the related inflammation markers in the triples.

Sensitivity analysis using the MCMC method
To further explore crucial relationships among aging, inflammation and MS, sensitivity analysis was performed based on the Markov Chain Monte Carlo (MCMC) method, where each "aging-inflammation-disease" triple was further evaluated as a candidate relationship. The MCMC method was used to sample posterior distributions in a high-dimensional space based on a given probabilistic background. The key step of MCMC was to construct a Markov chain whose equilibrium distribution was equal to the target probability distribution. The steps were as follows:
(1) Constructing the transition kernel of the ergodic Markov chain. The prior distribution of each parameter was taken to be normal, based on all identified markers in each group (i.e., MS or control), respectively.
(2) Simulating the chains until equilibrium was reached. The Metropolis-Hastings sampling method was used to determine whether the new sample (θ*) was acceptable based on the α value:
$$\alpha = \min\left(1, \frac{P(\theta^{*} \mid X)\, q(\theta^{*} \to \theta_{n})}{P(\theta_{n} \mid X)\, q(\theta_{n} \to \theta^{*})}\right)$$
where P(θ_n | X) and P(θ* | X) were the posterior probabilities of the n-th accepted sample and the new sample, q(θ_n → θ*) was the transition probability from the n-th accepted sample to the new sample, and q(θ* → θ_n) was the transition probability from the new sample to the n-th accepted sample.
In this work, the disease score was used to evaluate the simulated samples, with 1,000 random samples used as candidate samples for each group (i.e., MS or control). The disease score was calculated by comparing the distance between normal and MS training samples based on the 19 disease markers identified by the integrated inflammaging model, using Equation (2).
(3) Performing the global sensitivity analysis. The correlation index was used to evaluate each of the "aging-inflammation-disease" triples in the accepted samples (both MS and control):
$$\text{correlation\_index} = \frac{\text{disease\_marker} - \text{aging\_marker}}{\text{inflammation\_marker} - \text{aging\_marker}} \quad (12)$$
Therefore, the correlation indices were calculated for each "aging-inflammation-disease" triple for all accepted samples. Then, the Kruskal-Wallis test was used to evaluate each correlation index in each "aging-inflammation-disease" triple, where p-value < 0.05 and FDR < 0.1 were set as the threshold. Finally, a total of 35 "aging-inflammation-disease" triples were identified as sensitive relationships, including 16 aging markers, 28 inflammatory markers, and 9 disease markers.
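The Metropolis-Hastings acceptance rule of step (2) and the correlation index of Equation (12) can be sketched as follows. This is a one-dimensional illustration only, assuming a symmetric Gaussian random-walk proposal (so the transition probabilities cancel in α) and a normal target; the function names, step size and target parameters are hypothetical and not taken from the study.

```python
import math
import random
from statistics import mean


def log_normal_pdf(x, mu, sigma):
    """Log-density of a normal distribution."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def metropolis_hastings(log_target, start, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    current = start
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        # Symmetric proposal: q(a -> b) = q(b -> a), so alpha reduces to the
        # ratio of target densities, capped at 1 (0 on the log scale).
        log_alpha = min(0.0, log_target(proposal) - log_target(current))
        if rng.random() < math.exp(log_alpha):
            current = proposal
        samples.append(current)
    return samples


def correlation_index(disease_marker, aging_marker, inflammation_marker):
    """Equation (12): (disease - aging) / (inflammation - aging)."""
    return (disease_marker - aging_marker) / (inflammation_marker - aging_marker)


# Hypothetical target: a normal distribution with mean 2 and unit variance.
chain = metropolis_hastings(lambda v: log_normal_pdf(v, 2.0, 1.0),
                            start=0.0, n_samples=20000)
posterior_mean = mean(chain[2000:])  # discard an initial burn-in segment
```

After burn-in, the chain mean should lie close to the target mean, which is a simple check that equilibrium has been reached.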

Constructing the differential co-expression network
To further reveal potential mechanisms between "inflammaging" and MS, a differential co-expression network was constructed via the following steps:
(1) The Pearson correlation coefficient for each pair of genes was calculated based on the MS and control groups.
(2) The Benjamini-Hochberg FDR method was used to adjust the p-values of the correlation coefficients.
(3) The relationship between each gene pair was retained if the coefficient value in MS had the opposite sign (i.e., + or -) to that in the control, as well as if p < 0.05 and FDR < 0.1.
(4) The shortest path between each pair of inflammaging and disease markers was selected based on the differential co-expression network using the Dijkstra algorithm.
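The network construction and shortest-path search above can be sketched as follows. This is a simplified illustration: the significance filter (p < 0.05, FDR < 0.1) is omitted, every retained edge is assumed to have unit weight (the text does not state the weights), and the gene names and expression values are hypothetical.

```python
import heapq
from statistics import mean


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length vectors."""
    mu, mv = mean(u), mean(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    den = (sum(a * a for a in du) * sum(b * b for b in dv)) ** 0.5
    return 0.0 if den == 0 else sum(a * b for a, b in zip(du, dv)) / den


def differential_edges(expr_ms, expr_ctrl):
    """Keep gene pairs whose correlation has opposite signs in MS and control."""
    genes = sorted(expr_ms)
    graph = {g: set() for g in genes}
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            r_ms = pearson(expr_ms[a], expr_ms[b])
            r_ctrl = pearson(expr_ctrl[a], expr_ctrl[b])
            if r_ms * r_ctrl < 0:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def dijkstra_path(graph, source, target):
    """Shortest path with unit edge weights; None if target is unreachable."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nb in graph[node]:
            nd = d + 1
            if nd < dist.get(nb, float("inf")):
                dist[nb] = nd
                prev[nb] = node
                heapq.heappush(heap, (nd, nb))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Hypothetical expression values per gene within each group.
expr_ms = {"g1": [1, 2, 3], "g2": [1, 2, 3], "g3": [1, 3, 2]}
expr_ctrl = {"g1": [1, 2, 3], "g2": [3, 2, 1], "g3": [1, 3, 2]}
network = differential_edges(expr_ms, expr_ctrl)
path = dijkstra_path(network, "g1", "g3")
```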

Enrichment analysis
The gene functions were further explored by enrichment analysis of the shortest paths. Gene Ontology (GO) terms and KEGG pathways were obtained from the gene set enrichment analysis (GSEA) platform (http://software.broadinstitute.org/gsea/downloads.jsp, version 2023.1). The hypergeometric distribution was used to test the degree of enrichment of the GO BP terms and KEGG pathways. The hypergeometric test formula was:
$$P = 1 - \sum_{i=0}^{k-1} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$
where N was the total number of genes in the gene set, M was the number of known genes (such as the KEGG pathway or BP terms), n was the number of genes identified in each shortest path, and k was the number of common genes between known genes and candidate genes identified in each "inflammation-disease" shortest path. The p-value of each path was controlled using the Benjamini-Hochberg method. Finally, pathways with p < 0.05 and FDR < 0.1 were retained.
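The hypergeometric enrichment test can be sketched as an upper-tail probability. This is a minimal illustration, assuming N is the number of background genes, M the size of the known gene set, n the number of genes on a shortest path (the symbol n is an assumption here) and k their overlap; the function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(N, M, n, k):
    """P(X >= k) when drawing n genes from N, of which M belong to the term."""
    upper = min(M, n)
    total = comb(N, n)
    tail = sum(comb(M, i) * comb(N - M, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical example: 10 background genes, a 5-gene pathway,
# a 2-gene shortest path, both of whose genes fall in the pathway.
p_enriched = hypergeom_enrichment_p(N=10, M=5, n=2, k=2)
p_trivial = hypergeom_enrichment_p(N=10, M=5, n=2, k=0)
```

The resulting p-values would then be adjusted with the Benjamini-Hochberg procedure before applying the p < 0.05 and FDR < 0.1 thresholds.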

Identifying network markers
The subnetwork formed by the shortest paths among the selected "inflammation-disease" pairs was constructed, and genes in the subnetwork were sorted by betweenness in descending order. To test whether the top-betweenness genes were hubs in the background network, we ran a permutation test that counted how often the top genes occurred in the shortest paths between randomly selected genes (containing the same number of "inflammation-disease" pairs, based on the identified "aging-inflammation-disease" triples) with a betweenness greater than that in our study. We repeated this process 1,000 times, and the p-value was calculated as the proportion of such occurrences of the top-betweenness genes in the 1,000 permutations.
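The permutation idea can be sketched in a simplified form. This is not the authors' procedure: betweenness is approximated here by the number of selected pairs whose (single) shortest path passes through a gene, random pairs are drawn uniformly from all nodes, and the graph, gene names and pair list are hypothetical.

```python
import random
from collections import deque


def bfs_path(graph, source, target):
    """One shortest path in an unweighted graph, or None if unreachable."""
    prev = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nb in sorted(graph[node]):
            if nb not in prev:
                prev[nb] = node
                queue.append(nb)
    return None


def occurrence(graph, gene, pairs):
    """Number of pairs whose shortest path has `gene` as an interior node."""
    count = 0
    for s, t in pairs:
        path = bfs_path(graph, s, t)
        if path and gene in path[1:-1]:
            count += 1
    return count


def permutation_p(graph, gene, pairs, n_perm=1000, seed=0):
    """Share of random pair sets in which `gene` occurs at least as often."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    observed = occurrence(graph, gene, pairs)
    hits = 0
    for _ in range(n_perm):
        random_pairs = [tuple(rng.sample(nodes, 2)) for _ in pairs]
        if occurrence(graph, gene, random_pairs) >= observed:
            hits += 1
    return observed, hits / n_perm


# Hypothetical network: H links A-D; D continues to E and F.
graph = {"A": {"H"}, "B": {"H"}, "C": {"H"}, "D": {"H", "E"},
         "E": {"D", "F"}, "F": {"E"}, "H": {"A", "B", "C", "D"}}
pairs = [("A", "B"), ("A", "C"), ("B", "C")]
observed, p_value = permutation_p(graph, "H", pairs)
```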

Pan-cancer analysis
The survival analysis was performed based on the inflammaging markers (identified by the integrated inflammaging model in Section 5.3) for each cancer using the Kaplan-Meier method. The first principal component of the key markers in the triple set was taken for each cancer, and the samples were then categorized into two groups based on its mean value. Then, the Kaplan-Meier method was used to evaluate the survival difference between these two groups, and the significance was estimated by the log-rank test. A p-value < 0.05 was considered statistically significant.
Then, the differential expression networks were constructed for each cancer, where the details were also the same as Section 5.5.As a result, each shortest pathway was selected from each pair of inflammaging markers and differentially expressed genes (as disease markers in cancer) using the Dijkstra algorithm.Furthermore, the enrichment analysis was performed in each cancer (p < 0.05 and FDR < 0.1).
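The grouping and survival estimation described in this section can be sketched as follows. This is a minimal illustration: the per-sample scores stand in for the first principal component (not computed here), the log-rank test is not implemented, and all names and values are hypothetical.

```python
from statistics import mean


def split_by_mean(scores):
    """Label each sample 1 if its score is above the mean, else 0."""
    cut = mean(scores)
    return [1 if s > cut else 0 for s in scores]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each event time.

    events: 1 = event observed (death), 0 = censored.
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical scores and follow-up data.
groups = split_by_mean([0.1, 0.9, 0.2, 0.8])
curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
```

One curve would be estimated per group, and the two curves compared with a log-rank test.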

FIGURE Machine learning results. (A, B) Learning curve for the training dataset; (C, D) sensitivity and specificity (similar to receiver operating characteristic) curves; (E, F) the ROC curve for the test dataset; (A, C, E) the aging model; (B, D, F) the disease (MS) model.

FIGURE Enrichment analysis of the shortest paths for KEGG and BP, after combining overlapping shortest paths. (A) KEGG pathway with the most enriched shortest paths; (B) BP term with the most enriched shortest paths; (C) KEGG pathway with the minimum FDR; (D) BP term with the minimum FDR. The orange nodes represent the inflammaging markers, the blue nodes represent the genes connecting inflammaging markers and disease markers, the green nodes represent the disease markers, and the genes in the red square frames coincide with those genes in the enriched functions.
Although the underlying mechanisms of MS were explored based on inflammaging, the study still had the following shortcomings: (1) This paper used only 445 microarray samples, and single-cell profiles should be investigated in further analysis; (2) Biological experiments still need to be performed to validate the relevant conclusions in human cell lines, given proper permissions; (3) The potential mechanisms of MS were identified based only on inflammaging, without considering other key risk factors (e.g., oxidative stress or neuroendocrine factors). After all, a series of investigations were still needed to further explore underlying mechanisms in MS (or other neuroinflammatory diseases),

FIGURE Enrichment analysis shared by cancers.(A) KEGG pathways enriched in cancers; (B) BP terms enriched in cancers.

FIGURE Summarized mechanisms in multiple sclerosis in the context of inflammaging. Rectangular genes represent aging markers, diamond genes represent disease markers, rhombic genes represent disease markers, and oval genes represent high median network markers. Orange arrows indicate that the gene was associated with neurodevelopment, green arrows indicate that the gene was associated with energy metabolism, blue arrows indicate that the gene was associated with protein homeostasis, pink arrows indicate that the gene was associated with cellular homeostasis, and red arrows indicate that the gene was associated with inflammation.
(5) The gene expression matrix was logarithmically transformed if it contained outliers. (6) Based on the mean and standard deviation of gene expression for the control individuals, z-score normalization was performed for both the MS and control samples. (7) The singular value decomposition (SVD) method was performed to eliminate the inter-sample variation based on the top three principal components of the control samples. (8) The z-score was then utilized to normalize all samples based on the mean and the standard deviation of the control samples. (9) The gene expression profiles were further transformed using the hyperbolic tangent (Tanh) method, so that the values lay between −1 and 1. (10) The training set and the test set were randomly divided at a ratio of approximately 2:1.
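The control-based normalization, Tanh transformation and 2:1 split in steps (6), (9) and (10) can be sketched for a single gene as follows. This is a simplified illustration: the SVD step is omitted, the sample standard deviation is assumed, and the function names and values are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def zscore_by_control(values, control_values):
    """Z-score values using the mean and SD of the control samples only."""
    mu = mean(control_values)
    sd = stdev(control_values)  # sample SD; an assumption
    return [(v - mu) / sd for v in values]


def tanh_transform(values):
    """Squash normalized values into the open interval (-1, 1)."""
    return [math.tanh(v) for v in values]


def split_two_to_one(sample_ids, seed=0):
    """Randomly split samples into training and test sets at about 2:1."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * 2 / 3)
    return ids[:cut], ids[cut:]


# Hypothetical expression values of one gene.
control = [1.0, 2.0, 3.0]
ms = [4.0, 5.0, 6.0]
z_control = zscore_by_control(control, control)
z_ms = zscore_by_control(ms, control)
scaled_ms = tanh_transform(z_ms)
train_ids, test_ids = split_two_to_one(range(9))
```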
, the ROC curve could be designed based on the aging and disease score, respectively.

Table 4 (Sarasin-Filipowicz
TABLE The accuracy of the aging predictor model and disease predictor model.
TABLE The top ten aging, inflammatory, and disease markers from the integrated inflammaging model.
TABLE Sensitivity analysis of the top genes related to aging, disease, and inflammation.
starting point in each shortest path was an inflammatory marker, the enriched functions were analyzed by deleting the starting point, or other enriched functions, excluding inflammation-related functions, were analyzed if the path contained the starting point. The top ten KEGG pathways were shown in Table 5.
It was well known that MS and cancer shared a