The oxidative aging model integrated various risk factors in type 2 diabetes mellitus at system level

Background Type 2 diabetes mellitus (T2DM) is a chronic endocrine metabolic disease caused by insulin dysregulation. Studies have shown that aging-related oxidative stress (“oxidative aging”) plays a critical role in the onset and progression of T2DM by leading to an energy metabolism imbalance. However, the precise mechanisms through which oxidative aging leads to T2DM are not yet fully understood. Thus, the mechanisms linking oxidative aging and T2DM urgently need to be integrated, which requires meaningful prediction models based on the relevant profiles. Methods First, machine learning was used to build the aging model and disease model. Next, an integrated oxidative aging model was employed to identify crucial oxidative aging risk factors. Finally, a series of bioinformatic analyses (including network, enrichment, sensitivity, and pan-cancer analyses) were used to explore potential mechanisms underlying oxidative aging and T2DM. Results The study revealed a close relationship between oxidative aging and T2DM. Our results indicate that nutritional metabolism, inflammation response, mitochondrial function, and protein homeostasis are key factors involved in the interplay between oxidative aging and T2DM, and that they also provide key indices across different cancer types. Therefore, various risk factors in T2DM were integrated, and the theories of oxi-inflamm-aging and cellular senescence were also supported. Conclusion In sum, our study integrated the underlying mechanisms linking oxidative aging and T2DM through a series of computational methodologies.


Introduction
Type 2 diabetes mellitus (T2DM) is a chronic endocrine metabolic disease caused mostly by insulin dysfunction. The increasing prevalence of diabetes has resulted in a great economic burden in many countries (1). According to statistics, there are approximately 536.6 million people with diabetes worldwide, and this number is expected to rise to approximately 783.2 million in 2045, with T2DM accounting for approximately 90% (1,2). Therefore, it is imperative to study the etiology of T2DM in depth.
Various reports have shown that T2DM is closely related to aging, with aging being one of the most vital risk factors for T2DM (3,4).Adipose tissue (AT) is redistributed during aging, which affects the sensitivity of insulin (5).Furthermore, the normal function of pancreatic beta cells also declines (3), and aging causes inflammation and low nutritional status, affecting the endocrine system (6).Additionally, a series of risk factors for T2DM are vital to other age-related diseases, such as Alzheimer's disease (AD), cardiovascular disease (CVD), and cancer (7)(8)(9)(10).
During the aging process, oxidative stress accumulates, leading to an energy imbalance that is key to T2DM (11,12).For example, oxidative intermediates can damage pancreatic beta cells and exacerbate insulin resistance (13).Moreover, accumulated reactive oxygen species also accelerate aging-related DNA damage and induce cellular senescence (14,15).With increasing age, the free radical dynamic balance in cells is gradually broken, causing an increase in free radical concentration and inducing the oxidation reaction, leading to T2DM (16).In addition, oxidative stress is closely interrelated with inflammation (17) by activating multiple transcription factors in the inflammatory response (18).Furthermore, abnormal oxidative stress dysregulates the balance of energy metabolism during T2DM development (19)(20)(21)(22)(23).In summary, the potential mechanism by which aging-related oxidative stress (often described as "oxidative aging" (24)) triggers T2DM needs to be further studied at the system level (Figure 1A).
With the development of artificial intelligence, many research results on diabetes have utilized machine learning (ML), which can gain useful information from original profiles.ML can be widely used in the risk prediction, prognosis, and treatment of clinical diseases such as cardiovascular disease and cancer (25,26).Recently, it was reported that ML can predict the occurrence of T2DM and its complications, as well as identify key markers in T2DM (27)(28)(29).Additionally, Mendelian randomization (MR) is conducive to integrating biological information (30,31).Although numerous studies have revealed some risk factors/mechanisms associated with T2DM, the underlying mechanism between oxidative aging and T2DM is still unclear and requires further exploration.
To further explore the potential mechanisms between oxidative aging and T2DM, a series of computational studies was performed in this paper (Figures 1B, C): (1) Machine learning was used to identify aging and disease (T2DM) markers.(2) An integrative model was built to further explore essential relationships between oxidative aging (aging-related oxidative stress) and T2DM (Figure 1C).(3) Network analysis, enrichment analysis and sensitivity analysis were used to investigate the underlying mechanisms between oxidative aging and T2DM markers.(4) Relative biological functions of identified oxidative aging markers were further validated across different cancer types.As a result, the underlying mechanisms of T2DM (i.e., nutritional metabolism, inflammatory response, mitochondrial function and protein homeostasis) were integrated, which can also provide key indices in cancers.

Modeling prediction models and identifying relative biomarkers
The gene expression profiles were obtained from the GEO database, including 489 samples and 12,958 genes (Tables S1-S3). These genes were ranked by the ReliefF algorithm, and then the aging predictor and disease predictor were built using the k-nearest neighbors (kNN; k=3 with the correlation distance) algorithm, optimized by 10-fold cross-validation. The test-set accuracies of the aging and disease predictors were 0.70455 and 0.7279 (Figure 1; Table 1), respectively. Furthermore, the areas under the ROC curve (AUC) for the aging and disease predictors were 0.7712 and 0.72788 (Figure 2), respectively. As a result, both the aging and disease predictors were sufficiently accurate.
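The two evaluation metrics reported above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `accuracy` and `roc_auc` are hypothetical, and the AUC is computed through its rank interpretation (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one), which is assumed here to be equivalent to the area reported in the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute the AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a test set of a few hundred samples such as the one described here.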
Both aging and disease markers have meaningful biological functions. For example, OSBPL1A (oxysterol binding protein-like 1A, ReliefF weight=0.058) was the top aging marker. OSBPL1A is one of a set of intracellular lipid receptors and is closely related to lipid metabolism and cholesterol metabolism (32,33). TIGD4 (tigger transposable element derived 4, ReliefF weight=0.0253), as the top disease marker, was related to glycogen metabolism. In sum, the abnormal metabolism of lipids, cholesterol and glycogen can lead to T2DM (34). These results indicated the crucial role of energy metabolism in T2DM.

Identifying the oxidative-aging risk factors by the integrated prediction model
The integrated oxidative aging model was built to explore essential relationships among aging, oxidative and T2DM markers (details are shown in Materials and Methods 5.3, with a total of 11829 "aging-oxidative-disease" triples). The top 10 aging, oxidative and disease markers are shown in Table 2, including the relevant experimental details (35)(36)(37)(38)(39)(40)(41)(42)(43). For example, ADP-ribosylarginine hydrolase (ADPRH) is the top aging marker, participating in the regulation of various cellular processes, including both immunity and aging (44). ADPRH adversely influences the immune system via CD8+ T cells, hence promoting an imbalance in energy metabolism (45). TPST1 (tyrosyl protein sulfotransferase 1) is the top disease marker, catalyzing the posttranslational sulfation of tyrosine residues within acidic motifs of many polypeptides in all multicellular organisms (46). TPST1 promotes the secretion of some cytokines and thereby induces the inflammatory response (47). COX5A (cytochrome C oxidase subunit 5A) is the top oxidative marker and is related to mitochondrial function (48), whose disruption induces an imbalance in energy metabolism and insulin resistance (35). In addition, the accuracy of the predictor built from the selected disease markers was 0.7662 (Table 1). In sum, these results indicated that the integrated oxidative aging model could identify essential relationships in T2DM while retaining adequate predictive ability.

Sensitivity analysis further highlighted the imbalance of energy metabolism in T2DM
The Markov chain Monte Carlo (MCMC) method was used to evaluate the sensitive relationship between oxidative aging and T2DM. As a result, a series of triples were identified as key components (2501 out of 11829) in the integrated oxidative aging model.
The top 10 sensitive relationships (ranked by the absolute differential frequency) are shown in Table 3, where the top relationship was "OSBPL7-COX7C-TM6SF1" (difference=-0.03935). Table 3 also displays experimental details of the relevant oxidative markers (49-55). OSBPL7 (oxysterol binding protein like 7) is an oxysterol-binding protein-like (OSBPL) family member involved in lipid binding and transport and induces cholesterol efflux (56,57). COX7C (cytochrome C oxidase subunit 7C) is an enzyme in the electron transport chain related to cellular respiration and is also a potential biomarker of diabetes mellitus (58,59). Transmembrane 6 superfamily member 1 (TM6SF1) participates in regulating transmembrane transport in macrophages (60). Overall, these results indicated that oxidative stress played an important role in the development of T2DM. The top sensitive aging, disease and oxidative markers (ranked by the number of occurrences, along with the relevant experimental details (61-69)) are shown in Table 4. For example, the top aging marker was HPS1 (Hermansky-Pudlak Syndrome 1 gene), which induces the biogenesis of lysosome-associated cellular organelles (70) and regulates the aging process through sphingolipids (71). The top disease marker was PPP1R15A (protein phosphatase 1 regulatory subunit 15A). PPP1R15A plays an important role in insulin resistance via energy metabolism (72,73). The top oxidative marker was ATOX1 (antioxidant 1 copper chaperone). It has been reported that ATOX1 can regulate the copper level in the cell and maintain the redox balance as a defense antioxidant (74, 75). In short, the sensitivity analysis emphasized the crucial relationship among aging, oxidative stress and T2DM.

Underlying oxidative-aging mechanisms based on enrichment analysis
To further explore the underlying mechanisms between oxidative aging and T2DM, the shortest path between each pair of oxidative aging and disease markers was identified, and then enrichment analysis was performed based on Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and biological process (BP) terms in Gene Ontology (GO). The enrichment results are summarized in Figures 3, S1, as well as Tables 5 (75-88). The top 10 KEGG pathways are shown in Tables 5, S3. The most enriched KEGG pathway was "Parkinson's Disease" (enriched in 1213 shortest paths). It has been reported that Parkinson's disease (PD) and T2DM have common pathological mechanisms (76-78, 111). For example, oxidative stress and mitochondrial dysfunction are involved in both T2DM and PD pathogenesis (77). Strikingly, there are also a series of common biological pathways in T2DM, PD and cancer, such as mitochondrial dysfunction and protein homeostasis (112). Furthermore, the most significant KEGG pathway with the minimum FDR was "Leishmania Infection" (FDR=0.0000629) (Figure 3B), indicating the inflammatory response in the immune system (82,113,114). Notably, the inflammatory response is also often closely related to cancer (113). The classical aging pathway, the "mTOR signaling pathway", was also enriched in the shortest paths (Figure S2), indicating the interrelationship between oxidative aging and T2DM.
The top 10 BP terms are shown in Tables 6, S4. For example, the top enriched BP term was "Regulation of aerobic respiration" (enriched in 29 shortest paths), which was related to energy and mitochondrial function (96). In addition, reactive oxygen species (ROS) are byproducts of aerobic respiration that control various cellular functions (97). The BP term with the minimum FDR was "Regulation of DNA binding" (FDR=0.0000282) (Figure 3A), which is vital to T2DM through the dysregulation of mitochondria and energy metabolism (100). Notably, the accumulation of DNA damage is also a hallmark of cancer (115). Overall, these results identified various aspects of risk factors for T2DM, such as oxidative stress, aging, energy metabolism and immune systems.

Network markers revealed key mechanisms between aging and T2DM
Network markers were identified by calculating the betweenness in the shortest path of each "oxidative-disease" pair, where the top markers are shown in Table 7. For example, the top network marker was SCD (stearoyl-coenzyme A desaturase), which is mainly expressed in adipose tissue and can catalyze the synthesis of monounsaturated fatty acids (116). In addition, SCD can affect lipid metabolism and mediate steroidogenesis, playing an important role in insulin resistance (117,118). Furthermore, SCD participates in mediating the inflammatory reaction, which promotes the progression of cancer (119). Moreover, there were also a series of shortest paths through SIRT1 (Figure S3, where the permutation p-value=0.002 and 0 before and after sensitivity analysis, respectively), which is a classical aging marker. Thus, network markers indicate the crucial role of oxidative stress dysfunction, along with energy metabolism, in T2DM.
Figure caption: Enrichment analysis of the shortest paths for KEGG and BP. (A) The top 10 enriched BP terms with the minimum FDR. (B) The top 10 enriched KEGG pathways with the minimum FDR.
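The idea of ranking network markers by how often they lie on "oxidative-disease" shortest paths can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it assumes an unweighted, undirected network stored as an adjacency dictionary, uses only one shortest path per pair (found by breadth-first search) rather than all of them, and the names `shortest_path` and `path_betweenness` are hypothetical.

```python
from collections import Counter, deque


def shortest_path(graph, source, target):
    """Return one shortest path from source to target via BFS, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph.get(node, ()):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def path_betweenness(graph, sources, targets):
    """Count how often each gene is an interior node of a source-target path."""
    counts = Counter()
    for s in sources:
        for t in targets:
            if s == t:
                continue
            path = shortest_path(graph, s, t)
            if path:
                counts.update(path[1:-1])  # exclude the two endpoints
    return counts
```

With oxidative markers passed as `sources` and disease markers as `targets`, `counts.most_common(10)` would list the ten genes that most often bridge the two marker sets in this simplified setting.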

Pan-cancer analysis further verified the mechanism of oxidative aging in T2DM
Pan-cancer analysis was used to further verify the relative functions of T2DM oxidative aging markers in cancer.For example, oxidative aging markers in the integrated model were used to evaluate the survival index across different cancer types.There were 9 out of 15 cancer types with significant results (including COAD, ESCA, KIRC, LIHC, LUAD, LUSC, PRAD, THCA and UCEC, shown in Figure 4).These results suggest that oxidative aging markers can also be used as relative risk factors in cancer.
Additionally, both the commonality and specificity across 15 cancer types were investigated based on enrichment analysis. The top 10 common KEGG pathways are shown in Figures 5, S4, where "Alzheimer's Disease" was the top KEGG pathway. Alzheimer's disease (AD) and cancer share common risk factors. For example, aging is one of the greatest risk factors for the development of Alzheimer's disease, and the risk of cancer also increases with increasing age (120). In addition, some cancer patients may have a higher risk of Alzheimer's disease (121). Figures 6, S2 show the top 10 common BP terms in 15 cancers. "Regulation of cellular respiration" was the top BP term, indicating the key role of energy metabolism in cancer (122). Cellular respiration participates in energy metabolism and is also a hallmark of many cancers (123). The specific enrichment results within each cancer are also summarized in Tables 8, 9, S6, S7 (112, ...), indicating a series of oxidative aging-related risk factors in cancer, such as the inflammatory response, energy metabolism and mitochondrial function. Overall, our results highlighted a series of crucial functions related to oxidative aging, which can also be used to study potential mechanisms in cancer.
Figure caption: The results of survival analysis across different cancer types.

Discussion
It is well known that aging-related oxidative stress plays a crucial role in T2DM (3). However, the essential relationship among aging, oxidative stress and T2DM still needs to be explored in more depth. In this paper, a series of computational methods were used to explore these relationships in T2DM as well as the relevant mechanisms. First, both the aging model and disease model were optimized, and the corresponding aging markers and disease markers were identified. Next, the integrated oxidative aging model was built to identify essential "aging-oxidative-disease" relationships. Finally, network analysis, enrichment analysis, sensitivity analysis and pan-cancer analysis were used to further explore the potential mechanisms between oxidative aging and T2DM. As a result, various risk factors in T2DM were integrated.
Our results highlighted that energy metabolism was vital to the development of T2DM. For example, the integrated oxidative aging model identified a series of key markers in T2DM that were closely related to energy metabolism. OSBPL1A and TIGD4 participate in nutritional metabolism; the former is mainly involved in lipid metabolism and cholesterol metabolism, and the latter is mainly related to glycogen metabolism (32)(33)(34). ADPRH and PPP1R15A can lead to energy metabolism imbalance (35,63). COX5A can affect mitochondrial function, and ATOX1 is the redox catalyst, both of which can affect energy metabolism through mitochondrial dysfunction (39, 65). Furthermore, as the top network marker, SCD is mainly expressed in adipose tissue and can catalyze the synthesis of monounsaturated fatty acids (116). It can affect lipid metabolism and mediate steroidogenesis, which plays an important role in insulin resistance (117,118). SIRT1 was also identified by calculating the betweenness. In the MCMC analysis, the triple with the greatest absolute difference was "OSBPL7-COX7C-TM6SF1", where OSBPL7 participates in lipid binding and transport (49, 50) and COX7C is related to cellular respiration as a potential biomarker of diabetes (51,52). The classical energy metabolism pathway, the "mTOR signaling pathway", was also identified using the enrichment analysis, indicating the key interaction between oxidative aging and T2DM.
Protein homeostasis is also involved in the progression of T2DM. For instance, amyloid precursor protein (APP) is an oxidative marker identified by MCMC that promotes the secretion of amyloid proteins (164). SPI1 (Spi-1 Proto-Oncogene) was involved in the negative regulation of protein, which caused restraint of aerobic glycolysis (165) (Figure 3). In summary, both APP and SPI1 are related to protein homeostasis and even accelerate the development of both T2DM and neurodegenerative diseases (NDs). That is, protein homeostasis is a common mechanism in both T2DM and ND (166,167). The inflammatory response also plays an important role in the development of T2DM. For example, the aging marker HPS1 affects the biogenesis of lysosome-associated cellular organelles and even participates in regulating cellular inflammation (61, 62). The disease marker TPST1 induces the secretion of some cytokines, along with the inflammatory response (37,38). TM6SF1, as one of the key markers identified by MCMC, was involved in transmembrane transport in macrophages, thus highlighting the key role of the immune system in T2DM (53).
Figure caption: The enrichment analysis shared by cancers.
Furthermore, a series of experiments and relevant clinical statistical results have also revealed significant relationships between the identified oxidative aging markers and T2DM. For example, it has been reported that in vitro oxidative stress in mammalian skeletal muscle leads to substantial insulin resistance in distal insulin signaling and glucose transport activity (p=9.2e-05) (168). Chronic oxidative stress can also lead to decreased responsiveness to insulin, ultimately leading to diabetes, as reported by Alina Berdichevsky et al. (p=0.01) (169). Besides, NFKBIA affects the ... T2DM is associated with an increased risk of developing cancers, such as COAD, PRAD, and THCA (30). It is well known that T2DM and cancer have common risk factors, such as oxidative stress, energy metabolism, inflammation and protein homeostasis (22,23,173). Our results also indicated that inflammation and energy metabolism were common risk factors in cancers, and survival analysis further supported the key role of oxidative aging markers across different cancer types. Oxidative stress may lead to chronic inflammation, which in turn can induce most chronic diseases, including both cancer and T2DM. In addition, oxidative stress can damage the normal function of mitochondria as well as energy metabolism, which plays an important role in the development of T2DM and cancer. In short, various risk factors related to oxidative aging were also confirmed in cancer.
According to the oxi-inflamm-aging theory, the aging process is regulated by chronic oxidative stress, as well as the inflammatory response (174). It is well known that dysregulated oxidative stress triggers a series of signaling pathways, thus leading to pancreatic beta cell damage (175). In addition, the cellular senescence theory also highlights cellular inflammation and the oxidative stress response during the aging process (176,177). That is, cellular senescence may also play an important role in the pathogenesis of T2DM (i.e., through the mTOR signaling pathway) (177, 178). Furthermore, these risk factors even interact with each other and then promote T2DM. For example, the imbalance of energy metabolism could interact with a series of pathways, such as lipid accumulation, chronic inflammation and insulin resistance, triggering T2DM progression (179). It has been reported that normal homeostasis in the insulin-driven immunometabolic network is vital to the preservation of insulin sensitivity in healthy aging (180). Here, our work also highlighted the interaction between the immune system and energy metabolism in the development of T2DM (Figure 3; Tables 5, 6), which is also crucial in cancer (Figures 4, 5). With the help of the integrated oxidative aging model, our study revealed that oxidative stress was interrelated with various aging-related risk factors in T2DM (Tables 2-6), such as the inflammatory response, mitochondrial function and protein homeostasis. These results further supported both the oxi-inflamm-aging and cellular senescence theories. Overall, potential aging-related mechanisms in T2DM were integrated in the context of oxidative stress (Figure 6).

Conclusion
In this study, machine learning was performed to predict aging and T2DM, and then relative biomarkers were identified.An integrated oxidative aging model was built to explore the essential relationship between oxidative aging and T2DM.The key roles of nutritional metabolism, the inflammatory response, mitochondrial function and protein homeostasis in T2DM were highlighted in our work with the help of sensitivity analysis, enrichment analysis, network analysis and pan-cancer analysis.In conclusion, various risk factors were integrated in the development of T2DM as well as cancer based on oxidative aging.
The gene expression profiles were processed as follows: (1) Only the samples with both the age and phenotype index (i.e., type 2 diabetes versus control) were retained; otherwise, they were deleted.
(2) The gene expression matrix for each dataset was integrated by summarizing the probe number within the gene symbol.
(3) The total data matrix was integrated, and the missing gene expression values were filled with values of 0.
(5) The gene expression matrix was transformed by logarithmic transformation if it contained outliers.
(6) Based on the mean and the standard deviation of gene expression for control individuals, the z-score normalization was performed for both T2DM and control samples.
(7) The singular value decomposition (SVD) method was performed to eliminate the intersample variation based on the top three principal components of the control samples.
(8) The z score was then utilized to normalize all samples based on the mean and the standard deviation of the control samples.
(9) The training set and the test set were randomly divided according to a ratio of approximately 2:1.
As a result, a total of 489 samples were obtained, including 208 samples of healthy aged people (age > 50 years old, 145 S1-S3).
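The control-based z-score normalization in steps (6) and (8) can be sketched for a single gene as follows. This is a minimal illustration under stated assumptions: the function name `zscore_by_control` is hypothetical, the sample standard deviation (n-1 denominator) is assumed because the text does not say which estimator was used, and the SVD-based correction of step (7) is not reproduced.

```python
from statistics import mean, stdev


def zscore_by_control(values, is_control):
    """Z-score one gene across all samples using the control mean and SD.

    values: expression of one gene, one entry per sample.
    is_control: booleans of the same length, True for control samples.
    """
    control = [v for v, c in zip(values, is_control) if c]
    mu = mean(control)
    sd = stdev(control)  # sample SD (n - 1); an assumption of this sketch
    if sd == 0:
        raise ValueError("control samples have zero variance for this gene")
    return [(v - mu) / sd for v in values]
```

Because the mean and standard deviation come only from control samples, a T2DM sample's z-score reads directly as its deviation from the control distribution.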

Modeling the aging model and disease model
After random shuffling, the healthy population samples were divided into a training dataset and a test dataset. The ratio of training dataset samples to test dataset samples was close to 2:1. The ReliefF algorithm was used to select key features, and then the first 500 models were trained as candidate predictors. The optimal model was selected by 10-fold cross-validation. To verify the accuracy of the aging predictor, the selected model was evaluated on the test dataset.
(1) In the aging model, the normal aged group (age>50) was labeled 1, and the young healthy group (age ≤ 50) was labeled 0; in the disease model, the T2DM group was labeled 1, and the control group (age ≤ 50) was labeled 0.
(2) The 12958 genes were sorted by the ReliefF algorithm.
(3) The predictor was generated using the k-nearest neighbor (kNN, k=3, correlation distance) algorithm. The optimal model was selected by 10-fold cross-validation, where the model with the highest accuracy rate was chosen.
(4) The identified features were considered aging and disease markers.As a result, 304 aging markers and 299 disease markers were identified.
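The kNN classifier with the correlation distance (k=3) used in step (3) can be sketched in plain Python. This is an illustration rather than the authors' code: the function names are hypothetical, the correlation distance is taken to be one minus the Pearson correlation between two expression vectors, and the ReliefF ranking and 10-fold cross-validation are not reproduced.

```python
from collections import Counter
from math import sqrt


def correlation_distance(u, v):
    """Return 1 - Pearson correlation between two expression vectors."""
    n = len(u)
    mu_u, mu_v = sum(u) / n, sum(v) / n
    du = [x - mu_u for x in u]
    dv = [x - mu_v for x in v]
    norm = sqrt(sum(x * x for x in du)) * sqrt(sum(x * x for x in dv))
    if norm == 0:
        return 1.0  # constant vector: correlation undefined, treated as zero
    return 1.0 - sum(a * b for a, b in zip(du, dv)) / norm


def knn_predict(train_x, train_y, sample, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: correlation_distance(pair[0], sample),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

With two classes and k=3 the vote can never tie, which is one practical reason to prefer an odd k in a binary setting such as aged versus young or T2DM versus control.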

Identifying essential relationships in T2DM by an integrated oxidative aging model
The integrated oxidative aging model was built to identify the essential relationship among aging, oxidative stress and T2DM. The computational pipeline was modeled on Mendelian randomization (MR), although it was not as strict as MR (Figure 1C).
In this model, the aging-related oxidative stress markers were considered oxidative aging markers, where the relative aging/ disease markers were identified in "Methods 5.2".As a result, the essential relationships among aging, oxidative stress and disease (T2DM) markers were identified as key "aging-oxidative-disease" triples in T2DM.
MR is a statistical method for assessing the causal relationship between risk factors and outcomes based on observational data (181,182).The causal relationships between the instrumental variables, risk factors, and outcome variables were assessed as follows.
(1) There was a correlation between the instrumental variable and the risk factor.
(2) There was no correlation between the instrumental variable and the confounding factor.
(3) There was no correlation between the instrumental variable and the outcome variable after deleting the effect from the risk factor.
Here, the aging marker was used as the auxiliary variable (similar to the instrumental variable in MR), and the oxidative stress markers were used as the candidate risk factor. Then, aging-related oxidative ("oxidative aging") markers were identified as the risk factor, and disease markers were used as the outcome variable. That is, the integrated oxidative aging model aimed to explore essential relationships among aging, oxidative stress and disease markers in T2DM. This model was performed as follows:
(1) Oxidative markers were obtained as candidate risk factors based on Biological Processes (BP) of Gene Ontology (GO) through the Gene Set Enrichment Analysis (GSEA) platform (http://www.gsea-msigdb.org/gsea/downloads.jsp, "OXIDATIVE" was taken as the keyword). As a result, 310 candidate oxidative markers were selected.
(2) The correlation (differential coexpression) pattern was used to select aging markers that strongly correlated with candidate oxidative stress markers with the help of the Kruskal-Wallis test. Here, the differential coexpression was calculated as follows:
$$p = \text{Kruskal-Wallis test}\left(\text{aging\_marker} \mathbin{.*} \text{oxidative\_marker},\ \text{phenotype}\right) \tag{1}$$
where the phenotype could be defined as 1 (T2DM) and 0 (control).
Furthermore, both a p-value<0.05 and Benjamini-Hochberg false discovery rate (FDR)<0.1 were used to select strongly correlated aging markers.
(3) To reduce the correlation between the auxiliary variable (aging marker) and confounding factors, and to further select strong correlations between the aging marker and the candidate oxidative marker, a permutation test was performed by generating simulated aging markers from the same number of randomly selected markers for each candidate oxidative marker. This process was repeated 1000 times, and the p-value was calculated as the proportion of the 1000 permutations in which the absolute difference between T2DM and control was larger than the real mean difference. The relationship between each aging marker and the candidate oxidative marker was retained if the permutation P<0.05.
(4) Correlation (differential coexpression) was used to select oxidative markers that strongly correlated with disease markers with the help of the Kruskal-Wallis test. Here, the differential coexpression was calculated as follows:
$$p = \text{Kruskal-Wallis test}\left(\text{oxidative\_marker} \mathbin{.*} \text{disease\_marker},\ \text{phenotype}\right) \tag{2}$$
where the phenotype could be defined as 1 (T2DM) and 0 (control).
Furthermore, both a p-value<0.05 and Benjamini-Hochberg false discovery rate (FDR)<0.1 were used to select strongly correlated oxidative markers.
(5) To reduce the correlation between the risk factor (oxidative marker) and confounding factors, and to further select strong correlations between the oxidative marker and the disease marker, a permutation test was performed by generating simulated oxidative markers from the same number of randomly selected markers for each disease marker. This process was repeated 1000 times, and the p-value was calculated as the proportion of the 1000 permutations in which the absolute difference between T2DM and control was larger than the real mean difference. The relationship between each oxidative marker and the disease marker was retained if the permutation P<0.05.
(6) Direct relationships through any other factors (genes) were identified to reduce the correlation between the auxiliary variable (aging marker) and confounding factors. If there was another factor (gene) that was directly correlated (differentially coexpressed) with both the aging marker and the disease marker, then the relationship from aging to disease was deleted. As in the previous steps, the phenotype was defined as 1 (T2DM) and 0 (control). Furthermore, both a p-value<0.05 and Benjamini-Hochberg false discovery rate (FDR)<0.1 were used to filter out any direct relationships.
(7) To filter out the effect of horizontal pleiotropy, the aging-disease relationship was further examined by comparing the correlation between each aging and disease marker, through the oxidative marker or otherwise. Herein, steps ①-③ were used to calculate the correlations between auxiliary variables and outcome variables without the background of the risk factor, and step ④ was used to calculate the correlations between auxiliary variables and outcome variables within the context of the risk factor.
① The residual of each disease marker ("residual A") was calculated based on the oxidative marker: where $b_2$ is the regression coefficient.
③ The abovementioned two residuals were further compared, and the residual of the disease marker was calculated (as "residual C"): where $b_3$ is the regression coefficient.
④ The residual of the disease marker ("residual D") was calculated based on the aging marker.
⑤ The difference (between "residual C" and "residual D") was tested between the T2DM and control subgroups using the Kruskal-Wallis test (P<0.05 and FDR<0.1).
Finally, the essential relationship among the aging marker, oxidative marker and disease marker was retained.Thus, 11829 "aging-oxidative-disease" triples were identified, including 105 aging markers, 83 oxidative markers and 282 disease markers.Thus, these 83 oxidative markers were used as oxidative aging markers (risk factors), and 282 disease markers were also used to discriminate the T2DM phenotype.
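The differential coexpression test of Equations (1) and (2), followed by Benjamini-Hochberg adjustment, can be sketched as below. Several points are assumptions of this sketch rather than statements about the authors' code: the ".*" operator is read as the element-wise product of the two markers' expression vectors, the Kruskal-Wallis test is implemented only for the two-group case (T2DM = 1, control = 0) using the chi-square approximation with one degree of freedom, and all function names are hypothetical.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_two_groups(values, phenotype):
    """Tie-corrected Kruskal-Wallis p-value for phenotype 1 versus 0."""
    n = len(values)
    ranks = average_ranks(values)
    h = 0.0
    for label in (0, 1):
        group = [r for r, p in zip(ranks, phenotype) if p == label]
        if not group:
            raise ValueError("both phenotype groups must be non-empty")
        h += sum(group) ** 2 / len(group)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    tie_counts = {}
    for v in values:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in tie_counts.values()) / (n ** 3 - n)
    if correction == 0:
        return 1.0  # all values identical: no evidence of a difference
    h = max(h / correction, 0.0)
    return erfc(sqrt(h / 2))  # chi-square survival function, 1 d.o.f.


def differential_coexpression_p(marker_a, marker_b, phenotype):
    """p-value of the element-wise product of two markers across phenotypes."""
    product = [a * b for a, b in zip(marker_a, marker_b)]
    return kruskal_two_groups(product, phenotype)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for offset, i in enumerate(order):
        rank = m - offset  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under these assumptions, a marker pair would be retained when its raw p-value is below 0.05 and its adjusted value from `benjamini_hochberg` is below 0.1, mirroring the two thresholds stated in the text.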

Sensitivity analysis using the MCMC method
To further explore the relationship among aging, oxidative stress and T2DM, sensitivity analysis was performed based on the Markov chain Monte Carlo (MCMC) method, where "aging-oxidative-disease" triples identified by the MR-like integrated model were further evaluated as candidate relationships. The MCMC method is used to sample certain posterior distributions in a high-dimensional space based on a given probabilistic background. The key step of MCMC is to construct a Markov chain whose equilibrium distribution is equal to the target probability distribution. The steps were as follows:
(1) Construct the transition kernel of the ergodic Markov chain. The prior distribution of each parameter was assumed to be normal, based on all identified markers in each group (i.e., T2DM and control) separately.
(2) Simulate the chains until equilibrium is reached. The Metropolis−Hastings sampling method was used to determine whether the new sample ($\theta^{*}$) was acceptable based on the $\alpha$ value:

$$\alpha = \frac{P(\theta^{*} \mid X)\, q(\theta^{*} \to \theta_{n})}{P(\theta_{n} \mid X)\, q(\theta_{n} \to \theta^{*})}$$

where $P(\theta_{n} \mid X)$ and $P(\theta^{*} \mid X)$ are the posterior probabilities of the nth accepted sample and the new sample, respectively, $q(\theta_{n} \to \theta^{*})$ is the transition probability from the nth accepted sample to the new sample, and $q(\theta^{*} \to \theta_{n})$ is the transition probability from the new sample to the nth accepted sample.
In this work, the disease score was used to evaluate the simulated samples, with 1000 random samples used as candidate samples for each group (i.e., T2DM or control). The disease score was calculated by comparing the distance between normal and T2DM training samples based on the 282 disease markers identified by the integrated oxidative aging model.
(3) Perform the global sensitivity analysis. The correlation index was used to evaluate each "aging-oxidative-disease" triple in the accepted samples (including both T2DM and control):

$$\text{correlation\_index} = \frac{\text{disease\_marker} - \text{aging\_marker}}{\text{oxidative\_marker} - \text{aging\_marker}} \tag{10}$$

As a result, the correlation index was calculated for each "aging-oxidative-disease" triple in all accepted samples. Then, the Kruskal-Wallis test was used to compare the correlation index of each "aging-oxidative-disease" triple between groups, where p-value < 0.05 and FDR < 0.1 were set as the thresholds. Finally, 2501 "aging-oxidative-disease" triples were identified as sensitive relationships, including 41 aging markers, 37 oxidative markers and 61 disease markers.
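The sampling and scoring steps above can be sketched as follows. This is a minimal one-dimensional illustration, not the study's implementation. It assumes a symmetric Gaussian random-walk proposal, so the transition probabilities cancel in the acceptance ratio, and a standard normal target. The function names, step size and seed are hypothetical.

```python
import math
import random


def metropolis_hastings(log_posterior, theta0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings; returns the chain state at each step."""
    rng = random.Random(seed)
    theta = theta0
    log_p = log_posterior(theta)
    chain = []
    for _ in range(n_steps):
        candidate = theta + rng.gauss(0.0, step)
        log_p_new = log_posterior(candidate)
        # Assumption: symmetric proposal, so alpha = P(new | X) / P(current | X).
        log_alpha = log_p_new - log_p
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            theta, log_p = candidate, log_p_new
        chain.append(theta)
    return chain


def correlation_index(disease_marker, aging_marker, oxidative_marker):
    """Correlation index of one triple; undefined when the denominator is zero."""
    return (disease_marker - aging_marker) / (oxidative_marker - aging_marker)


if __name__ == "__main__":
    # Toy target: standard normal log-density (up to an additive constant).
    samples = metropolis_hastings(lambda t: -0.5 * t * t, 0.0, 20000)
    print(sum(samples) / len(samples))
    print(correlation_index(3.0, 1.0, 2.0))
```

Working with log-posteriors avoids numerical underflow when many markers contribute to the likelihood. In practice, the early part of the chain would usually be discarded as burn-in before any downstream test.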

Constructing the differential coexpression network
To further reveal the relationship between "oxidative aging" and T2DM, a differential coexpression network was constructed by the following steps:
(1) The Pearson correlation coefficient for each pair of genes was calculated separately in the T2DM and control groups.
(2) The Benjamini−Hochberg FDR method was used to adjust the p-values of the correlation coefficient.
(3) The relationship between each gene pair was retained if the coefficient value in T2DM had the opposite sign (i.e., + or −) to that in control, with p < 0.05 and FDR < 0.1.
(4) The shortest path between each pair of oxidative aging and disease markers was selected based on the differential coexpression network using the Dijkstra algorithm.
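The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the significance filter of steps (2) and (3) is omitted, every retained edge is given unit weight (the text does not state the edge weights), and the function names are hypothetical.

```python
import heapq
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def differential_edges(expr_case, expr_ctrl):
    """Keep gene pairs whose correlation sign differs between the two groups.

    Assumption: the p-value and FDR filter is applied elsewhere.
    """
    graph = {gene: set() for gene in expr_case}
    for g1, g2 in combinations(sorted(expr_case), 2):
        r_case = pearson(expr_case[g1], expr_case[g2])
        r_ctrl = pearson(expr_ctrl[g1], expr_ctrl[g2])
        if r_case * r_ctrl < 0:
            graph[g1].add(g2)
            graph[g2].add(g1)
    return graph


def dijkstra_path(graph, source, target):
    """Shortest path with unit edge weights; None if target is unreachable."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for nb in graph[node]:
            nd = d + 1
            if nd < dist.get(nb, math.inf):
                dist[nb] = nd
                prev[nb] = node
                heapq.heappush(heap, (nd, nb))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]
```

With unit weights, Dijkstra's algorithm reduces to a breadth-first search. The priority queue is kept so that weighted edges (for example, a function of the correlation difference) could be substituted without changing the structure.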

Enrichment analysis
The gene functions were further explored by enrichment analysis of the shortest pathways. Gene Ontology (GO) terms and KEGG pathways were obtained from the gene set enrichment analysis (GSEA) platform (http://software.broadinstitute.org/gsea/downloads.jsp, version 7.5). The hypergeometric distribution was used to test the degree of enrichment of the GO biological process (BP) terms and KEGG pathways. In the hypergeometric test, N is the total number of genes in the gene set, M is the number of known genes (such as those in a KEGG pathway or BP term), n is the number of genes identified in each shortest pathway, and k is the number of common genes between the known genes and the candidate genes identified in each "oxidative-disease" shortest pathway. The p-value of each pathway was adjusted using the Benjamini−Hochberg method. Finally, pathways with p < 0.05 and FDR < 0.1 were retained.
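As an illustration of the enrichment test, the sketch below computes the upper-tail hypergeometric probability of observing at least k common genes. The upper-tail form is an assumption, since the exact formula is not reproduced in this text. The function name is hypothetical, and n denotes the number of candidate genes in a shortest pathway.

```python
from math import comb


def hypergeom_enrichment_p(N, M, n, k):
    """P(X >= k) for X ~ Hypergeometric(N total, M known genes, n drawn)."""
    upper = min(M, n)
    total = comb(N, n)
    # Sum the probability mass from k up to the largest possible overlap.
    tail = sum(comb(M, i) * comb(N - M, n - i) for i in range(k, upper + 1))
    return tail / total


if __name__ == "__main__":
    # 10 genes in total, 4 in the pathway, 3 candidates, 2 of them in the pathway.
    print(hypergeom_enrichment_p(10, 4, 3, 2))
```

The resulting p-values would then be adjusted across all tested pathways before applying the p < 0.05 and FDR < 0.1 cutoffs.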

Identifying network markers
The subnetwork formed by the shortest pathways among the selected "oxidative-disease" pairs was constructed, and genes in the subnetwork were sorted by betweenness in descending order. To test whether the top-betweenness genes were simply hubs in the background network, we ran a permutation test: shortest paths were computed between randomly selected genes (with the same number of pairs as the "oxidative-disease" pairs derived from the identified "aging-oxidative-disease" triples), and we counted how often the top genes had greater betweenness in these random paths than in our study. We repeated this process 1000 times, and the p-value was calculated as the proportion of the 1000 permutations in which this occurred.
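The permutation procedure can be sketched as below. This is a simplified illustration, not the study's code. It approximates betweenness by the number of pairs whose shortest path passes through a gene, uses one breadth-first shortest path per pair on an unweighted graph, and counts a permutation when the random count is at least the observed one. All function names are hypothetical.

```python
import random
from collections import deque


def bfs_path(graph, source, target):
    """One shortest path in an unweighted graph, or None if unreachable."""
    prev = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nb in sorted(graph[node]):
            if nb not in prev:
                prev[nb] = node
                queue.append(nb)
    return None


def path_count(graph, pairs, gene):
    """Number of pairs whose shortest path uses `gene` as an intermediate node."""
    count = 0
    for source, target in pairs:
        path = bfs_path(graph, source, target)
        if path is not None and gene in path[1:-1]:
            count += 1
    return count


def permutation_p(graph, pairs, gene, n_perm=1000, seed=0):
    """Proportion of random pair sets in which `gene` scores at least as high."""
    rng = random.Random(seed)
    observed = path_count(graph, pairs, gene)
    nodes = sorted(graph)
    hits = 0
    for _ in range(n_perm):
        # Draw the same number of random gene pairs as in the observed set.
        random_pairs = [tuple(rng.sample(nodes, 2)) for _ in pairs]
        if path_count(graph, random_pairs, gene) >= observed:
            hits += 1
    return hits / n_perm
```

A small p-value here suggests that the gene lies on the observed paths more often than its position in the background network alone would explain.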

Pan-cancer analysis
The survival analysis was performed based on the oxidative aging markers (identified by the integrated oxidative aging model in 5.3) for each cancer using the Kaplan−Meier method. The tumor samples of each cancer were divided into two groups based on the mean value of the oxidative aging markers. Then, the Kaplan−Meier method was used to evaluate the survival difference between these two groups, and the significance was estimated by the log-rank test. A p-value < 0.05 was considered statistically significant.
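The survival comparison can be sketched with a plain implementation of the Kaplan−Meier estimator and the two-group log-rank test. This is a minimal illustration on hypothetical inputs (event flag 1 = death, 0 = censored), not the study's code. The p-value uses the chi-square distribution with one degree of freedom.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time."""
    survival, s = [], 1.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        s *= 1.0 - deaths / at_risk
        survival.append((t, s))
    return survival


def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group log-rank test p-value (chi-square, 1 degree of freedom)."""
    times = list(times_a) + list(times_b)
    events = list(events_a) + list(events_b)
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        n, d = n_a + n_b, d_a + d_b
        # Observed minus expected deaths in group A at this event time.
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / var
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
    print(logrank_p([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5))
```

In the setting described above, the two groups would be the tumor samples above and below the mean value of an oxidative aging marker.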
Then, the differential expression networks were constructed for each cancer, following the same procedure as in 5.5. As a result, the shortest pathway was selected for each pair of oxidative aging markers and differentially expressed genes (as disease markers in cancer) using the Dijkstra algorithm. Furthermore, enrichment analysis was performed on the "oxidative-disease" shortest pathways for each cancer type, where both p < 0.05 and FDR < 0.1 were used.

FIGURE 1 (A) Diagram of the hypothetical mechanism. (B) The workflow of our study. (C) The pipeline of the integrated oxidative aging model.
Machine learning results.(A, B) Aging predictor from our previous study, selecting the number of aging markers.(C, D) The improved inflamm-aging predictor, selecting the number of disease markers.(A, C) Learning curve for the training dataset.(B, D) The ROC curve for the test dataset.

FIGURE 6 Summarized mechanisms of oxidative aging in T2DM. Rectangle genes represent aging markers, oval genes represent disease markers, rhombus genes represent oxidative markers, and hexagon genes represent network markers with high numbers. Orange arrows indicate genes involved in nutritional metabolism, yellow arrows indicate genes involved in inflammation response, red arrows indicate genes associated with mitochondrial function, and green arrows indicate genes associated with protein homeostasis.

TABLE 1
The accuracy of aging predictor and disease predictor.

TABLE 2
The top 10 aging markers, disease markers and oxidative markers from the integrated oxidative model.

TABLE 3
The top 10 pairs with the greatest absolute difference frequency.

TABLE 4
The top 10 aging markers paired with the most oxidative markers after sensitivity analysis.

TABLE 5
The top 10 enriched KEGG pathways.

TABLE 7
The top 10 genes with the highest number before and after sensitivity analysis.

TABLE 8
KEGG pathways in each cancer with the minimum FDR.
(1) Age is the risk factor for the development of Alzheimer's disease and cancer. (2) Some cancer patients may have a higher risk of Alzheimer's disease.

TABLE 9
BP terms in each cancer with the minimum FDR.
In short, our results also presented key clinical indices with the help of the integrated oxidative model.