- 1Department of Urology, Affiliated Hospital 2 of Nantong University, Nantong, Jiangsu, China
- 2Jiangsu Nantong Urological Clinical Medical Center, Nantong, Jiangsu, China
- 3Department of Emergency Medicine, the Affiliated Suqian Hospital of Xuzhou Medical University, Suqian, China
- 4Department of Nephrology, Sir Run Run Hospital, Nanjing Medical University, Nanjing, China
- 5Department of Nephrology, Affiliated Hospital 2 of Nantong University, Nantong, Jiangsu, China
Background: Major depressive disorder (MDD) and uremia are two chronic wasting diseases that have interactive effects and significantly aggravate patients’ distress. However, the molecular basis linking these diseases remains poorly investigated.
Methods: Transcriptome data from MDD and uremia patients in the Gene Expression Omnibus (GEO) database were analyzed with multiple machine learning algorithms to develop and validate our model. After removing batch effects, differentially expressed genes (DEGs) were identified between each disease group and the control group. Functional enrichment analysis was then performed on the DEGs shared by the two diseases. In addition, immune infiltration was quantified using single-sample gene set enrichment analysis (ssGSEA). The optimal diagnostic model of uremia was constructed by training and validating multiple combinations of 12 machine learning algorithms. Finally, potential drugs for uremia were identified using the “Enrichr” platform.
Results: Seven key genes shared by MDD and uremia were identified, and enrichment analysis showed that they were mainly involved in immune processes. Immune infiltration analysis showed that MDD and uremia each had immune cell infiltration profiles that differed from those of healthy controls. A diagnostic model based on these seven genes (IL7R, CD3D, RETN, RAB13, TNNT1, HP, and S100A12) was constructed and performed better than published uremia diagnostic models. In addition, decitabine and nine other agents were identified as potential agents for the treatment of uremia.
Conclusion: Our study combined bioinformatics techniques and machine learning methods to develop a diagnostic model for uremia, focusing on common genes between MDD and uremia.
1 Introduction
Major depressive disorder (MDD) is a prevalent psychiatric disorder with a significant global impact, causing substantial disability and affecting everyday functioning (1). Its clinical symptoms include persistent depressed mood, anhedonia, fatigue, feelings of worthlessness, and impaired cognitive performance (2). Major depression is estimated to have a lifetime prevalence of up to 19% (3), placing a significant burden on society (4). Treatment remains challenging in as many as half of cases (5). Previous studies have reported a significant association between uremia and MDD. For example, Heng-Jung Hsu et al. showed that the incidence of depressive disorders was significantly higher in uremia patients (6). Because depression can seriously affect patients’ lives, and can even lead them to give up on life, exploring the association between uremia and depression is urgent.
Uremia is the final stage of chronic renal failure. It is clinically characterized by disturbances of water, electrolyte, and acid-base balance and by increased blood levels of metabolites (e.g., creatinine and urea) (7). The uremic phase is often accompanied by secondary conditions and complications of chronic kidney disease (CKD), including renal, circulatory, endocrine, and metabolic disorders, as well as neuromuscular dysfunction and cognitive impairment (8, 9). Among them, MDD is a relatively common complication of uremia. Uremia is a chronic wasting disease that usually requires hemodialysis, and since the introduction of dialysis, the mental health of hemodialysis patients has been a focus of research (10, 11). Kimmel et al. (12) demonstrated that persistent depression is a risk factor for death in hemodialysis patients. Therefore, it is crucial to construct a diagnostic model of uremia associated with major depression so that the disease can be controlled at an early stage and mortality reduced. However, the diagnosis of uremia currently depends mainly on serum creatinine and glomerular filtration rate, which leaves the available diagnostic tools limited. In addition, although many genetic markers have been investigated, such as CNOT8, MST4, PPP2CB, PCSK7, and RBBP4, none has demonstrated sufficient specificity and sensitivity for clinical application (13). Other studies have shown a bidirectional relationship between depression and physical diseases such as chronic kidney disease (14). It is therefore particularly important to use the relationship between depression and chronic kidney disease to construct a better diagnostic model that can be applied in clinical practice for the early detection of uremia.
Bioinformatics and machine learning techniques have evolved significantly over the last decade, making it possible to investigate potential biomarkers and therapeutics for diseases (15–18). In this study, we used multiple integrated bioinformatics tools to reveal hub genes and underlying mechanisms linking uremia and MDD by analyzing three uremia datasets and three MDD datasets selected from the Gene Expression Omnibus (GEO) database. We also explored immune cell infiltration in uremia and MDD. In addition, 113 combinations of machine learning algorithms were used to construct a uremia diagnostic model.
2 Methods
2.1 Data collection
Appropriate datasets were selected from the GEO database. First, transcriptome datasets for major depression and for uremia or end-stage renal disease were searched. Then, because multiple datasets were to be combined, datasets containing more than six samples were retained wherever possible. Finally, we confirmed that each included dataset was suitable for machine learning methods. Following these steps, six datasets were obtained from the National Center for Biotechnology Information (NCBI) GEO (https://www.ncbi.nlm.nih.gov/geo/): GSE37171, GSE38750, GSE43484, GSE52790, GSE76826, and GSE98793 (9, 19–23). These datasets are described in detail in Table 1, including the microarray platform, panel, and number of samples.
2.2 Removal of batch effect
Before analysis, we merged the three MDD datasets listed in Table 1 (GSE52790, GSE76826, and GSE98793). We then corrected batch effects using the “ComBat” function in the “sva” package (version 3.52.0) (24) and used principal component analysis (PCA) to assess the validity of this correction. The three uremia cohorts (GSE37171, GSE38750, and GSE43484) were merged and corrected in the same way.
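To illustrate the general idea behind batch correction, the following Python sketch shifts each batch of one gene's expression values so that all batches share the same mean. This is a deliberately simplified, location-only adjustment and is not the ComBat procedure used in this study: ComBat additionally adjusts the scale of each batch and shrinks the batch estimates with an empirical Bayes step. The function name, batch labels, and expression values are hypothetical.

```python
from statistics import mean

def center_batches(values, batches):
    """Location-only batch adjustment for a single gene.

    Each batch is shifted so that its mean equals the overall mean.
    Simplification: no scale adjustment and no empirical Bayes shrinkage,
    so this is NOT equivalent to ComBat.
    """
    grand_mean = mean(values)
    batch_means = {
        b: mean(v for v, bb in zip(values, batches) if bb == b)
        for b in set(batches)
    }
    return [v - batch_means[b] + grand_mean for v, b in zip(values, batches)]

# Hypothetical expression values of one gene measured in two batches.
expr = [5.0, 5.2, 7.0, 7.4]
batch = ["batch1", "batch1", "batch2", "batch2"]
adjusted = center_batches(expr, batch)
```

After the adjustment both batches have the same mean, so a PCA of many genes treated this way would no longer separate samples by batch mean alone. A simple shift like this can also remove real biological signal when cases and controls are unevenly distributed across batches, which is one reason model-based methods are preferred in practice.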
2.3 Determination of DEGs
For the MDD and uremia datasets, the “limma” package (25) in R was used to identify differentially expressed genes (DEGs). The selection criteria were |log2FC| > 0.25 and p-value < 0.05. The results were visualized with volcano plots, and the overlap between the two sets of DEGs was depicted using Venn diagrams. To further investigate the shared genes, protein-protein interaction (PPI) networks were generated using GeneMANIA (http://genemania.org/).
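The selection and intersection steps can be sketched in Python as follows. The sketch assumes that a differential expression table (such as limma output) has already been reduced to a mapping from gene to (log2 fold change, p-value). The gene names and numbers are placeholders for illustration, not results from this study.

```python
def select_degs(results, lfc_cutoff=0.25, p_cutoff=0.05):
    """Return genes with |log2FC| > lfc_cutoff and p-value < p_cutoff."""
    return {
        gene
        for gene, (log2fc, p_value) in results.items()
        if abs(log2fc) > lfc_cutoff and p_value < p_cutoff
    }

# Placeholder per-gene statistics: gene -> (log2FC, p-value).
uremia_results = {
    "GENE_A": (1.20, 0.001),
    "GENE_B": (-0.40, 0.010),
    "GENE_C": (0.10, 0.0001),  # fails the fold-change cutoff
    "GENE_D": (0.50, 0.200),   # fails the p-value cutoff
}
mdd_results = {
    "GENE_B": (0.30, 0.030),
    "GENE_C": (0.30, 0.040),
    "GENE_E": (-0.26, 0.049),
}

uremia_degs = select_degs(uremia_results)
mdd_degs = select_degs(mdd_results)
shared_genes = uremia_degs & mdd_degs  # the overlap shown in a Venn diagram
```

Because the fold-change filter uses the absolute value, a gene counts as shared even when it is upregulated in one disease and downregulated in the other, as GENE_B is in this toy example.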
2.4 Enrichment analysis of common genes in uremia with MDD
To gain insights into the biological functions and mechanistic pathways of the common genes, we used the “org.Hs.eg.db”, “ggplot2”, “clusterProfiler”, “enrichplot”, “GSEABase”, and “DOSE” packages to conduct Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses, as well as Disease Ontology Semantic and Enrichment (DOSE) analysis. Enrichment results with p < 0.05 were considered significant.
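Over-representation analyses of this kind are commonly based on a hypergeometric test, which asks how likely it is to see at least the observed number of query genes in a term if the query genes were drawn at random from the background. The Python sketch below computes that tail probability directly. The function name and all counts are hypothetical, and the sketch omits the multiple-testing correction that enrichment tools normally apply.

```python
from math import comb

def hypergeom_enrichment_p(universe, term_size, query_size, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    universe   : number of background genes
    term_size  : background genes annotated to the term
    query_size : genes in the query list (e.g., the shared DEGs)
    overlap    : query genes annotated to the term
    """
    upper = min(term_size, query_size)
    total = comb(universe, query_size)
    tail = sum(
        comb(term_size, i) * comb(universe - term_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Small case that can be checked by hand: 3 of 3 query genes hit a 3-gene term
# in a 10-gene universe, so p = 1 / C(10, 3) = 1/120.
p_small = hypergeom_enrichment_p(universe=10, term_size=3, query_size=3, overlap=3)

# Hypothetical realistic-sized case: 3 of 7 query genes fall in a 150-gene term.
p_example = hypergeom_enrichment_p(universe=20000, term_size=150, query_size=7, overlap=3)
```

With a query list as short as seven genes, even two or three hits in a moderately sized term can give a small p-value, so the significance of individual terms should be read with that in mind.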
2.5 Immune cell infiltration
The abundance of 23 types of infiltrating immune cells in both diseases was quantified using single-sample gene set enrichment analysis (ssGSEA). Differences in immune cell abundance between healthy controls and uremia patients were then analyzed further.
2.6 Machine learning algorithms
Twelve machine learning algorithms were used to construct 113 different models: Lasso, Ridge, Stepglm, XGBoost, Random Forest (RF), Elastic Net (Enet), Partial Least Squares Regression for Generalized Linear Models (plsRglm), Generalized Boosted Regression Modeling (GBM), NaiveBayes, Linear Discriminant Analysis (LDA), Generalized Linear Boosting (glmBoost), and Support Vector Machine (SVM). First, the raw data were preprocessed to eliminate the influence of different feature scales. Then, the dataset was randomly divided into a training set (70%) and a testing set (30%). During the model training phase, the models were fitted on the training set, and their hyperparameters were optimized using cross-validation. During the model evaluation phase, Area Under Curve (AUC) values were calculated for each model on the test set (threshold set at 0.7) using the RunEval function to measure classification performance, and heat maps were generated using the SimpleHeatmap function to visualize the performance of each model. The model with the highest AUC value was selected as the best model (26). In addition, calibration curves were used to assess the predictive performance of our diagnostic model, Decision Curve Analysis (DCA) curves were generated to assess its clinical utility, and Nomo plots (nomograms) were generated to calculate the probability of disease occurrence. Finally, the DeLong test was used to compare our model’s diagnostic performance with that of two existing uremia diagnostic models (13, 27).
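The split-and-evaluate logic can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the RunEval implementation: the split is a plain random 70/30 partition with a fixed seed, and the AUC is computed from its rank interpretation, that is, the probability that a randomly chosen case receives a higher score than a randomly chosen control. The function names, labels, and scores are hypothetical.

```python
import random

def split_70_30(samples, seed=0):
    """Randomly partition samples into 70% training and 30% testing."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties count as half a correct pair. Labels: 1 = case, 0 = control.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    correct = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return correct / (len(cases) * len(controls))

# Hypothetical sample identifiers and predicted scores.
train_ids, test_ids = split_70_30(range(10))
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
test_auc = auc(labels, scores)  # 8 of 9 case-control pairs are ranked correctly
```

Selecting the best of many models by test-set AUC means the winning AUC is itself an optimistic estimate, so an additional independent cohort is the usual way to confirm it.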
2.7 Candidate drug identification
To explore drugs that may target the mechanisms of action in uremia and MDD, we utilized the Drug Signatures Database (DSigDB) within the Enrichr web platform (https://maayanlab.cloud/Enrichr/).
2.8 Statistical analysis
Statistical analyses were performed using R software version 4.4.1. Differences between two groups were compared using an unpaired Student’s t-test. p < 0.05 was considered statistically significant.
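For reference, the test statistic of an unpaired Student's t-test with pooled variance can be computed as in the Python sketch below. The sketch returns only the t statistic and the degrees of freedom; converting them to a p-value requires the t distribution, which is not in the Python standard library. The function name and input values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def student_t(group_a, group_b):
    """Unpaired Student's t statistic assuming equal variances.

    Returns (t, degrees_of_freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df

# Hypothetical abundance scores for one immune cell type in two groups.
controls = [1.0, 2.0, 3.0]
patients = [2.0, 3.0, 4.0]
t_stat, dof = student_t(controls, patients)
```

The pooled-variance form assumes that both groups have similar variances; when that assumption is doubtful, Welch's variant is the usual alternative.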
3 Results
3.1 Data processing
The study design flowchart is shown in Figure 1. Original MDD and control transcriptome data were obtained from GEO and integrated after removing batch effects to generate a standardized MDD case-control cohort (Figures 2A, B). Similarly, the original uremia and control cohorts were combined after batch correction (Figures 2C, D) to obtain a standardized validation cohort with markedly reduced batch effects.

Figure 2. The integration of MDD datasets and uremia datasets. (A) PCA of three raw MDD datasets without batch effect correction. (B) PCA of the integrated MDD dataset after batch effect correction. (C) PCA of three original uremia datasets before batch effect correction. (D) PCA for the combined uremia dataset after batch effect correction. MDD, major depressive disorder.
3.2 Identification of differential expression associated with uremia and MDD
Based on the relationship between MDD and uremia, limma analysis was performed on the uremia (GSE37171, GSE38750, and GSE43484) and MDD (GSE52790, GSE76826, and GSE98793) cohorts to identify candidate genes for MDD-associated uremia. Of the 4,209 DEGs detected in the uremia cohort, 1,871 were upregulated and 2,338 were downregulated (Figure 3A). The MDD cohort yielded 25 DEGs, 15 upregulated and 10 downregulated (Figure 3B). The DEGs of uremia and MDD were intersected to obtain seven shared genes, which were used to construct a diagnostic model of uremia (Figure 3C).

Figure 3. Identification of DEGs. (A) Volcano plots describing DEGs between uremia and healthy controls. (B) Volcano plots showing DEGs between MDD and healthy controls. (C) Venn diagram revealing seven DEGs shared between uremia and MDD. DEGs, differentially expressed genes; MDD, major depressive disorder.
3.3 Functional enrichment of the shared genes
The PPI networks of the shared genes were established from the GeneMANIA database (Figure 4A), and then GO, KEGG, and DOSE were used for functional enrichment analysis and to search for potential pathogenic mechanisms. GO enrichment analysis showed enrichment of biological processes, including defense response to bacterium, T-cell differentiation in the thymus, cell killing, gonad development, development of primary sexual characteristics, negative regulation of T cell-mediated cytotoxicity, response to insulin, positive regulation of T-cell differentiation in the thymus, positive regulation of steroid hormone secretion, and luteinization. Enriched cellular components included endocytic vesicle, clathrin-coated vesicle, coated vesicle, secretory granule lumen, cytoplasmic vesicle lumen, vesicle lumen, specific granule lumen, clathrin-coated endocytic vesicle membrane, clathrin-coated endocytic vesicle, and clathrin-coated vesicle membrane. Enriched molecular functions included receptor for advanced glycation end products (RAGE) receptor binding, tropomyosin binding, copper ion binding, calcium-dependent protein binding, antioxidant activity, cytokine receptor activity, hormone activity, immune receptor activity, serine-type endopeptidase activity, and antigen binding (Figure 4B). KEGG pathway analysis further revealed enrichment of primary immunodeficiency, hematopoietic cell lineage, PD-L1 expression and PD-1 checkpoint pathway in cancer, Th1 and Th2 cell differentiation, Chagas disease, and T-helper 17 (Th17) cell differentiation (Figure 4C).
Disease Ontology Semantic and Enrichment analysis also showed enrichment for Kawasaki disease, lymphadenitis, lymph node disease, atherosclerosis, arteriosclerotic cardiovascular disease, lymphatic system disease, arteriosclerosis, pulmonary artery disease, pulmonary embolism, human immunodeficiency virus infectious disease, inflammatory bowel disease, severe combined immunodeficiency, gestational diabetes, liver cirrhosis, intrinsic cardiomyopathy, combined immunodeficiency, intestinal disease, acute myocardial infarction, sarcoidosis, myocardial infarction, cardiomyopathy, non-alcoholic fatty liver disease, neuropathy, hypertrophic cardiomyopathy, middle cerebral artery infarction, hypersensitivity reaction type IV disease, coronavirus infectious disease, hypersensitivity reaction disease, colitis, and hyperglycemia (Figure 4D).

Figure 4. PPI network analysis and enrichment analysis. (A) PPI network of seven shared genes constructed using GeneMANIA. (B) Bar plots of GO enrichment analysis results for biological process, cellular component, and molecular function. (C) Bar plots of KEGG pathway enrichment analysis. (D) Bar plots of Disease Ontology enrichment analysis.
3.4 Analysis of immune cell infiltration in uremia and MDD
The enrichment analysis of the shared genes between uremia and MDD showed a significant association with immune cell infiltration and the development of inflammation. ssGSEA was therefore used to describe the composition of immune cell subsets in the uremia and MDD cohorts. MDD samples exhibited decreased activated CD8 T cells and increased activated dendritic cells relative to control samples (Figure 5A). In the uremia cohort, the box plot (Figure 5B) indicates that, compared to controls, the abundance of activated dendritic cells, macrophages, monocytes, natural killer cells, and type 17 T-helper cells increased, while that of activated CD4 T cells, immature dendritic cells, natural killer T cells, plasmacytoid dendritic cells, T follicular helper cells, type 2 T-helper cells, and gamma delta T cells decreased.

Figure 5. Immunological features of MDD and uremia. (A) Boxplots comparing immune cell abundances between MDD and controls. (B) Boxplots comparing immune cell abundances between uremia and controls. *** p < 0.001, ** p < 0.01, and * p < 0.05. MDD, major depressive disorder.
3.5 Identification of diagnostic hub genes by machine learning and establishment of a diagnostic model for MDD-associated uremia
The most robust diagnostic model, based on the seven shared genes, was constructed using 113 combinations of 12 machine learning algorithms to reduce selection bias. The models were fitted on a training set comprising a randomly assigned 70% of the samples, and the remaining 30% formed the test set used to evaluate predictive performance (Figures 6A, B). The final model, which integrated the Lasso and GBM algorithms, showed the best performance, with an area under the Receiver Operating Characteristic (ROC) curve (AUC) of 0.941. The Lasso + GBM algorithm identified seven key genes (IL7R, CD3D, RETN, RAB13, TNNT1, HP, and S100A12). In the calibration curve of our diagnostic model (Figure 6C), the bias-corrected line obtained by bootstrap sampling was close to the ideal diagonal, showing that the probabilities predicted by the model were highly consistent with the observed probabilities and further supporting the accuracy of the model. DCA was also conducted (Figure 6D). From a threshold probability of approximately 0.2, the net benefit of intervening according to the prediction model begins to be markedly higher than that of intervening in all patients or in none. Although the net benefit decreases as the threshold probability increases, it remains higher than that of the intervene-all and intervene-none strategies, indicating that the model has practical value. Finally, as shown in Figure 6E, a Nomo plot integrating the seven genes was established, facilitating estimation of the probability of uremia from patient test results in clinical practice.
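The quantity plotted on a decision curve is the net benefit, which at a threshold probability pt is usually defined as TP/N - (FP/N) × pt/(1 - pt), where TP and FP are the numbers of true and false positives when everyone with a predicted probability of at least pt is treated. The Python sketch below computes it for a model and for the treat-all reference strategy. The treat-none strategy has a net benefit of zero by definition. The function names, labels, and probabilities are hypothetical and are not taken from this study.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of treating everyone with predicted probability >= threshold."""
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    odds = threshold / (1 - threshold)  # weight of a false positive
    return true_pos / n - (false_pos / n) * odds

def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical labels (1 = uremia) and predicted probabilities.
labels = [1, 1, 0, 0]
probabilities = [0.9, 0.6, 0.4, 0.1]
nb_model = net_benefit(labels, probabilities, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero, which is the comparison a DCA plot displays across the whole range of thresholds.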

Figure 6. Diagnostic performance of our model. (A) A total of 113 machine learning algorithm combinations evaluated by 10-fold cross-validation. (B) ROC curves for the training cohort. (C) Calibration curve for the training cohort. (D) DCA curves for the training cohort. (E) Nomo plot of the training cohort.
3.6 Subgroup analysis of uremia diagnostic model
We performed a subgroup analysis of the predictive model with age dichotomized at 50 years (Figures 7A–D). The diagnostic performance of the predictive model was higher in the age > 50 group, with an AUC of 0.962 (Figure 7C). ROC curve analysis was also performed for each gene, and S100A12 had the highest AUC regardless of age (Figures 7B, D). In addition to age, we analyzed gender (Figures 7E–H). The model’s diagnostic accuracy was higher in men than in women but lower than that of the overall prediction model (Figures 7E, G). In the per-gene analysis (Figures 7F, H), S100A12 again had the highest AUC.

Figure 7. ROC curves for the model in subgroups. (A, B) ROC curves of the model (A) and each gene (B) in the younger (age ≤ 50 years) subgroup. (C, D) ROC curves of the model (C) and each gene (D) in the older (age > 50 years) subgroup. (E, F) ROC curves of the model (E) and each gene (F) in the male subgroup. (G, H) ROC curves of the model (G) and each gene (H) in the female subgroup.
3.7 Comparison of uremia diagnostic models
With the development of bioinformatics and big data technologies, many diagnostic models for uremia that incorporate machine learning methods have recently been developed. In a comprehensive comparison, our prediction model performed better than both the Zeng model (13), which uses a network-based variable selection method, and the Xi model (27), which is based on cellular senescence-associated genes (Figures 8A, B).

Figure 8. Comparison of diagnostic gene expression features in uremia. (A) ROC curves comparing our model with the Xi et al. model in the uremia dataset. (B) ROC curves comparing our model with the Zeng et al. model in the uremia dataset.
3.8 Candidate drug identification
The genes in the diagnostic model were analyzed using the DSigDB drug database on Enrichr to find potential targeted drugs. The top 10 candidates were decitabine, retinol, atorvastatin, liothyronine, hexachloroethane, cholesterol, simvastatin and niacin, hydrocortisone, dexamethasone, and caspan (Table 2).
4 Discussion
Both uremia and MDD have a significant impact on the physical and mental health of patients, and extensive research has been conducted on the relationship between these two diseases (28, 29). However, further investigation is necessary to explore the genetic interaction between them.
The emergence of microarray and sequencing technologies has facilitated the exploration of disease processes and molecular landscapes. Furthermore, the increasing development of bioinformatics analysis and machine learning has allowed us to analyze massive datasets, explore meaningful biomarkers, understand the potential mechanisms of action of diseases, and develop promising therapeutic drugs. These advancements offer valuable perspectives on the advancement and novel avenues of complex diseases (30–32). As far as we know, this is the first study to use 12 machine learning algorithms combined with biological information to reveal MDD-associated uremic pathogenic genes. Furthermore, as our diagnostic model predominantly utilizes blood specimens from patients, assessing the levels of diagnostic genes in the blood can help estimate the risk of uremia. This offers a clinically easy-to-perform method for diagnosis. In conclusion, the diagnostic model we explored holds significant promise for achieving early screening of uremia patients and interventions, thereby improving the outcomes of uremia patients.
A total of 4,209 DEGs between uremia and normal, and 25 DEGs between MDD and normal were analyzed using GEO’s dataset. DEGs at the intersection of uremia and MDD were taken to obtain seven common risk genes. PPI networks were constructed using GeneMANIA. GO and KEGG enrichment analyses were then performed, and some biological behaviors and action pathways were found, suggesting a potential mechanism for uremia development and MDD development. GO enrichment analysis highlighted factors such as defense response to bacterium, T-cell differentiation in the thymus, cell killing, gonad development, development of primary sexual characteristics, negative regulation of T cell-mediated cytotoxicity, response to insulin, positive regulation of T-cell differentiation in the thymus, positive regulation of steroid hormone secretion, and luteinization. Enriched cellular components included endocytic vesicle, clathrin-coated vesicle, coated vesicle, secretory granule lumen, cytoplasmic vesicle lumen, vesicle lumen, specific granule lumen, clathrin-coated endocytic vesicle membrane, clathrin-coated endocytic vesicle, and clathrin-coated vesicle membrane. Overexpressed molecular functions included RAGE receptor binding, tropomyosin binding, copper ion binding, calcium-dependent protein binding, antioxidant activity, cytokine receptor activity, hormone activity, immune receptor activity, serine-type endopeptidase activity, and antigen binding. In addition, KEGG pathway analysis showed significant enrichment of pathways associated with primary immunodeficiency, hematopoietic cell lineage, PD-L1 expression and PD-1 checkpoint pathway in cancer, Th1 and Th2 cell differentiation, Chagas disease, and Th17 cell differentiation.
In the KEGG enriched pathway, there is a potential association between Th17 differentiation and the onset of uremia and MDD. Research indicates a significant increase in Th17 cells in the peripheral blood of MDD patients (33). Similarly, uremia patients show a discernible rise in these immune cells when compared to healthy controls, suggesting a possible correlation between uremia and the upregulation of Th17 cells (34). The lack of significant change in their levels following dialysis in observed patients does not exclude the potential for uremia progression linked to immune activation. Further comprehensive studies are necessary to clarify this relationship. Notably, previous research has shown that these cells play a role in advancing atherosclerosis (35, 36). Interleukin-17 (IL-17), produced by Th17 cells, has synergistic effects with tumor necrosis factor-α (TNF-α), which contributes to the pathogenesis of atherosclerotic vascular diseases by creating a pro-inflammatory microenvironment (37). The proliferation of these immune cells could not only contribute to the onset of uremia but also increase susceptibility to cardiovascular complications in affected patients. It is well established that they play a significant role in mediating autoimmunity (38, 39), which suggests that uremia may have some relationship with the primary immunodeficiency pathway. Additionally, Disease Ontology Semantic and Enrichment analysis indicates that uremia may be complicated by Kawasaki disease, lymph node disease, arteriosclerosis disease, pulmonary artery disease, pulmonary embolism, human immunodeficiency virus disease, and inflammatory bowel disease.
The occurrence and development of uremia may be associated with immune activation (40), so we analyzed the immune expression of uremia and found that in patients with uremia, activated dendritic cells, macrophages, monocytes, natural killer cells, and type 17 T-helper cell proportion increased, while activated CD4 T cells, immature dendritic cells, natural killer T cells, plasmacytoid dendritic cells, T follicular helper cells, type 2 T-helper cells, and gamma delta T-cell abundance decreased. Our immune cell analysis is also consistent with previous studies suggesting that several immune cells, including dendritic cells and macrophages, are activated and may contribute to the development of CKD or even uremia (41–43). Our analysis also suggests a decrease in several immune cells, possibly because the immune system of uremia patients is overactivated but functionally compromised. Still, few relevant studies prompt us to investigate the immune cell infiltration and mechanisms of uremia further.
Uremia is now being diagnosed at a more advanced stage, prompting a heightened emphasis on early detection and disease management. The application of machine learning techniques to construct diagnostic models for diseases and predict patient survival has gained significant attention. Nevertheless, successfully translating these methods into clinical practice while ensuring diagnostic and predictive accuracy presents a notable challenge. Certain studies have developed diagnostic models for uremia using specific algorithms and conducted screenings for differential genes. However, it is important to note that these endeavors may be susceptible to personal biases and inherent preferences (26, 44). Thus, we employed 113 combinations of 12 machine learning algorithms to compare their diagnostic performance and identify the best model that mitigates bias caused by these factors, and ultimately, we determined Lasso + GBM as the best model. This study approach significantly reduces the complexity of research and uncovers the most representative patterns, enabling the development of a streamlined and more meaningful model. To further analyze the performance of prediction models constructed using multiple machine learning algorithms, we selected two published uremic diagnostic models that associate with different functional genes. One was Zeng’s model (13), which included two GEO datasets, GSE37171 and GSE70528, and associated modules were identified using the Weighted gene co-expression network analysis (WGCNA) method, followed by Lasso regression, to identify five genes predictive of end-stage renal disease. The other was Xi’s model (27), which incorporated the GEO dataset GSE37171 to screen five predictive genes of end-stage renal disease associated with cellular senescence through a PPI interaction network. As can be seen from the results, our prediction model performs significantly better than the other two models. 
However, our model has two more genes than the other models, and this increase in the number of genes may bring a little difficulty in clinical practice. Future efforts should, therefore, focus on the simple and efficient analysis of more models, ensuring superior predictive performance and enabling widespread implementation in clinical settings.
It is important to note that in a previous study, haptoglobin (HP) in seven model genes that comprise our diagnostic model is linked to hemolytic uremia (45). Mouse experiments have shown that mice with hemolytic uremic syndrome (HUS) lacking haptoglobin have a 25% reduction in survival compared with normal mice. When low doses of haptoglobin were administered to Shiga toxin-challenged wild-type mice, it reduced renal platelet deposition and neutrophil recruitment, suggesting that haptoglobin has beneficial effects, at least partly. Additionally, S100A12 has been found to be a strong predictor of cardiovascular mortality in end-stage renal disease (46–48). It has been discovered that RAGE triggers pro-inflammatory pathways upon the activation of S100A12, and the S100/RAGE interaction accelerates the development of cardiac hypertrophy and diastolic dysfunction in mouse models of CKD (49), further increasing mortality in uremia patients. TNNT1 has been associated with myopathy and even some cancers, but there are no definitive results on the mechanisms affecting uremia, which need to be further investigated. RAB13, which is mainly associated with the trafficking of intracellular material and the functional regulation of organelles, is similar to TNNT1, and the relationship to the role of uremia is unknown. IL7R and CD3D have been found to have a possible relationship with nephropathy, especially diabetic nephropathy, in previous studies, and similarly, RETN (resistin) has been found to play a role in diabetic nephropathy as well as renal insufficiency, but unfortunately, studies have not involved pathogenesis, and basic experiments are also needed for further exploration. Recently, experts have found common pathways and protein expressions in the central nervous system (CNS) and kidney, including glutamate signaling (50), nephrin expression (51), and podocalyxin expression (52), which also serve as the basis of our study. 
Through these findings, it is understood that brain-derived neurotrophic factor (BDNF), which is primarily produced in the nervous system, is also secreted by the kidneys. To investigate BDNF function in vivo, Endlich et al. knocked down BDNF in zebrafish larvae and found that it led to decreased expression of podocin and nephrin, as well as enlarged Bowman’s spaces, glomerular telangiectasia, and podocyte loss. These structural changes were associated with an increased urinary albumin–creatinine ratio. Based on these findings, BDNF has been suggested as a novel potential biomarker of glomerular kidney injury (53). BDNF is associated with sarcopenia (54), insulin resistance (55), depression (56), and inflammation (57). Because all these adverse conditions are also present in CKD patients, and BDNF is expressed in glomeruli and tubules, Trk receptors (TrkB and TrkC) are expressed in proximal and distal tubules, as well as in collecting duct epithelial cells (58). It can be speculated that BDNF may be a potential marker of CKD. Many researchers have investigated the relationship between depression and BDNF in CKD. Sun et al. showed that the uremic toxin indoxyl sulfate is associated with mood disorders and neurodegeneration and has an inhibitory effect on BDNF expression in unilateral nephrectomized mice (59). Similar results showed that p-cresol sulfate (PCS) levels were increased and BDNF was decreased in C57/BL/6 mice after unilateral nephrectomy, and these changes were often accompanied by depression-like, anxiety-like, and cognitive impairment behaviors (60). However, studies on depression and BDNF have not been consistent, and Alshogran et al. showed that BDNF concentrations did not correlate with depression scores (61). Overall, BDNF may reflect a promising marker for depression screening in CKD. The investigation of BDNF is mainly in the screening of depression, and whether it is a biomarker of CKD or even uremia still needs to be further explored.
At present, treatment options for uremia are limited and expensive, and developing new therapeutic drugs is difficult. Using the DSigDB database to search for potential therapeutic agents targeting uremia-related genes therefore offers new insights into treatment and can shorten the time and substantially reduce the cost of drug development. Previous studies have shown that uremic toxins may inhibit Klotho expression by promoting increased DNA methyltransferase expression and DNA hypermethylation (62). Klotho, a renoprotective factor (63, 64), is significantly decreased in uremia patients (65). Decitabine prevents early kidney damage by inhibiting DNA methyltransferases, reducing DNA methylation, and increasing Klotho expression (66). Once the disease progresses to end-stage renal disease, dialysis is the main treatment, and different dialysis modalities can themselves cause various injuries to patients (67); this is also a direction for future research.
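To illustrate the kind of statistic that commonly underlies gene-set-based drug screening, the following minimal Python sketch computes a one-sided hypergeometric over-representation p-value for the overlap between a query gene list and one drug-associated gene set. It is only a sketch of the general technique and is not the Enrichr or DSigDB implementation. The drug target set (`EXAMPLE_DRUG_TARGETS`, including the placeholder names `GENE_A` to `GENE_C`), the background size of 20,000 genes, and the function names are hypothetical assumptions introduced for illustration; only the seven model genes come from this study.

```python
from math import comb


def hypergeom_tail(universe_size, set_size, query_size, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    X counts how many of `query_size` genes, drawn without replacement from
    `universe_size` genes, fall in a gene set containing `set_size` genes.
    """
    max_overlap = min(set_size, query_size)
    favourable = sum(
        comb(set_size, k) * comb(universe_size - set_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return favourable / comb(universe_size, query_size)


def drug_enrichment(query_genes, drug_targets, universe_size):
    """Return the shared genes and their over-representation p-value."""
    query = set(query_genes)
    targets = set(drug_targets)
    shared = query & targets
    p_value = hypergeom_tail(universe_size, len(targets), len(query), len(shared))
    return sorted(shared), p_value


# The seven model genes reported in this study.
MODEL_GENES = {"IL7R", "CD3D", "RETN", "RAB13", "TNNT1", "HP", "S100A12"}

# HYPOTHETICAL drug-associated gene set, invented for illustration only.
EXAMPLE_DRUG_TARGETS = {"IL7R", "CD3D", "S100A12", "GENE_A", "GENE_B", "GENE_C"}

# ASSUMPTION: a background of 20,000 genes.
UNIVERSE_SIZE = 20_000

shared, p_value = drug_enrichment(MODEL_GENES, EXAMPLE_DRUG_TARGETS, UNIVERSE_SIZE)
print(f"shared genes: {shared}, p = {p_value:.3e}")
```

In practice, a p-value of this kind would be computed for every drug-associated gene set in the library and then corrected for multiple testing before candidate agents are ranked.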
5 Limitations
Our study has several limitations. Although we included three datasets from the GEO database to reduce the bias of relying on a single dataset, the sample size remains small relative to the number of models we evaluated and should be expanded. Furthermore, although we validated our model’s predictive performance, additional experimental studies are needed to confirm our biomarkers and their mechanisms of action.
6 Conclusions
Our research establishes a novel molecular framework for the early diagnosis of uremia, especially in patients diagnosed with MDD. Furthermore, we have conducted extensive model analyses and identified an optimal diagnostic model, which provides valuable insights for more comprehensive and effective diagnostic gene analysis.
Data availability statement
The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.
Ethics statement
Ethical approval was not required for the studies involving humans because publicly available datasets were used for this study. The studies were conducted in accordance with the local legislation and institutional requirements. The human samples used in this study were acquired from the NCBI Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) data (GSE37171, GSE38750, GSE43484, GSE52790, GSE76826, and GSE98793). Written informed consent to participate in this study was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and the institutional requirements.
Author contributions
KJ: Writing – original draft. CZ: Writing – original draft. CS: Writing – original draft. XF: Writing – original draft. HH: Writing – original draft. BZ: Writing – original draft, Writing – review & editing.
Funding
The authors declare that financial support was received for the research and/or publication of this article: Yaodong Shenzhou - Pharmaceutical Research Capacity Building Fund Project (2024-KY002-01); Nantong University, Clinical Medicine Special Scientific Research Fund Project (2024LQ019); Young Project of Health Commission of Nantong City (QN2022017).
Acknowledgments
We are grateful to the Second Affiliated Hospital of Nantong University for its support of this work and to the public GEO database.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The authors declare that no generative AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
1. Malhi GS and Mann JJ. Depression. Lancet. (2018) 392:2299–312. doi: 10.1016/S0140-6736(18)31948-2
2. Regier DA, Kuhl EA, and Kupfer DJ. The DSM-5: classification and criteria changes. World Psychiatry. (2013) 12:92–8. doi: 10.1002/wps.20050
3. Shorey S, Ng ED, and Wong C. Global prevalence of depression and elevated depressive symptoms among adolescents: a systematic review and meta-analysis. Br J Clin Psychol. (2022) 61:287–305. doi: 10.1111/bjc.12333
4. Greenberg PE, Fournier AA, Sisitsky T, Pike CT, and Kessler RC. The economic burden of adults with major depressive disorder in the United States (2005 and 2010). J Clin Psychiatry. (2015) 76:155–62. doi: 10.4088/JCP.14m09298
5. Zhdanava M, Pilon D, Ghelerter I, Chow W, Joshi K, Lefebvre P, et al. The prevalence and national burden of treatment-resistant depression and major depressive disorder in the United States. J Clin Psychiatry. (2021) 82:20m13699. doi: 10.4088/JCP.20m13699
6. Hsu HJ, Yen CH, Chen CK, Wu IW, Lee CC, Sun CY, et al. Association between uremic toxins and depression in patients with chronic kidney disease undergoing maintenance hemodialysis. Gen Hosp Psychiatry. (2013) 35:23–7. doi: 10.1016/j.genhosppsych.2012.08.009
8. Vanholder R and Massy ZA. Progress in uremic toxin research: an introduction. Semin Dial. (2009) 22:321–2. doi: 10.1111/j.1525-139X.2009.00573.x
9. Scherer A, Günther OP, Balshaw RF, Hollander Z, Wilson-McManus J, Ng R, et al. Alteration of human blood cell transcriptome in uremia. BMC Med Genomics. (2013) 6:23. doi: 10.1186/1755-8794-6-23
10. De-Nour AK and Czaczkes JW. The influence of patient’s personality on adjustment to chronic dialysis. J Nerv Ment Dis. (1976) 162:323–33. doi: 10.1097/00005053-197605000-00003
11. De-Nour AK and Czaczkes JW. Bias in assessment of patients on chronic dialysis. J Psychosom Res. (1974) 18:217–21. doi: 10.1016/0022-3999(74)90025-7
12. Kimmel PL, Peterson RA, Weihs KL, Simmens SJ, Alleyne S, Cruz I, et al. Multiple measurements of depression predict mortality in a longitudinal study of chronic hemodialysis outpatients. Kidney Int. (2000) 57:2093–8. doi: 10.1046/j.1523-1755.2000.00059.x
13. Zeng X, Li C, Li Y, Yu H, Fu P, Hong HGG, et al. A network-based variable selection approach for identification of modules and biomarker genes associated with end-stage kidney disease. Nephrology. (2020) 25:775–84. doi: 10.1111/nep.13655
14. Katon WJ. Epidemiology and treatment of depression in patients with chronic medical illness. Dialogues Clin Neurosci. (2011) 13:7–23. doi: 10.31887/DCNS.2011.13.1/wkaton
15. Zhou Y, Shi W, Zhao D, Xiao S, Wang K, and Wang J. Identification of immune-associated genes in diagnosing aortic valve calcification with metabolic syndrome by integrated bioinformatics analysis and machine learning. Front Immunol. (2022) 13:937886. doi: 10.3389/fimmu.2022.937886
16. Yang Q, Li B, Tang J, Cui X, Wang Y, Li XF, et al. Consistent gene signature of schizophrenia identified by a novel feature selection strategy from comprehensive sets of transcriptomic data. Brief Bioinform. (2020) 21:1058–68. doi: 10.1093/bib/bbz049
17. Fu J, Tang J, Wang Y, Cui X, Yang Q, Hong J, et al. Discovery of the consistently well-performed analysis chain for SWATH-MS based pharmacoproteomic quantification. Front Pharmacol. (2018) 9:681. doi: 10.3389/fphar.2018.00681
18. Tang J, Mou M, Wang Y, Luo Y, and Zhu F. MetaFS: performance assessment of biomarker discovery in metaproteomics. Brief Bioinform. (2021) 22:bbaa105. doi: 10.1093/bib/bbaa105
19. Stubbe J, Skov V, Thiesson HC, Larsen KE, Hansen ML, Jensen BL, et al. Identification of differential gene expression patterns in human arteries from patients with chronic kidney disease. Am J Physiol Renal Physiol. (2018) 314:F1117–28. doi: 10.1152/ajprenal.00418.2017
20. Al-Chaqmaqchi HA, Moshfegh A, Dadfar E, Paulsson J, Hassan M, Jacobson SH, et al. Activation of Wnt/beta-catenin pathway in monocytes derived from chronic kidney disease patients. PLoS One. (2013) 8:e68937. doi: 10.1371/journal.pone.0068937
21. Liu Z, Li X, Sun N, Xu Y, Meng YQ, Yang C, et al. Microarray profiling and co-expression network analysis of circulating lncRNAs and mRNAs associated with major depressive disorder. PLoS One. (2014) 9:e93388. doi: 10.1371/journal.pone.0093388
22. Miyata S, Kurachi M, Okano Y, Sakurai N, Kobayashi A, Harada K, et al. Blood transcriptomic markers in patients with late-onset major depressive disorder. PLoS One. (2016) 11:e0150262. doi: 10.1371/journal.pone.0150262
23. Leday G, Vértes PE, Richardson S, Greene JR, Regan T, Khan S, et al. Replicable and coupled changes in innate and adaptive immune gene expression in two case-control studies of blood microarrays in major depressive disorder. Biol Psychiatry. (2018) 83:70–80. doi: 10.1016/j.biopsych.2017.01.021
24. Leek JT, Johnson WE, Parker HS, Jaffe AE, and Storey JD. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. (2012) 28:882–3. doi: 10.1093/bioinformatics/bts034
25. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. (2015) 43:e47. doi: 10.1093/nar/gkv007
26. Qin H, Abulaiti A, Maimaiti A, Abulaiti Z, Fan G, Aili Y, et al. Integrated machine learning survival framework develops a prognostic model based on inter-crosstalk definition of mitochondrial function and cell death patterns in a large multicenter cohort for lower-grade glioma. J Transl Med. (2023) 21:588. doi: 10.1186/s12967-023-04468-x
27. Xi YJ, Guo Q, Zhang R, Duan GS, and Zhang SX. Identifying cellular senescence associated genes involved in the progression of end-stage renal disease as new biomarkers. BMC Nephrol. (2023) 24:231. doi: 10.1186/s12882-023-03285-0
28. Smith MD, Hong BA, and Robson AM. Diagnosis of depression in patients with end-stage renal disease. Comparative analysis. Am J Med. (1985) 79:160–6. doi: 10.1016/0002-9343(85)90004-X
29. Halen NV, Cukor D, Constantiner M, and Kimmel PL. Depression and mortality in end-stage renal disease. Curr Psychiatry Rep. (2012) 14:36–44. doi: 10.1007/s11920-011-0248-5
30. Sajda P. Machine learning for detection and diagnosis of disease. Annu Rev Biomed Eng. (2006) 8:537–65. doi: 10.1146/annurev.bioeng.8.061505.095802
31. Su Z, Ning B, Fang H, Hong H, Perkins R, Tong W, et al. Next-generation sequencing and its applications in molecular diagnostics. Expert Rev Mol Diagn. (2011) 11:333–43. doi: 10.1586/erm.11.3
32. Petrik J. Diagnostic applications of microarrays. Transfus Med. (2006) 16:233–47. doi: 10.1111/j.1365-3148.2006.00673.x
33. Chen Y, Jiang T, Chen P, Ouyang J, Xu G, Zeng Z, et al. Emerging tendency towards autoimmune process in major depressive patients: a novel insight from Th17 cells. Psychiatry Res. (2011) 188:224–30. doi: 10.1016/j.psychres.2010.10.029
34. Chung BH, Kim KW, Sun IO, Choi SR, Park HS, Jeon EJ, et al. Increased interleukin-17 producing effector memory T cells in the end-stage renal disease patients. Immunol Lett. (2012) 141:181–9. doi: 10.1016/j.imlet.2011.10.002
35. Turner JE, Paust HJ, Steinmetz OM, and Panzer U. The Th17 immune response in renal inflammation. Kidney Int. (2010) 77:1070–5. doi: 10.1038/ki.2010.102
36. Zhang J, Hua G, Zhang X, Tong R, Du X, and Li Z. Regulatory T cells/T-helper cell 17 functional imbalance in uraemic patients on maintenance haemodialysis: a pivotal link between microinflammation and adverse cardiovascular events. Nephrology. (2010) 15:33–41. doi: 10.1111/j.1440-1797.2009.01172.x
37. Csiszar A and Ungvari Z. Synergistic effects of vascular IL-17 and TNFalpha may promote coronary artery disease. Med Hypotheses. (2004) 63:696–8. doi: 10.1016/j.mehy.2004.03.009
38. Awasthi A, Murugaiyan G, and Kuchroo VK. Interplay between effector Th17 and regulatory T cells. J Clin Immunol. (2008) 28:660–70. doi: 10.1007/s10875-008-9239-7
39. Eisenstein EM and Williams CB. The T(reg)/Th17 cell balance: a new paradigm for autoimmunity. Pediatr Res. (2009) 65:26R–31R. doi: 10.1203/PDR.0b013e31819e76c7
40. Hendrikx TK, van Gurp EAFJ, Mol WM, Schoordijk W, Sewgobind VDKD, Ijzermans JNMI, et al. End-stage renal failure and regulatory activities of CD4+CD25bright+FoxP3+ T-cells. Nephrol Dial Transplant. (2009) 24:1969–78. doi: 10.1093/ndt/gfp005
41. Meng XM, Nikolic-Paterson DJ, and Lan HY. Inflammatory processes in renal fibrosis. Nat Rev Nephrol. (2014) 10:493–503. doi: 10.1038/nrneph.2014.114
42. Kitching AR. Dendritic cells in progressive renal disease: some answers, many questions. Nephrol Dial Transplant. (2014) 29:2185–93. doi: 10.1093/ndt/gfu076
43. Tang PM, Nikolic-Paterson DJ, and Lan HY. Macrophages: versatile players in renal inflammation and fibrosis. Nat Rev Nephrol. (2019) 15:144–58. doi: 10.1038/s41581-019-0110-2
44. Liu Z, Guo CG, Dang Q, Wang L, Liu L, Weng S, et al. Integrative analysis from multi-center studies identities a consensus machine learning-derived lncRNA signature for stage II/III colorectal cancer. EBioMedicine. (2022) 75:103750. doi: 10.1016/j.ebiom.2021.103750
45. Pirschel W, Mestekemper AN, Wissuwa B, Krieg N, Kröller S, Daniel C, et al. Divergent roles of haptoglobin and hemopexin deficiency for disease progression of Shiga-toxin-induced hemolytic-uremic syndrome in mice. Kidney Int. (2022) 101:1171–85. doi: 10.1016/j.kint.2021.12.024
46. Nakashima A, Carrero JJ, Qureshi AR, Miyamoto T, Anderstam B, Barány P, et al. Effect of circulating soluble receptor for advanced glycation end products (sRAGE) and the proinflammatory RAGE ligand (EN-RAGE, S100A12) on mortality in hemodialysis patients. Clin J Am Soc Nephrol. (2010) 5:2213–9. doi: 10.2215/CJN.03360410
47. Shiotsu Y, Mori Y, Nishimura M, Sakoda C, Tokoro T, Hatta T, et al. Plasma S100A12 level is associated with cardiovascular disease in hemodialysis patients. Clin J Am Soc Nephrol. (2011) 6:718–23. doi: 10.2215/CJN.08310910
48. Shiotsu Y, Mori Y, Nishimura M, Hatta T, Imada N, Maki N, et al. Prognostic utility of plasma S100A12 levels to establish a novel scoring system for predicting mortality in maintenance hemodialysis patients: a two-year prospective observational study in Japan. BMC Nephrol. (2013) 14:16. doi: 10.1186/1471-2369-14-16
49. Yan L, Mathew L, Chellan B, Gardner B, Earley J, Puri TSP, et al. S100/calgranulin-mediated inflammation accelerates left ventricular hypertrophy and aortic valve sclerosis in chronic kidney disease in a receptor for advanced glycation end products-dependent manner. Arterioscler Thromb Vasc Biol. (2014) 34:1399–411. doi: 10.1161/ATVBAHA.114.303508
50. Armelloni S, Li M, Messa P, and Rastaldi MPR. Podocytes: a new player for glutamate signaling. Int J Biochem Cell Biol. (2012) 44:2272–7. doi: 10.1016/j.biocel.2012.09.014
51. Li M, Armelloni S, Ikehata M, Corbelli A, Pesaresi M, Calvaresi N, et al. Nephrin expression in adult rodent central nervous system and its interaction with glutamate receptors. J Pathol. (2011) 225:118–28. doi: 10.1002/path.2923
52. Vitureira N, Andrés R, Pérez-Martínez E, Martínez A, Bribián A, Blasi J, et al. Podocalyxin is a novel polysialylated neural adhesion protein with multiple roles in neural development and synapse formation. PLoS One. (2010) 5:e12003. doi: 10.1371/journal.pone.0012003
53. Endlich N, Lange T, Kuhn J, Klemm P, Kotb AM, Siegerist F, et al. BDNF: mRNA expression in urine cells of patients with chronic kidney disease and its role in kidney function. J Cell Mol Med. (2018) 22:5265–77. doi: 10.1111/jcmm.13762
54. Karim A, Iqbal MS, Muhammad T, and Qaisar R. Evaluation of sarcopenia using biomarkers of the neuromuscular junction in Parkinson’s disease. J Mol Neurosci. (2022) 72:820–9. doi: 10.1007/s12031-022-01970-7
55. Rozanska O, Uruska A, and Zozulinska-Ziolkiewicz D. Brain-derived neurotrophic factor and diabetes. Int J Mol Sci. (2020) 21:841. doi: 10.3390/ijms21030841
56. Brunoni AR, Lopes M, and Fregni F. A systematic review and meta-analysis of clinical studies on major depression and BDNF levels: implications for the role of neuroplasticity in depression. Int J Neuropsychopharmacol. (2008) 11:1169–80. doi: 10.1017/S1461145708009309
57. Laste G, Ripoll Rozisky J, de Macedo IC, Dos Santos VS, Custódio de Souza IC, Caumo W, et al. Spinal cord brain-derived neurotrophic factor levels increase after dexamethasone treatment in male rats with chronic inflammation. NeuroImmunoModulation. (2013) 20:119–25. doi: 10.1159/000345995
58. Huber LJ, Hempstead B, and Donovan MJ. Neurotrophin and neurotrophin receptors in human fetal kidney. Dev Biol. (1996) 179:369–81. doi: 10.1006/dbio.1996.0268
59. Sun CY, Li JR, Wang YY, Lin SY, Ou YC, Lin CJ, et al. Indoxyl sulfate caused behavioral abnormality and neurodegeneration in mice with unilateral nephrectomy. Aging (Albany NY). (2021) 13:6681–701. doi: 10.18632/aging.202523
60. Sun CY, Li JR, Wang YY, Lin SY, Ou YC, Lin CJ, et al. p-Cresol sulfate caused behavior disorders and neurodegeneration in mice with unilateral nephrectomy involving oxidative stress and neuroinflammation. Int J Mol Sci. (2020) 21:6687. doi: 10.3390/ijms21186687
61. Alshogran OY, Khalil AA, Oweis AO, Altawalbeh SM, and Alqudah MAY. Association of brain-derived neurotrophic factor and interleukin-6 serum levels with depressive and anxiety symptoms in hemodialysis patients. Gen Hosp Psychiatry. (2018) 53:25–31. doi: 10.1016/j.genhosppsych.2018.04.003
62. Sun CY, Chang SC, and Wu MS. Suppression of Klotho expression by protein-bound uremic toxins is associated with increased DNA methyltransferase expression and DNA hypermethylation. Kidney Int. (2012) 81:640–50. doi: 10.1038/ki.2011.445
63. Sugiura H, Yoshida T, Tsuchiya K, Mitobe M, Nishimura S, Shirota S, et al. Klotho reduces apoptosis in experimental ischaemic acute renal failure. Nephrol Dial Transplant. (2005) 20:2636–45. doi: 10.1093/ndt/gfi165
64. Haruna Y, Kashihara N, Satoh M, Tomita N, Namikoshi T, Sasaki T, et al. Amelioration of progressive renal injury by genetic manipulation of klotho gene. Proc Natl Acad Sci USA. (2007) 104:2331–6. doi: 10.1073/pnas.0611079104
65. Koh N, Fujimori T, Nishiguchi S, Tamori A, Shiomi S, Nakatani T, et al. Severely reduced production of klotho in human chronic renal failure kidney. Biochem Biophys Res Commun. (2001) 280:1015–20. doi: 10.1006/bbrc.2000.4226
66. Zhao Y, Zeng X, Xu X, Wang W, Xu L, Wu Y, et al. Low-dose 5-aza-2’-deoxycytidine protects against early renal injury by increasing klotho expression. Epigenomics. (2022) 14:1411–25. doi: 10.2217/epi-2022-0430
Keywords: uremia, major depressive disorder, machine learning, diagnostic models, bioinformatics
Citation: Jiang K, Zhang C, Shen C, Fang X, Huang H and Zheng B (2025) Development and validation of a comprehensive machine learning framework for a diagnostic model of uremia based on genes involved in major depressive disorder. Front. Nephrol. 5:1576349. doi: 10.3389/fneph.2025.1576349
Received: 13 February 2025; Accepted: 18 September 2025;
Published: 02 October 2025.
Edited by:
Wenlin Yang, University of Florida, United States
Reviewed by:
Alessandro Domenico Quercia, Nephrology and Dialysis ASLCN1, Italy
Shinsuke Hidese, Teikyo University, Japan
Copyright © 2025 Jiang, Zhang, Shen, Fang, Huang and Zheng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Bing Zheng, bnR6YjIwMDhAMTYzLmNvbQ==