ORIGINAL RESEARCH article

Front. Bioinform.

Sec. Integrative Bioinformatics

Volume 5 - 2025 | doi: 10.3389/fbinf.2025.1594971

This article is part of the Research Topic: Clinical prediction models in cancer through bioinformatics

Identification of immune and major depressive disorder-related diagnostic markers for early nonalcoholic fatty liver disease by WGCNA and machine learning

Provisionally accepted
Yuyun Jia*, Yanping Cao, Qin Yin, Xueqian Li, Xiu Wen
  • Department of Pathology, Nanjing Drum Tower Hospital, Nanjing, China

The final, formatted version of the article will be published soon.

Major depressive disorder (MDD) and nonalcoholic fatty liver disease (NAFLD) are highly prevalent conditions that exhibit significant pathophysiological overlap, particularly in metabolic and immune pathways. This study integrates transcriptomic data from publicly available repositories with machine learning algorithms to identify novel biomarkers and construct a predictive model for early-stage NAFLD in MDD patients.

We systematically analyzed transcriptomic data of simple steatosis (SS), nonalcoholic steatohepatitis (NASH), and MDD from the Gene Expression Omnibus (GEO) database to construct and validate a diagnostic model. After removing batch effects, we identified differentially expressed genes (DEGs) between disease and control groups. We then applied weighted gene co-expression network analysis (WGCNA) to identify immune-related genes in SS/NASH patients versus controls. The genes common to the DEGs shared by both conditions and the WGCNA-identified genes were determined and subjected to functional enrichment analysis. Immune cell infiltration levels were quantified using single-sample gene set enrichment analysis (ssGSEA). A predictive model for SS/NASH was developed by evaluating nine machine learning algorithms with 10-fold cross-validation on the datasets.

Fourteen genes strongly linked to both the immune system and the two conditions were identified. Immune cell infiltration profiling revealed distinct immune landscapes in patients versus healthy controls. Moreover, an eight-gene signature was developed, which showed strong diagnostic accuracy in both the training and testing cohorts. Notably, these eight genes were found to correlate with the severity of early-stage NAFLD.

This study established a predictive model for early-stage NAFLD through the integration of bioinformatics and machine learning approaches, with a focus on immune- and MDD-related genes. The eight-gene signature identified in this study represents a novel diagnostic tool for precision medicine.
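The two core steps described in the abstract, intersecting gene sets and scoring a classifier with 10-fold cross-validation, can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline: the gene names are placeholders, the function names are hypothetical, and a nearest-centroid classifier stands in for the nine algorithms, which the abstract does not name.

```python
import random


def shared_candidates(degs_nafld, degs_mdd, wgcna_genes):
    """Genes that are DEGs in both conditions and also WGCNA-selected."""
    return sorted(set(degs_nafld) & set(degs_mdd) & set(wgcna_genes))


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for j, i in enumerate(indices):
            folds[(j + offset) % k].append(i)
        offset += len(indices)
    return folds


def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]


def nearest_centroid_cv(X, y, k=10, seed=0):
    """Mean accuracy of a nearest-centroid classifier over k stratified folds.

    The classifier is only a stand-in; any model could be scored the same way.
    """
    scores = []
    for test_idx in stratified_folds(y, k, seed):
        if not test_idx:
            continue
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        classes = {y[i] for i in train_idx}
        # Centroids are fitted on the training folds only.
        cents = {
            c: centroid([X[i] for i in train_idx if y[i] == c]) for c in classes
        }

        def predict(x):
            return min(
                cents,
                key=lambda c: sum((a - b) ** 2 for a, b in zip(x, cents[c])),
            )

        correct = sum(1 for i in test_idx if predict(X[i]) == y[i])
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Placeholder gene symbols, not genes reported in the study.
    print(shared_candidates(["G1", "G2", "G3"], ["G2", "G3", "G4"], ["G3", "G2", "G5"]))
    # Toy expression matrix: 20 controls and 20 cases, two features each.
    X = [[i * 0.01, 0.0] for i in range(20)] + [[5 + i * 0.01, 1.0] for i in range(20)]
    y = [0] * 20 + [1] * 20
    print(nearest_centroid_cv(X, y, k=10))
```

Stratifying the folds keeps the case-to-control ratio similar in every fold, which matters when cohorts are small, as is common for expression datasets. Any feature selection would also need to be repeated inside each training fold to avoid an optimistic accuracy estimate.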

Keywords: major depressive disorder, nonalcoholic fatty liver disease, simple steatosis, nonalcoholic steatohepatitis, machine learning, weighted gene co-expression network analysis

Received: 17 Mar 2025; Accepted: 18 Jun 2025.

Copyright: © 2025 Jia, Cao, Yin, Li and Wen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Yuyun Jia, Department of Pathology, Nanjing Drum Tower Hospital, Nanjing, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.