AUTHOR=Lv Jiankang, Wang Xueru, Qin Wei TITLE=Integrated bioinformatics and machine learning identify S100A9 and VGLL1 as hub genes for schizophrenia JOURNAL=Frontiers in Psychiatry VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2025.1621219 DOI=10.3389/fpsyt.2025.1621219 ISSN=1664-0640 ABSTRACT=Background: Schizophrenia (SCZ) is a debilitating neuropsychiatric disorder with unclear etiology, involving complex interactions between genetic and environmental factors. Current diagnostic methods rely on subjective clinical assessments, and existing treatments often fail to address cognitive and negative symptoms adequately. Identifying key biomarkers for SCZ is crucial for improving diagnosis and developing targeted therapies. Methods: This study integrated bioinformatics analysis and machine learning approaches to identify potential biomarkers for SCZ. Transcriptomic data from five independent cohorts were obtained from the GEO database. Differential expression analysis and Robust Rank Aggregation (RRA) were used to identify significant differentially expressed genes (DEGs). Protein-protein interaction (PPI) network analysis, Least Absolute Shrinkage and Selection Operator (Lasso) regression, and Random Forest (RF) were employed to screen for hub genes. The diagnostic model was constructed using logistic regression. The receiver operating characteristic (ROC) curve was used to evaluate the diagnostic accuracy of the model, and nomograms and calibration curves were constructed to assess its clinical applicability. Functional enrichment analyses and single-sample Gene Set Enrichment Analysis (ssGSEA) were conducted to explore the underlying mechanisms of the identified hub genes. Results: S100A9 and VGLL1 were identified as potential diagnostic biomarkers for SCZ. The diagnostic model demonstrated robust diagnostic performance in the training cohorts (AUC = 0.806) and in the external validation cohorts (AUC = 0.702, 0.666, and 0.739). 
Functional enrichment analyses revealed that DEGs related to VGLL1 and S100A9 were primarily involved in immune system regulation and in signaling pathways such as the PI3K-Akt signaling pathway. ssGSEA showed significant increases in the infiltration levels of five immune cell types (CD56bright natural killer cells, MDSCs, mast cells, natural killer cells, and plasmacytoid dendritic cells) in SCZ patients, with strong positive correlations between S100A9 and the infiltration of these immune cells. Conclusion: Our study identified S100A9 and VGLL1 as potential biomarkers for SCZ, highlighting their roles in immune regulation. These findings provide new insights into the pathogenesis of SCZ and suggest potential diagnostic targets.
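
As a reading aid for the AUC values reported in the abstract: the area under the ROC curve equals the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The minimal Python sketch below computes it that way. The function name `roc_auc` and the toy labels and scores are hypothetical and made up for illustration; they are not the study's data or code.

```python
from itertools import product


def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (case, control) pairs in which
    the case has the higher score; tied pairs count as half a win."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score, control_score in product(cases, controls):
        if case_score > control_score:
            wins += 1.0
        elif case_score == control_score:
            wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up example: 1 = SCZ, 0 = control; scores stand in for model probabilities.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.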