ORIGINAL RESEARCH article
Front. Psychiatry
Sec. Computational Psychiatry
Volume 16 - 2025 | doi: 10.3389/fpsyt.2025.1630922
This article is part of the Research Topic: Machine Learning Algorithms and Software Tools for Early Detection and Prognosis of Schizophrenia
Diagnosing Schizophrenia with Routine Blood Tests: A Comparative Analysis of Machine Learning Algorithms
Provisionally accepted
- 1 Serdivan State Hospital, Sakarya, Türkiye
- 2 Hitit University Çorum Erol Olçok Training and Research Hospital, Çorum, Türkiye
- 3 Faculty of Computer and Information Sciences, Sakarya University, Sakarya, Türkiye
- 4 Tarsus Devlet Hastanesi, Mersin, Türkiye
Schizophrenia is a severe mental disorder with a prevalence of approximately 1% in the general population, and it is diagnosed primarily on clinical criteria. The absence of objective diagnostic methods and reliable biomarkers presents significant challenges for accurate diagnosis and effective treatment. Recently, peripheral blood biomarkers have gained attention for their potential to clarify the biological basis of schizophrenia. Concurrently, machine learning algorithms are increasingly used to analyze high-dimensional data to improve diagnostic accuracy and predict treatment response. This retrospective case-control study aimed to develop a practical, cost-effective, and high-performance classification model that differentiates schizophrenia patients from healthy individuals using routine hematological and biochemical parameters. Conducted at a tertiary care hospital, the study included 203 schizophrenia patients treated within five years and 192 age- and sex-matched healthy controls. Demographic data and routine blood parameters (complete blood count and biochemical panels) were extracted from medical records. Variables with more than 85% missing data were excluded, and the remaining missing values were imputed after train-test splitting to prevent data leakage. Optimal biomarker subsets were identified using the Grey Wolf Optimization (GWO) algorithm. Subsequently, five machine learning models, namely Random Forest (RF), XGBoost, Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Logistic Regression (LR), were trained and evaluated via stratified 10-fold cross-validation. Groups were homogeneous regarding age and sex distribution. Prior to GWO optimization, the highest accuracies were obtained by XGBoost (95.55%) and Random Forest (94.63%). Post-optimization, both models maintained robust performance: Random Forest improved in accuracy (94.95%) and recall (96.25%), whereas XGBoost achieved the highest accuracy (95.90%) and strong specificity (95.54%).
The highest Area Under the Curve (AUC) values post-optimization were achieved by XGBoost (0.96) and Random Forest (0.95), indicating strong diagnostic capability. Key biomarkers distinguishing schizophrenia included total protein, glucose, iron, creatine kinase, total bilirubin, uric acid, calcium, and sodium, with schizophrenia patients showing notably lower glucose levels compared to controls, contrary to typical findings. Differences in triglycerides, liver enzymes, sodium, and potassium lacked clear clinical significance. The developed models achieved diagnostic accuracy comparable to studies using more costly biomarkers, highlighting their potential clinical practicality and economic advantages. External validation is recommended to confirm the generalizability of these results.
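To make the feature-selection step concrete, the following is a minimal sketch of Grey Wolf Optimization applied to choosing a feature subset. The abstract does not state which GWO variant, binarization rule, or fitness function the authors used, so everything here is an assumption for illustration: the canonical continuous GWO update, a 0.5 threshold to turn positions into a boolean mask, and a hypothetical `toy_fitness` with a made-up set of informative indices. In a real pipeline the fitness would more plausibly be a classifier's cross-validated accuracy computed on training data only.

```python
import random


def gwo_feature_selection(fitness, n_features, n_wolves=10, n_iter=30, seed=0):
    """Maximize fitness(mask) over boolean feature masks with a basic GWO loop."""
    rng = random.Random(seed)

    def to_mask(position):
        # Assumption: a feature is selected when its coordinate exceeds 0.5.
        return [p > 0.5 for p in position]

    # Each wolf is a point in [0, 1]^n_features.
    wolves = [[rng.random() for _ in range(n_features)] for _ in range(n_wolves)]
    best_mask, best_score = None, float("-inf")

    for t in range(n_iter + 1):
        # Rank the pack; the three fittest wolves (alpha, beta, delta) lead.
        ranked = sorted(wolves, key=lambda w: fitness(to_mask(w)), reverse=True)
        alpha, beta, delta = (list(w) for w in ranked[:3])  # copies: pack mutates below
        alpha_score = fitness(to_mask(alpha))
        if alpha_score > best_score:
            best_mask, best_score = to_mask(alpha), alpha_score
        if t == n_iter:
            break  # the extra pass only evaluates the final positions

        a = 2.0 * (1 - t / n_iter)  # shrinks linearly from 2 toward 0
        for w in wolves:
            for j in range(n_features):
                candidates = []
                for leader in (alpha, beta, delta):
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[j] - w[j])
                    candidates.append(leader[j] - A * D)
                # Average the three leader-guided moves and clip to [0, 1].
                w[j] = min(1.0, max(0.0, sum(candidates) / 3))

    return best_mask, best_score


# Hypothetical stand-in for a real fitness: reward three "informative"
# feature indices (not taken from the study) and penalize subset size.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for j, keep in enumerate(mask) if keep and j in INFORMATIVE)
    return hits - 0.1 * sum(mask)


mask, score = gwo_feature_selection(toy_fitness, n_features=8)
print(mask, round(score, 2))
```

The size penalty in `toy_fitness` is one common way to push the search toward compact subsets; whether the study used a comparable penalty is not stated in the abstract.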
Keywords: schizophrenia, machine learning, biomarkers, Grey Wolf Optimization (GWO), blood parameters
Received: 19 May 2025; Accepted: 26 Jul 2025.
Copyright: © 2025 Ogur, Erdogan Kaya, Oğur and Erdogan Akturk. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Nur Banu Oğur, Faculty of Computer and Information Sciences, Sakarya University, Sakarya, Türkiye
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.