ORIGINAL RESEARCH article

Front. Big Data

Sec. Machine Learning and Artificial Intelligence

Volume 8 - 2025 | doi: 10.3389/fdata.2025.1634133

This article is part of the Research Topic: Advanced Machine Learning Techniques for Single or Multi-Modal Information Processing

Basrah Score: A Novel Machine Learning-Based Score for Differentiating Iron Deficiency Anemia and Beta Thalassemia Trait Using RBC Indices

Provisionally accepted
Salma Mahmood1*, Asaad Abdulameer Khalaf2, Saad S Hamadi1
  • 1University of Basrah, Basra, Iraq
  • 2Basrah Oncology and Hematology Center, Basrah, Iraq

The final, formatted version of the article will be published soon.

Iron deficiency anemia (IDA) and beta-thalassemia trait (BTT) are prevalent causes of microcytic anemia. Their overlapping hematological features pose diagnostic challenges, while both conditions call for prompt and precise management. Traditional discrimination indices, such as the Mentzer Index, Ihsan's formula, and the England and Fraser criteria, have been extensively applied in both research and clinical settings; however, their diagnostic performance varies considerably across populations and datasets. This study proposes a novel and interpretable diagnostic model, the Basrah Score, developed using Elastic Net Logistic Regression (ENLR). This machine learning-based approach yields a flexible discrimination function that adapts to variations in clinical and environmental factors. The model was trained and validated on a local dataset of 2,120 individuals (1,080 with IDA and 1,040 with BTT) and was benchmarked against eight conventional indices. The Basrah Score demonstrated superior diagnostic performance, with an accuracy of 96.7%, a sensitivity of 95.0%, and a specificity of 98.6%. These results underscore the importance of incorporating advanced preprocessing techniques, class balancing, hyperparameter optimization, and rigorous cross-validation to ensure the robustness of diagnostic models. Overall, this research highlights the potential of integrating interpretable machine learning models with established clinical parameters to improve diagnostic accuracy in hematological disorders, particularly in resource-constrained settings.
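For orientation, the sketch below illustrates one of the conventional indices named in the abstract, the Mentzer Index (MCV divided by RBC count, with values below 13 conventionally suggesting BTT), together with the three performance measures reported. It is not the Basrah Score, whose coefficients are not given here. The function names, the sample values, and the choice of BTT as the positive class are assumptions made purely for illustration.

```python
import math


def mentzer_index(mcv_fl: float, rbc_millions_per_ul: float) -> float:
    """Mentzer Index = MCV (fL) / RBC count (10^6 cells per microliter)."""
    if rbc_millions_per_ul <= 0:
        raise ValueError("RBC count must be positive")
    return mcv_fl / rbc_millions_per_ul


def mentzer_label(mcv_fl: float, rbc_millions_per_ul: float, cutoff: float = 13.0) -> str:
    """Conventional reading: index below the cutoff suggests BTT, otherwise IDA."""
    return "BTT" if mentzer_index(mcv_fl, rbc_millions_per_ul) < cutoff else "IDA"


def diagnostic_metrics(y_true, y_pred, positive="BTT"):
    """Accuracy, sensitivity and specificity from paired labels.

    Assumption: BTT is treated as the positive class; the abstract does not
    say which class the authors used.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Return NaN rather than raising when a class is absent.
        return num / den if den else math.nan

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Hypothetical (MCV in fL, RBC in 10^6/uL, true label) rows, NOT study data.
samples = [
    (62.0, 5.8, "BTT"),
    (65.0, 5.5, "BTT"),
    (70.0, 5.2, "BTT"),
    (72.0, 4.1, "IDA"),
    (68.0, 3.9, "IDA"),
    (66.0, 5.3, "IDA"),
]
truth = [label for _, _, label in samples]
predicted = [mentzer_label(mcv, rbc) for mcv, rbc, _ in samples]
print(diagnostic_metrics(truth, predicted))
```

A fixed cutoff like this is what the abstract contrasts with a learned discrimination function: the threshold cannot adapt to a given population, which is one plausible reason the conventional indices perform differently across datasets.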

Keywords: Beta Thalassemia, iron deficiency anemia, Elastic Net Logistic Regression (ENLR), Machine learning Discrimination Indices, hematological parameters

Received: 23 May 2025; Accepted: 14 Jul 2025.

Copyright: © 2025 Mahmood, Khalaf and Hamadi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Salma Mahmood, University of Basrah, Basra, Iraq

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.