ORIGINAL RESEARCH article
Front. Med.
Sec. Precision Medicine
This article is part of the Research Topic: Smart Prevention and Precision Care: Machine Learning in Cardiometabolic and Oncologic Diseases
Padding Interpolation, Median Imputation, RobustScalar, and Particle Swarm Optimization with Heterogeneous Classifiers: A Robust Combination for Effective Heart Disease Diagnosis
Provisionally accepted
- 1Graphic Era Deemed to be University, Dehradun, India
- 2Indian Institute of Technology Mandi, Mandi, India
- 3Sant Longowal Institute of Engineering and Technology, Longowal, India
- 4Chitkara University Chitkara Design School, Rajpura, India
- 5Chandigarh University, Sahibzada Ajit Singh Nagar, India
- 6King Khalid University, Abha, Saudi Arabia
- 7Princess Nourah bint Abdulrahman University Central Library, Riyadh, Saudi Arabia
- 8Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
Heart disease (HD) remains a leading global cause of mortality, underscoring the need for early and accurate diagnosis. Machine learning (ML) offers promising avenues for automated diagnosis, but many existing models suffer from data quality issues, suboptimal feature selection, and a lack of robustness. This study introduces a diagnostic framework that combines data preprocessing with an improved hybrid optimization-classification model. To ensure data integrity, the framework applies Padding Interpolation to handle missing values, Median Imputation to correct outliers, and RobustScalar to scale features. A key novelty is an Improved Particle Swarm Optimization (IPSO) algorithm, which integrates a dynamic inertia weight and a mutation operator to enhance global search capability and prevent premature convergence. IPSO serves two purposes: selecting the optimal feature subset and tuning the hyperparameters of five heterogeneous classifiers (LR, LDA, GNB, SVC, XGBoost). Experiments were conducted on a composite dataset drawn from five public repositories. At a 90:10 train-test ratio, the proposed Model 5 (IPSO-XGBoost) performed best, with an accuracy of 91.30%, sensitivity of 88.37%, specificity of 93.88%, precision of 92.68%, F1-score of 90.48%, and a Diagnostic Odds Ratio (DOR) of 116.53. Statistical significance tests (p < 0.05) indicated that the performance improvements over baseline models and existing methods are not due to random chance. The model also performed consistently on independent Cleveland and Statlog validation datasets, supporting its generalizability. This work establishes a comprehensive and robust pipeline for heart disease diagnosis.
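The reported metrics for Model 5 can be cross-checked against one another, because all six follow from a single confusion matrix. The sketch below is illustrative only: the abstract does not report the confusion matrix, so the counts used here (38 true positives, 5 false negatives, 46 true negatives, 3 false positives) are an assumption, chosen because they reproduce the reported percentages; the function name `diagnostic_metrics` is likewise hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute standard binary diagnostic metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    precision = tp / (tp + fp)              # positive predictive value
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Diagnostic Odds Ratio: odds of a positive test in diseased patients
    # divided by the odds of a positive test in healthy patients.
    # It is undefined when fp or fn is zero; report infinity in that case.
    dor = (tp * tn) / (fp * fn) if fp * fn else float("inf")
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "dor": dor,
    }


# ASSUMED counts (not given in the abstract): a 92-sample test split
# with 43 diseased and 49 healthy cases.
metrics = diagnostic_metrics(tp=38, fn=5, tn=46, fp=3)

for name, value in metrics.items():
    if name == "dor":
        print(f"{name}: {value:.2f}")
    else:
        print(f"{name}: {value * 100:.2f}%")
```

Under these assumed counts, the DOR is (38 × 46) / (3 × 5) ≈ 116.53, which matches the reported value and shows how strongly the ratio depends on the small number of misclassified cases in the denominator.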
Keywords: machine learning classification, heart disease, particle swarm optimization, interpolation, imputation
Received: 09 Oct 2025; Accepted: 08 Dec 2025.
Copyright: © 2025 Dhanka, Kumar, Maini, Kumar, Singh, Khan, Abbas and Ksibi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Mudassir Khan
Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
