ORIGINAL RESEARCH article

Front. Med.
Sec. Precision Medicine
Volume 11 - 2024 | doi: 10.3389/fmed.2024.1407376

Enhanced Feature Selection and Ensemble Learning for Cardiovascular Disease Prediction: Hybrid GOL2-2T and Adaptive Boosted Decision Fusion with Babysitting Refinement

Provisionally Accepted

S Phani Praveen1, Mohammad Kamrul Hasan2, Chan Yeob Yeun3*, Siti Norul Huda Sheikh Abdullah2, Shayla Islam4, Uddagiri Sirish1, N S Koti Mani Kumar Tirumanadham5, Fatima Rayan Awad Ahmed6, Thowiba Elawad Ahmed7, Ayman Afrin Noboni8, Gabriel Avelino Sampedro9, Taher M Ghazal3
  • 1Prasad V. Potluri Siddhartha Institute of Technology, India
  • 2National University of Malaysia, Malaysia
  • 3Khalifa University, United Arab Emirates
  • 4UCSI University, Malaysia
  • 5Sir C.R.Reddy College of Engineering, India
  • 6Prince Sattam Bin Abdulaziz University, Saudi Arabia
  • 7Imam Abdulrahman Bin Faisal University, Saudi Arabia
  • 8Medical College for Women and Hospital, Bangladesh
  • 9University of the Philippines Open University, Philippines

The final, formatted version of the article will be published soon.

Cardiovascular disease is a major global health concern that calls for accurate and efficient diagnostic tools. To help reduce mortality and improve the accuracy of heart disease prediction, this work introduces a novel machine learning approach. To address data-related challenges, the proposed approach uses Multivariate Imputation by Chained Equations (MICE) for missing values, Interquartile Range (IQR) outlier detection, and the Synthetic Minority Over-sampling Technique (SMOTE) for class imbalance. We propose GOL2-2T (2-Tier Grasshopper Optimization with L2 regularization), a new hybrid feature selection method that enhances the feature selection process by combining the Grasshopper Optimization Algorithm (GOA) with L2 regularization. For classification, we use Adaptive Boosted Decision Fusion (ABDF) ensemble learning, refined with a babysitting algorithm. The model achieves 83.0% accuracy and a balanced F1 score of 84.0%, significantly better than previous methods. We evaluate the heart disease prediction method using accuracy, recall, and AUC score, and it performs well on each. This research supports the development of reliable diagnostic methods that allow doctors to detect cardiovascular illness early and treat patients effectively.
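To make two of the preprocessing steps named in the abstract more concrete, the following is a minimal standard-library sketch of IQR outlier filtering and SMOTE-style oversampling. It is not the authors' implementation: the function names (`iqr_bounds`, `drop_outliers`, `smote_like`), the 1.5 fence multiplier, the neighbour count, and the toy data are all assumptions made for illustration, and MICE, GOL2-2T, and ABDF are not covered.

```python
import math
import random
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR.

    k=1.5 is the conventional choice, assumed here; the abstract
    does not state which multiplier was used.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(rows, column, k=1.5):
    """Keep only rows whose value in `column` lies inside the IQR fences."""
    lower, upper = iqr_bounds([row[column] for row in rows], k)
    return [row for row in rows if lower <= row[column] <= upper]


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority samples (simplified SMOTE).

    Each synthetic point is a random interpolation between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority samples.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


if __name__ == "__main__":
    # Hypothetical (age, cholesterol) rows; the last one is an obvious outlier.
    rows = [(50, 190.0), (51, 200.0), (52, 210.0), (53, 220.0), (54, 230.0), (55, 900.0)]
    print(drop_outliers(rows, column=1))

    # Hypothetical minority-class feature vectors.
    minority = [(1.0, 1.0), (2.0, 1.5), (1.5, 3.0)]
    print(smote_like(minority, n_new=4))
```

In this sketch outliers are dropped rather than capped, and oversampling is applied only to the minority class; whether the original pipeline makes the same choices is not stated in the abstract. In practice, oversampling is usually applied to the training split only, so that synthetic samples do not leak into evaluation.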

Keywords: Multivariate imputation by chained equations, Synthetic minority over-sampling technique, Interquartile range, Adaptive Boosted Decision Fusion, cardiovascular disease

Received: 26 Mar 2024; Accepted: 20 May 2024.

Copyright: © 2024 Praveen, Hasan, Yeun, Abdullah, Islam, Sirish, Tirumanadham, Ahmed, Ahmed, Noboni, Sampedro and Ghazal. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Dr. Chan Yeob Yeun, Khalifa University, Abu Dhabi, United Arab Emirates