ORIGINAL RESEARCH article

Front. Cardiovasc. Med.

Sec. Heart Failure and Transplantation

Volume 12 - 2025 | doi: 10.3389/fcvm.2025.1613577

This article is part of the Research Topic: Artificial Intelligence Algorithms and Cardiovascular Disease Risk Assessment

Machine learning-based screening of heart failure using the integrated features of electrocardiogram and phonocardiogram: A multicenter study in China

Provisionally accepted
Junjie Bian1, Kok-Han Chee1, Chengyu Liu2, Hongwei Sun3, Shixi Zhang4, Peili Chen5, Hua-Nong Ting1*
  • 1University of Malaya, Kuala Lumpur, Malaysia
  • 2Southeast University, Nanjing, Jiangsu Province, China
  • 3Hefei BOE Hospital, Hefei, Anhui Province, China
  • 4Shangqiu Municipal Hospital, Shangqiu, China
  • 5First People's Hospital of Shangqiu, Shangqiu, Henan Province, China

The final, formatted version of the article will be published soon.

Background: Heart failure (HF) is a major health concern associated with poor prognosis, and there is an urgent clinical need for an easy and accurate HF screening method. This multicenter study aims to validate a novel AI-based phono-electrocardiogram algorithm (AI-PECG) for early HF detection.

Methods: A total of 1,017 individuals were divided into a training cohort and an external validation cohort at a ratio of 8:2. Within the training cohort, patient data were further split randomly into a training set and a test set, again at a ratio of 8:2. The least absolute shrinkage and selection operator (LASSO) with five-fold cross-validation was used for dimensionality reduction and to select features for model construction from clinical variables, phonocardiogram (PCG) parameters, and electrocardiogram (ECG) parameters. Five machine learning (ML) algorithms were then compared to identify the classifier that best recognized HF: logistic regression, random forest, eXtreme Gradient Boosting, Category Boosting (CatBoost), and Naive Bayes. The importance ranking of the predictive factors in the final screening model was calculated using SHapley Additive exPlanations analysis.

Results: Among eligible participants, 302 reported HF. A total of 17 variables were selected to construct the screening models. In the training set, the area under the curve (AUC) of the CatBoost model was 0.998 (95% confidence interval (CI): 0.996-1.000), higher than that of the other ML models. The sensitivity and specificity of the CatBoost model were 0.989 (95% CI: 0.978-0.996) and 0.989 (95% CI: 0.979-0.999), respectively. In the screening model, the top five factors by importance were EMAT, lymphocyte, LVST, CRP, and platelet.

Conclusion: The ML model incorporating general data alongside ECG and PCG features
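
For readers less familiar with the metrics reported in the abstract, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed for a binary HF classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the seven toy labels and scores are all assumptions made for illustration and are unrelated to the study data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels where 1 = HF, 0 = no HF."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 3 HF cases and 4 controls.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5; the abstract does not state the one used.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three.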

Keywords: heart failure, phonocardiogram, electrocardiogram, machine learning, prediction model

Received: 17 Apr 2025; Accepted: 21 Oct 2025.

Copyright: © 2025 Bian, Chee, Liu, Sun, Zhang, Chen and Ting. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Hua-Nong Ting, tinghn@outlook.com

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.