ORIGINAL RESEARCH article

Front. Public Health

Sec. Environmental Health and Exposome

Volume 13 - 2025 | doi: 10.3389/fpubh.2025.1582779

Developing Machine Learning Models for Predicting Cardiovascular Disease Survival Based on Heavy Metal Serum and Urine Levels

Provisionally accepted
Hui Jin1, Ya Xu2*, Ling Zhang3, Man Luo4*, Ya Xu2*
  • 1West China School of Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
  • 2Nanjing Jiangbei Hospital, Affiliated Nanjing Jiangbei Hospital of Xinglin College, Nantong University, Jiangsu, China
  • 3Huai’an No. 3 People's Hospital, Huaian, China
  • 4Huai’an TCM Hospital Affiliated to Nanjing University of Chinese Medicine, Jiangsu, China

The final, formatted version of the article will be published soon.

Environmental exposure to heavy metals, such as arsenic, cadmium, and lead, is a known risk factor for cardiovascular diseases. We aimed to examine the associations between heavy metal exposure and mortality in patients with cardiovascular diseases. We analyzed data from NHANES 2003-2018, including urine and blood metal concentrations from 4924 participants. Five machine learning models (CoxPHSurvival, FastKernelSurvivalSVM, GradientBoostingSurvival, RandomSurvivalForest, and ExtraSurvivalTrees) were used to predict cardiovascular mortality. Model performance was assessed with the concordance index (C-index), integrated Brier score, time-dependent AUC, and calibration curves. SHAP analysis was conducted using a reduced background dataset created via K-means clustering. GradientBoostingSurvival (GBS) showed the best performance for hypertension (C-index: 0.780, mean AUC: 0.798). RandomSurvivalForest (RSF) was the top model for coronary heart disease (C-index: 0.592, mean AUC: 0.626) and myocardial infarction (C-index: 0.705, mean AUC: 0.743), while CoxPHSurvival performed best for heart failure (C-index: 0.642, mean AUC: 0.672) and stroke (C-index: 0.658, mean AUC: 0.691). ExtraSurvivalTrees performed best for angina (C-index: 0.652, mean AUC: 0.669). Calibration curves confirmed the models' accuracy. SHAP analysis identified age as the most influential factor, with heavy metals such as lead, cadmium, and thallium contributing significantly to risk. A user-friendly web calculator was developed for individualized survival predictions. Machine learning models, including GradientBoostingSurvival, RandomSurvivalForest, CoxPHSurvival, and ExtraSurvivalTrees, demonstrated strong performance in predicting mortality risk for various cardiovascular diseases. Key metals were identified as significant risk factors in cardiovascular risk assessment.
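For readers unfamiliar with the concordance index reported above, the following is a minimal pure-Python sketch of Harrell's C-index on right-censored data. It is an illustration only, not the authors' evaluation code: the function name `concordance_index`, the toy data, and the choice to skip pairs with tied follow-up times are all assumptions made for this example. A pair of subjects is treated as comparable when the subject with the shorter follow-up experienced the event; the pair is concordant when that subject also received the higher predicted risk.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative).

    times       -- follow-up time for each subject
    events      -- True if the event (death) was observed, False if censored
    risk_scores -- model output where a higher value means higher risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplifying assumption: pairs with tied times are skipped
        # `a` is the subject with the shorter follow-up, `b` the longer one
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # earlier subject was censored, so the ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death received the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five subjects, two of them censored.
times = [5, 8, 12, 3, 10]
events = [True, False, True, True, False]
risk = [0.9, 0.4, 0.2, 0.8, 0.3]

c_index = concordance_index(times, events, risk)
print(round(c_index, 3))
```

In this toy example 7 pairs are comparable and 6 of them are concordant, so the C-index is 6/7. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported values between 0.592 and 0.780.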

Keywords: interpretable machine learning, heavy metals, cardiovascular disease, mortality, SHAP

Received: 24 Feb 2025; Accepted: 30 Apr 2025.

Copyright: © 2025 Jin, Xu, Zhang, Luo and Xu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Ya Xu, Nanjing Jiangbei Hospital, Affiliated Nanjing Jiangbei Hospital of Xinglin College, Nantong University, Jiangsu, China
Man Luo, Huai’an TCM Hospital Affiliated to Nanjing University of Chinese Medicine, Jiangsu, China

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.