AUTHOR=Zibibula Yierpan, Tayier Gulifeire, Maimaiti Aierpati, Liu Tianze, Lu Jinshuai
TITLE=Machine learning approaches to identify the link between heavy metal exposure and ischemic stroke using the US NHANES data from 2003 to 2018
JOURNAL=Frontiers in Public Health
VOLUME=Volume 12 - 2024
YEAR=2024
URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2024.1388257
DOI=10.3389/fpubh.2024.1388257
ISSN=2296-2565
ABSTRACT=Purpose: There is limited understanding of the link between exposure to heavy metals and ischemic stroke (IS). This research aimed to develop efficient and interpretable machine learning (ML) models to characterize the relationship between heavy metal exposure and IS. Methods: The data for this research were obtained from the National Health and Nutrition Examination Survey (US NHANES, 2003-2018) database. Seven ML models were used to identify IS from heavy metal exposure. Ten-fold cross-validation, the area under the curve (AUC), F1 scores, Brier scores, the Matthews correlation coefficient (MCC), precision-recall curves, and decision curve analysis (DCA) curves were used to evaluate the models, and the best-performing models were then selected. Finally, the DALEX package was used for feature explanation and decision-making visualization. Results: A total of 15,575 participants were included in this study. The best-performing ML models, logistic regression (AUC: 0.796) and XGBoost (AUC: 0.789), were selected. The DALEX analysis revealed that age, total mercury in blood, PIR, and cadmium contributed most to IS in the logistic regression and XGBoost models. Conclusion: The logistic regression and XGBoost models showed high efficiency, accuracy, and robustness in identifying associations between heavy metal exposures and IS in NHANES 2003-2018 participants in the United States.
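
The evaluation metrics named in the abstract (AUC, F1 score, Brier score, MCC) can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline: the labels and predicted probabilities below are made-up toy values rather than NHANES data, and the function names and the 0.5 decision threshold are assumptions chosen only for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(y_true, y_pred):
    """MCC; returns 0.0 when any marginal count is zero."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = stroke case, 0 = no stroke.
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.2, 0.9]
# Assumed 0.5 threshold to turn probabilities into class predictions.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

if __name__ == "__main__":
    print("AUC  :", round(roc_auc(y_true, y_prob), 3))
    print("F1   :", round(f1_score(y_true, y_pred), 3))
    print("Brier:", round(brier_score(y_true, y_prob), 3))
    print("MCC  :", round(matthews_corrcoef(y_true, y_pred), 3))
```

AUC and the Brier score are computed from the predicted probabilities, while F1 and MCC depend on the chosen threshold, which is one reason a study would report several of these metrics together.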