AUTHOR=Wang Liuding, Shi Jingzi, Miao Lina, Chen Yifan, Wei Jingjing, Jia Min, Gong Zhiyi, Yang Ze, Lyu Jian, Zhang Yunling, Liang Xiao TITLE=Predicting new-onset stroke with machine learning: development of a model integrating traditional Chinese and western medicine JOURNAL=Frontiers in Pharmacology VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/pharmacology/articles/10.3389/fphar.2025.1546878 DOI=10.3389/fphar.2025.1546878 ISSN=1663-9812 ABSTRACT=Introduction: The integration of traditional Chinese medicine (TCM) and Western medicine has demonstrated effectiveness in the primary prevention of stroke. This study therefore aims to use TCM syndromes alongside conventional risk factors as predictive variables to construct a machine learning model for assessing the risk of new-onset stroke. Methods: We conducted a ten-year follow-up study encompassing 4,511 participants from multiple Chinese community hospitals. The dependent variable was the occurrence of new-onset stroke; the independent variables were age, gender, systolic blood pressure (SBP), diabetes, blood lipids, carotid atherosclerosis, smoking status, and TCM syndromes. We developed the models using XGBoost in conjunction with SHapley Additive exPlanations (SHAP) for interpretability, and logistic regression with a nomogram for clinical application. Results: A total of 1,783 individuals were included in the analysis (1,248 in the training set and 535 in the validation set), with 110 patients diagnosed with new-onset stroke. The logistic model achieved an AUC of 0.746 (95% CI: 0.719–0.774) in the training set and 0.658 (95% CI: 0.572–0.745) in the validation set. The XGBoost model achieved an AUC of 0.811 (95% CI: 0.788–0.834) in the training set and 0.628 (95% CI: 0.537–0.719) in the validation set.
SHAP analysis showed that elevated SBP, Fire syndrome in TCM, and carotid atherosclerosis were the three most important features for predicting new-onset stroke. Conclusion: Given identical traditional risk factors, Chinese residents with Fire syndrome may have a higher risk of new-onset stroke. In populations at high risk of stroke, screening for and management of hypertension, Fire syndrome, and carotid atherosclerosis should be prioritized. However, future high-performance TCM predictive models will require larger and more objective datasets for optimization.
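The AUC values and 95% confidence intervals reported in the Results can be illustrated with a minimal sketch. The abstract does not state how the intervals were computed, so the percentile bootstrap used here is an assumption, not the authors' method. The function names `auc` and `bootstrap_auc_ci` and the toy labels and scores are hypothetical and are not taken from the study data.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC.
    ASSUMPTION: the source does not name its CI method; this is one
    common choice, shown for illustration only."""
    auc(labels, scores)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples containing a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical toy data: 1 = new-onset stroke, 0 = no stroke.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]

if __name__ == "__main__":
    point = auc(toy_labels, toy_scores)
    lower, upper = bootstrap_auc_ci(toy_labels, toy_scores)
    print(f"AUC = {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

With few events, as in the validation set here, resampled AUCs vary widely, which is consistent with the wider validation-set intervals reported above.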