AUTHOR=Mai Peipei, Huo Huanhuan, Li Xiaona, Zhou Dingwen, He Fang, Li Yongxin, Wang Hua
TITLE=A new integrated machine learning model: application to improve the accuracy of predicting left atrial appendage thrombus in patients with non-valvular atrial fibrillation
JOURNAL=Frontiers in Medicine
VOLUME=Volume 12 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2025.1661696
DOI=10.3389/fmed.2025.1661696
ISSN=2296-858X
ABSTRACT=Background: Non-valvular atrial fibrillation (NVAF) and atrial flutter are significant contributors to left atrial appendage thrombus (LAAT) formation. This study explores the potential of machine learning (ML) models integrating transthoracic echocardiography (TTE) and clinical data for non-invasive LAAT detection and risk assessment. Methods: A total of 698 patients with NVAF were recruited from Luoyang Central Hospital between January 2021 and May 2024, including 558 patients for retrospective analysis and 140 for prospective validation. Based on transesophageal echocardiography (TEE) results, patients were categorized into three groups: non-thrombotic, blood stasis, and thrombotic. Four ML algorithms, namely Random Forest, Logistic Regression (LR), Support Vector Machine, and XGBoost, were developed using TTE and clinical data to predict LAAT. Results: Univariate analysis identified significant predictors of LAAT, including permanent AF, heart failure, BNP, uric acid, D-dimer, mitral regurgitation, LVEF, LVED, LAD, CHA₂DS₂-VASc score, and LAA velocity (p < 0.05). The combined TTE data model outperformed independent TTE-based models but was slightly less accurate than the TEE model. Among ML algorithms, the LR model demonstrated the best performance, achieving an area under the curve (AUC) of 80.9% in the test set and 78.7% in prospective validation for the thrombotic state group.
For the thrombotic group, the LR model achieved an AUC of 80.0%, closely approaching the TEE model’s 84.0%. Conclusion: The LR model provides a reliable non-invasive approach for LAAT screening in high-risk AF patients by integrating TTE features with clinical data, potentially reducing reliance on TEE.