AUTHOR=Tilwani Deepa, O'Reilly Christian, Riccardi Nicholas, Shalin Valerie L., den Ouden Dirk-Bart, Fridriksson Julius, Shinkareva Svetlana V., Sheth Amit P., Desai Rutvik H.
TITLE=Benchmarking machine learning models in lesion-symptom mapping for predicting language outcomes in stroke survivors
JOURNAL=Frontiers in Neuroimaging
VOLUME=Volume 4 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/neuroimaging/articles/10.3389/fnimg.2025.1573816
DOI=10.3389/fnimg.2025.1573816
ISSN=2813-1193
ABSTRACT=Several decades of research have investigated the neural connections between stroke-induced brain damage and language difficulties. Typically, lesion-symptom mapping (LSM) studies that address this connection have relied on mass univariate statistics, which do not account for multidimensional relationships between variables. Machine learning (ML) techniques, which can capture these intricate connections, offer a promising complement to LSM methods. To test this promise, we benchmarked ML models on structural and functional MRI to predict aphasia severity (N = 238) and naming impairment (N = 191) for a cohort of chronic-stage stroke survivors. We used nested cross-validation to examine performance along three dimensions: (1) parcellation schemes (JHU, AAL, BRO, and AICHA atlases), (2) neuroimaging modalities (resting-state functional connectivity, structural connectivity, mean diffusivity, fractional anisotropy, and lesion location) and (3) ML methods (Random Forest, Support Vector Regression, Decision Tree, K Nearest Neighbors, and Gradient Boosting). The best results were obtained by combining the JHU atlas, lesion location, and the Random Forest model. This combination yielded moderate to high correlations with the two different behavioral scores. Key regions identified included several perisylvian areas and pathways within the language network. This work complements existing LSM methods with new tools for improving the prediction of language outcomes in stroke survivors.
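
The nested cross-validation named in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. The data are synthetic random features standing in for per-region imaging values, the model is a hand-written K Nearest Neighbors regressor (one of the five method families listed), and every function name, fold count, and candidate `k` value is an assumption chosen for illustration. The inner loop picks a hyperparameter using training folds only. The outer loop scores the tuned model on participants it never saw, here with a Pearson correlation between predicted and observed scores.

```python
import math
import random


def k_folds(indices, n_folds):
    """Split a list of indices into n_folds disjoint, roughly equal folds."""
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train_X, train_y, x, k):
    """Predict a score for x as the mean target of its k nearest training rows."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def mean_squared_error(X, y, fit_idx, eval_idx, k):
    """Fit on fit_idx (KNN only stores the rows) and score on eval_idx."""
    fit_X = [X[i] for i in fit_idx]
    fit_y = [y[i] for i in fit_idx]
    errors = [(knn_predict(fit_X, fit_y, X[i], k) - y[i]) ** 2 for i in eval_idx]
    return sum(errors) / len(errors)


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def nested_cv(X, y, candidate_ks=(1, 3, 5, 9), outer_folds=5, inner_folds=3, seed=0):
    """Return (correlation of out-of-fold predictions with y, k chosen per outer fold)."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    predictions = {}
    chosen = []
    for test_idx in k_folds(indices, outer_folds):
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        inner = k_folds(train_idx, inner_folds)

        def inner_error(k):
            # Inner loop: validation error averaged over folds of the training set.
            total = 0.0
            for val_idx in inner:
                val = set(val_idx)
                fit_idx = [i for i in train_idx if i not in val]
                total += mean_squared_error(X, y, fit_idx, val_idx, k)
            return total / inner_folds

        best_k = min(candidate_ks, key=inner_error)
        chosen.append(best_k)
        # Outer loop: refit on the whole outer training set, predict held-out rows.
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            predictions[i] = knn_predict(train_X, train_y, X[i], best_k)
    ordered = [predictions[i] for i in range(len(X))]
    return pearson(ordered, y), chosen


def make_synthetic(n=60, n_features=4, seed=1):
    """Synthetic stand-in data: the score depends on the first two features plus noise."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(n_features)] for _ in range(n)]
    y = [10 * row[0] + 5 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    r, chosen = nested_cv(X, y)
    print(f"out-of-fold Pearson r = {r:.2f}, k per outer fold = {chosen}")
```

Because the hyperparameter is selected without ever touching the outer test fold, the reported correlation is not inflated by tuning. Under the same assumptions, comparing atlases, modalities, or model families would amount to repeating this procedure once per combination and comparing the outer-loop scores.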