AUTHOR=Tonneau Marion , Phan Kim , Manem Venkata S. K. , Low-Kam Cecile , Dutil Francis , Kazandjian Suzanne , Vanderweyen Davy , Panasci Justin , Malo Julie , Coulombe François , Gagné Andréanne , Elkrief Arielle , Belkaïd Wiam , Di Jorio Lisa , Orain Michele , Bouchard Nicole , Muanza Thierry , Rybicki Frank J. , Kafi Kam , Huntsman David , Joubert Philippe , Chandelier Florent , Routy Bertrand TITLE=Generalization optimizing machine learning to improve CT scan radiomics and assess immune checkpoint inhibitors’ response in non-small cell lung cancer: a multicenter cohort study JOURNAL=Frontiers in Oncology VOLUME=Volume 13 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2023.1196414 DOI=10.3389/fonc.2023.1196414 ISSN=2234-943X ABSTRACT=Background: Recent developments in artificial intelligence suggest that radiomics may represent a promising non-invasive biomarker to predict response to immune checkpoint inhibitors (ICI). Nevertheless, validation of radiomics algorithms in independent cohorts remains a challenge due to variations in image acquisition and reconstruction. Using radiomics, we investigated the importance of scan normalization as part of a broader machine learning framework to enable model external generalizability to predict ICI response in non-small cell lung cancer (NSCLC) patients across different centers. Methods: Radiomics features were extracted from the pre-ICI scans of 642 advanced NSCLC patients and compared, using the established open-source PyRadiomics and a proprietary DeepRadiomics deep learning technology. The population was separated into two groups: a discovery cohort of 512 NSCLC patients from three academic centers, and a validation cohort of 130 NSCLC patients from a fourth center. We harmonized images to account for variations in reconstruction kernel, slice thickness, and device manufacturer. 
Multivariable models, evaluated using cross-validation, were used to estimate the predictive value of clinical variables, PD-L1 expression, and PyRadiomics or DeepRadiomics for progression-free survival at 6 months (PFS-6). Results: Excluding radiomics features, the best predictor of PFS-6 was the combination of clinical + PD-L1 expression (AUC=0.66 in the discovery cohort and 0.62 in the validation cohort). Without image harmonization, combining clinical + PyRadiomics or DeepRadiomics yielded AUC=0.69 and 0.69, respectively, in the discovery cohort, but these dropped to 0.57 and 0.52 in the validation cohort. This lack of generalizability was consistent with principal component analysis, which showed clusters defined by CT scan parameters. Image harmonization subsequently eliminated these clusters. The clinical + DeepRadiomics model reached AUC=0.67 and 0.63 in the discovery and validation cohorts, respectively. Conversely, the clinical + PyRadiomics model failed to generalize, with AUC=0.66 and 0.59. Conclusion: We demonstrated that a risk prediction model combining clinical + DeepRadiomics was generalizable following CT scan harmonization and machine learning generalization methods. Its performance was similar to that of routine oncology practice using clinical + PD-L1. This study supports the strong potential of radiomics as a future non-invasive strategy to predict ICI response in advanced NSCLC.