AUTHOR=Li Wenle, Hong Tao, Liu Wencai, Dong Shengtao, Wang Haosheng, Tang Zhi-Ri, Li Wanying, Wang Bing, Hu Zhaohui, Liu Qiang, Qin Yong, Yin Chengliang TITLE=Development of a Machine Learning-Based Predictive Model for Lung Metastasis in Patients With Ewing Sarcoma JOURNAL=Frontiers in Medicine VOLUME=Volume 9 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2022.807382 DOI=10.3389/fmed.2022.807382 ISSN=2296-858X ABSTRACT=Background: This study aimed to develop and validate machine learning (ML)-based prediction models for lung metastasis (LM) in patients with Ewing Sarcoma (ES) and to deploy the best-performing model as an open-access web calculator. Methods: We retrospectively analyzed data from the Surveillance, Epidemiology, and End Results (SEER) database from 2010 to 2016 and from four medical institutions to develop and validate predictive models for LM in patients with ES. Data from the SEER database formed the training group (n=929). Six ML-based models for predicting LM were developed from patients' demographic and clinicopathologic variables and internally validated using 10-fold cross-validation. All models were externally validated on the validation group (n=51), which comprised data from four medical institutions. Predictive performance was evaluated by the area under the receiver operating characteristic curve (AUC), and the best-performing model was used to build an online calculator that estimates an individual’s probability of LM. Results: The study cohort comprised 929 patients from the SEER database and 51 patients from multiple centers, for a total of 980 ES patients, of whom 175 (18.8%) had lung metastasis. In multivariate logistic regression analysis, survival time, T stage, N stage, surgery, and bone metastasis were independent predictive factors of LM. 
The AUC values of the six predictive models ranged from 0.585 to 0.705 for predicting LM. The Random Forest model (AUC=0.705), which used 4 variables, was identified as the best predictive model and was employed to construct an online calculator (https://share.streamlit.io/liuwencai123/es_lm/main/es_lm.py). Conclusions: ML algorithms showed acceptable results in predicting LM in patients with Ewing Sarcoma, and the Random Forest model performed best. The web calculator based on ML algorithms may help guide more personalized treatment for patients with ES.
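
The internal validation described in the Methods (10-fold cross-validation scored by AUC) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the four features are unnamed placeholders, and a simple class-centroid scorer stands in for the study's six ML models, which the abstract does not specify in detail. The helper names (`auc`, `stratified_folds`, `fit_centroid_scorer`, `cross_validated_auc`) are hypothetical and not taken from the paper.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k=10, seed=0):
    """Split indices into k folds with a roughly equal class ratio in each."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def fit_centroid_scorer(X, y):
    """Stand-in model: score = projection onto (positive mean - negative mean)."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    d = len(X[0])
    mu_pos = [sum(x[j] for x in pos) / len(pos) for j in range(d)]
    mu_neg = [sum(x[j] for x in neg) / len(neg) for j in range(d)]
    w = [a - b for a, b in zip(mu_pos, mu_neg)]
    return lambda x: sum(wj * xj for wj, xj in zip(w, x))

def cross_validated_auc(X, y, k=10, seed=0):
    """Fit on k-1 folds, score the held-out fold, and average the fold AUCs."""
    fold_aucs = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        score = fit_centroid_scorer([X[i] for i in train], [y[i] for i in train])
        fold_aucs.append(
            auc([y[i] for i in held_out], [score(X[i]) for i in held_out])
        )
    return sum(fold_aucs) / k, fold_aucs

# Synthetic cohort (NOT the SEER data): 929 cases, about 18.8% positive,
# four placeholder features shifted upward for positive cases.
rng = random.Random(42)
y = [1 if rng.random() < 0.188 else 0 for _ in range(929)]
X = [[rng.gauss(0.5 * t, 1.0) for _ in range(4)] for t in y]

mean_auc, fold_aucs = cross_validated_auc(X, y)
print(f"mean 10-fold AUC: {mean_auc:.3f}")
```

Stratifying the folds keeps a similar share of LM-positive cases in every fold, which matters when the positive class is a minority, since a fold with no positive cases would leave its AUC undefined.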