AUTHOR=Cheng Xiaoyun , Li Jinzhang , Xu Tianming , Li Kemin , Li Jingnan TITLE=Predicting Survival of Patients With Rectal Neuroendocrine Tumors Using Machine Learning: A SEER-Based Population Study JOURNAL=Frontiers in Surgery VOLUME=Volume 8 - 2021 YEAR=2021 URL=https://www.frontiersin.org/journals/surgery/articles/10.3389/fsurg.2021.745220 DOI=10.3389/fsurg.2021.745220 ISSN=2296-875X ABSTRACT=Background The number of patients diagnosed with rectal neuroendocrine tumors (R-NETs) is increasing year by year. An integrated model for predicting the survival of patients with R-NETs is needed. This study aimed to explore the epidemiological characteristics of R-NETs through a retrospective analysis of the Surveillance, Epidemiology, and End Results (SEER) database and to predict the survival of patients with R-NETs using machine learning. Methods Data on patients with R-NETs were extracted from the SEER database (2000-2017), and data were also retrospectively collected from a single medical center in China. The main outcome measure was five-year survival status. Risk factors affecting survival were analyzed by Cox regression analysis, and six common machine learning algorithms were chosen to build predictive models. Data from the SEER database were divided into a training set and an internal validation set, using the year 2010 as the dividing time point. The data from China served as an external validation set. The best machine learning predictive model was compared with the American Joint Committee on Cancer (AJCC) 7th staging system to evaluate its predictive performance in the internal and external validation datasets. Results A total of 10,580 patients from the SEER database and 68 patients from a single medical center were included in the analysis. Age, gender, race, histologic type, tumor size, tumor number, summary stage, and surgical treatment were risk factors affecting survival status.
After parameter adjustment and comparison of the algorithms, the predictive model built with the XGBoost algorithm had the best predictive performance in the training set (AUC = 0.87, 95% CI: 0.86-0.88). In the internal validation, the XGBoost model outperformed the AJCC 7th staging system (AUC: 0.90 vs. 0.78). In the external validation, the XGBoost predictive model (AUC = 0.89) also performed better than the AJCC 7th staging system (AUC = 0.83). Conclusions The XGBoost algorithm had better predictive power than the AJCC 7th staging system and has potential value for clinical application.
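
The evaluation design described in the Methods (a split at the year 2010, then an AUC comparison between a model score and a staging score) can be illustrated with a minimal sketch. This is not the authors' code. The record fields (`year`, `event`, `model`, `stage`), the toy values, and the choice to train on diagnoses before 2010 and validate on 2010 onward are all assumptions made for illustration; the abstract does not say which side of the time point formed the training set. The AUC is computed directly from its pairwise definition rather than with any particular library.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, float]


def temporal_split(records: Sequence[Record], cutoff_year: int = 2010
                   ) -> Tuple[List[Record], List[Record]]:
    """Split by diagnosis year.

    Assumption: years before the cutoff go to training and the cutoff
    year onward goes to internal validation.
    """
    train = [r for r in records if r["year"] < cutoff_year]
    valid = [r for r in records if r["year"] >= cutoff_year]
    return train, valid


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy records: event = 1 means death within five years,
# "model" is a predicted risk and "stage" is a numeric stage (1 to 4).
records = [
    {"year": 2004, "event": 0, "model": 0.10, "stage": 1},
    {"year": 2007, "event": 1, "model": 0.80, "stage": 3},
    {"year": 2011, "event": 0, "model": 0.20, "stage": 1},
    {"year": 2012, "event": 0, "model": 0.30, "stage": 2},
    {"year": 2014, "event": 1, "model": 0.70, "stage": 2},
    {"year": 2016, "event": 1, "model": 0.90, "stage": 4},
]

train, valid = temporal_split(records)
labels = [int(r["event"]) for r in valid]
model_auc = auc(labels, [r["model"] for r in valid])
stage_auc = auc(labels, [r["stage"] for r in valid])
print(f"model AUC = {model_auc:.3f}, stage AUC = {stage_auc:.3f}")
```

Scoring both the model and the staging system on the same held-out patients is what makes the two AUC values directly comparable. A time-based split also checks whether a model fitted on earlier diagnoses still discriminates well on later ones.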