AUTHOR=Cheng Xiaoyun, Li Jinzhang, Xu Tianming, Li Kemin, Li Jingnan

TITLE=Predicting Survival of Patients With Rectal Neuroendocrine Tumors Using Machine Learning: A SEER-Based Population Study

JOURNAL=Frontiers in Surgery

VOLUME=Volume 8 - 2021

YEAR=2021

URL=https://www.frontiersin.org/journals/surgery/articles/10.3389/fsurg.2021.745220

DOI=10.3389/fsurg.2021.745220

ISSN=2296-875X

ABSTRACT=Background
The number of patients diagnosed with rectal neuroendocrine tumors (R-NETs) is increasing year by year. An integrated model for predicting the prognosis of R-NETs is needed. This study aimed to explore the epidemiological characteristics of R-NETs in a retrospective analysis of the Surveillance, Epidemiology, and End Results (SEER) database and to predict the survival of patients with R-NETs using machine learning.
Methods
Data on patients with R-NETs were extracted from the SEER database (2000-2017) and were also retrospectively collected from a single medical center in China. The main outcome measure was five-year survival status. Risk factors affecting survival were analyzed by Cox regression, and six common machine learning algorithms were used to build predictive models. The SEER data were divided into a training set and an internal validation set, using the year 2010 as the cutoff. The data from China served as an external validation set. The best-performing machine learning model was compared with the American Joint Committee on Cancer (AJCC) 7th staging system in both the internal and external validation datasets.
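The time-based split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the field name `year_of_diagnosis`, the helper `temporal_split`, and the choice to place diagnoses before 2010 in the training set (with 2010 onward in internal validation) are assumptions, since the abstract states only that 2010 was used as the time point.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def temporal_split(
    records: List[Record],
    cutoff_year: int = 2010,
    year_key: str = "year_of_diagnosis",  # hypothetical field name
) -> Tuple[List[Record], List[Record]]:
    """Split records into (training, internal validation) by diagnosis year.

    Assumption: records diagnosed before `cutoff_year` go to training and
    records from `cutoff_year` onward go to internal validation.
    """
    train = [r for r in records if r[year_key] < cutoff_year]
    valid = [r for r in records if r[year_key] >= cutoff_year]
    return train, valid


# Toy records for illustration only; these are not SEER data.
toy_records = [
    {"id": 1, "year_of_diagnosis": 2005},
    {"id": 2, "year_of_diagnosis": 2010},
    {"id": 3, "year_of_diagnosis": 2015},
]
train_set, valid_set = temporal_split(toy_records)
```

Splitting by time rather than at random keeps later patients out of model fitting, so the internal validation set behaves more like future, unseen cases.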
Results
A total of 10,580 patients from the SEER database and 68 patients from a single medical center were included in the analysis. Age, gender, race, histologic type, tumor size, tumor number, summary stage, and surgical treatment were risk factors affecting survival status. After parameter tuning and comparison of the algorithms, the model built with the XGBoost algorithm had the best predictive performance in the training set (AUC = 0.87, 95% CI: 0.86-0.88). In internal validation, the XGBoost model outperformed the AJCC 7th staging system (AUC: 0.90 vs 0.78). In external validation, the XGBoost model (AUC = 0.89) also performed better than the AJCC 7th staging system (AUC = 0.83).
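The AUC comparisons above rank two scoring methods by how well each separates patients by outcome. The sketch below shows one way such a comparison could be computed, using the pairwise (Mann-Whitney) definition of AUC in plain Python. The function name `auc`, the labels, and both score lists are invented for illustration; they are not data or results from the study, and the authors' actual tooling is not stated in the abstract.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (assumed data): label 1 = died within five years, 0 = survived.
labels = [1, 1, 0, 0, 1, 0]
model_risk = [0.9, 0.8, 0.3, 0.4, 0.35, 0.2]  # hypothetical model probabilities
stage_score = [3, 2, 2, 1, 1, 1]              # hypothetical ordinal stage

auc_model = auc(labels, model_risk)
auc_stage = auc(labels, stage_score)
```

Because a staging system yields only a few ordinal levels, many patient pairs tie on stage; the half-credit rule for ties is what lets a coarse score be compared with a continuous model output on the same scale.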
Conclusions
The XGBoost model had better predictive power than the AJCC 7th staging system and may have potential value in clinical application.