AUTHOR=Cui Yunpeng, Shi Xuedong, Wang Shengjie, Qin Yong, Wang Bailin, Che Xiaotong, Lei Mingxing
TITLE=Machine learning approaches for prediction of early death among lung cancer patients with bone metastases using routine clinical characteristics: An analysis of 19,887 patients
JOURNAL=Frontiers in Public Health
VOLUME=Volume 10 - 2022
YEAR=2022
URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2022.1019168
DOI=10.3389/fpubh.2022.1019168
ISSN=2296-2565
ABSTRACT=Purpose: The study’s objective is to create an accurate prediction model utilizing machine-learning techniques to predict three-month mortality specifically among lung cancer patients with bone metastases. Methods: This study enrolled 19887 lung cancer patients with bone metastases between 2010 and 2018 from a large oncologic database. The entire patient cohort was randomly assigned, at a ratio of 8:2, to a training group (n=15881, 80%) and a validation group (n=4006, 20%). In the training group, prediction models were trained and optimized using six approaches: logistic regression, XGBoosting machine, random forest, neural network, gradient boosting machine, and decision tree. Thirteen metrics, including the Brier score, calibration slope, intercept-in-large, area under the curve (AUC), and sensitivity, were used to assess the models’ prediction performance in the validation group. Risk stratification was also evaluated based on the optimal threshold. Results: Among all recruited patients, the three-month mortality was 48.5%. Twelve variables (age, primary site, histology, race, sex, tumor (T) stage, node (N) stage, brain metastasis, liver metastasis, cancer-directed surgery, radiation, and chemotherapy) were significantly associated with three-month mortality in multivariate analysis, and these variables were included in developing the prediction models.
With the highest sum score across all measurements (62 points), the gradient boosting machine approach outperformed all other models, followed by the XGBoosting machine approach (59 points) and logistic regression (53 points). For these three models, the area under the curve (AUC) was 0.820 (95% confidence interval [CI]: 0.807-0.833), 0.820 (95% CI: 0.807-0.833), and 0.815 (95% CI: 0.801-0.828), respectively; the calibration slope was 0.97, 0.95, and 0.96, respectively; and the accuracy was 0.772 for all three. Compared with patients in the low-risk group, patients in the high-risk group had more than three times the odds of dying within three months (P<0.001). Conclusions: Using machine learning techniques, this study develops several models and identifies the optimal one after thoroughly assessing and comparing the prediction performance of each. The optimal model can serve as a pragmatic risk prediction tool capable of identifying lung cancer patients with bone metastases who are at high risk of three-month mortality, informing risk counseling, and aiding clinical treatment decision-making.
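The evaluation steps named in the abstract (a random 8:2 split, the Brier score, the AUC, and an odds ratio between risk groups defined by a threshold) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy labels and probabilities, and the 0.5 threshold are assumptions chosen for illustration, and the odds ratio shown is the plain 2x2 cross-product form, which may differ from the estimate used in the study.

```python
import random


def split_80_20(records, seed=0):
    """Shuffle a copy of the records and split it 80% / 20% (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0 or 1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive case is scored above a random negative case.

    Ties count as half a win. Quadratic in sample size, so only for small examples.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def odds_ratio(y_true, y_prob, threshold):
    """Odds of the event in the high-risk group divided by odds in the low-risk group.

    Raises ZeroDivisionError if either group has no events or no non-events.
    """
    high = [y for y, p in zip(y_true, y_prob) if p >= threshold]
    low = [y for y, p in zip(y_true, y_prob) if p < threshold]
    a, b = sum(high), len(high) - sum(high)  # events / non-events, high risk
    c, d = sum(low), len(low) - sum(low)     # events / non-events, low risk
    return (a * d) / (b * c)


# Toy data, invented for illustration only (1 = died within three months).
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

train, valid = split_80_20(range(10))
print(len(train), len(valid))
print(brier_score(y_true, y_prob))
print(auc(y_true, y_prob))
print(odds_ratio(y_true, y_prob, threshold=0.5))
```

A lower Brier score indicates better-calibrated, more accurate probabilities, while an AUC closer to 1 indicates better discrimination. In practice a library implementation would usually replace the hand-written metric functions.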