AUTHOR=Guo Yuan-yuan , Li Zhi-jie , Du Chao , Gong Jun , Liao Pu , Zhang Jia-xing , Shao Cong TITLE=Machine learning for identifying benign and malignant of thyroid tumors: A retrospective study of 2,423 patients JOURNAL=Frontiers in Public Health VOLUME=Volume 10 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2022.960740 DOI=10.3389/fpubh.2022.960740 ISSN=2296-2565 ABSTRACT=Objective To select risk factors for distinguishing benign from malignant thyroid tumors and to establish prediction models that provide early warning for patients. Methods Patients were enrolled at Chongqing People's Hospital (Chongqing, China) between July 2020 and September 2021. The predictors collected were sex, BRAFV600E, age, Lymph#, Neu#, NLR, PLR, RDW, PLT, RDW-CV, ALP, and PTH. Extreme gradient boosting (XGBoost), random forest (RF), light gradient boosting machine (LightGBM), and adaptive boosting (AdaBoost) were used to build the predictive models. Results A total of 2042 patients met the inclusion criteria and were enrolled in this study, and 11 variables were included. We built a predictive tool comprising three categories of models. The first category included all collected predictors. The second incorporated patient demographic information and the BRAFV600E gene predictor. The third included patient demographic information together with routine blood and biochemical test predictors. In the first category, XGBoost, RF, LightGBM, and AdaBoost achieved an Area Under the Curve (AUC) of 0.868 (0.834-0.901), 0.848 (0.811-0.885), 0.864 (0.831-0.898), and 0.846 (0.812-0.881), respectively. In the second category, the corresponding AUCs were 0.833 (0.796-0.869), 0.833 (0.798-0.868), 0.836 (0.801-0.871), and 0.828 (0.791-0.865). In the third category, however, the AUCs were lower: 0.695 (0.648-0.742), 0.680 (0.632-0.729), 0.671 (0.622-0.719), and 0.630 (0.580-0.680). 
Baseline BRAFV600E gene mutation status was an important variable for all algorithms. Conclusions Our findings indicate that the proposed models were relatively stable and could be a powerful and promising tool for distinguishing benign from malignant thyroid tumors.
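
The results above report each AUC with an interval in parentheses. The abstract does not state how those intervals were obtained, so the following is only a minimal sketch, assuming a percentile bootstrap over patients. The function names `auc_score` and `bootstrap_auc_ci` and the toy labels and scores are hypothetical and are not taken from the study.

```python
import random


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus a percentile-bootstrap interval for the AUC.
    The bootstrap is an assumption here, not the study's stated method."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return auc_score(labels, scores), lower, upper


# Toy data (hypothetical): 1 = malignant, 0 = benign; scores are model outputs.
toy_labels = [0, 0, 1, 1, 0, 1, 1, 0]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.5]

auc, lower, upper = bootstrap_auc_ci(toy_labels, toy_scores)
print(f"AUC = {auc:.3f} ({lower:.3f}-{upper:.3f})")
```

In practice the scores would be the predicted probabilities of each fitted classifier on held-out patients, and the same routine would be run once per model and per predictor category to produce a comparison like the one reported above.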