AUTHOR=Peng Ting, Liu Leping, Liu Feiyang, Ding Liang, Liu Jing, Zhou Han, Liu Chong TITLE=Machine learning-based infection prediction model for newly diagnosed multiple myeloma patients JOURNAL=Frontiers in Neuroinformatics VOLUME=Volume 16 - 2022 YEAR=2023 URL=https://www.frontiersin.org/journals/neuroinformatics/articles/10.3389/fninf.2022.1063610 DOI=10.3389/fninf.2022.1063610 ISSN=1662-5196 ABSTRACT=OBJECTIVE: To understand the infection characteristics and risk factors for infection by analyzing multicenter clinical data from newly diagnosed multiple myeloma (NDMM) patients. METHODS: This study reviewed 564 patients with NDMM at two large tertiary hospitals from January 2018 to December 2021, of whom 395 were assigned to the training set and 169 to the validation set. Thirty-eight variables were collected from first-admission records, including patient demographic characteristics, clinical scores and characteristics, laboratory indicators, complications, and medication history, and key variables were screened using the Lasso method. Multiple machine learning algorithms were compared, and the best-performing algorithm was used to build the prediction model. Model performance was evaluated using the area under the curve (AUC), accuracy, and Youden's index. Finally, the SHAP package was applied to two cases to demonstrate the use of the model. RESULTS: Fifteen key variables were selected: age, ECOG, osteolytic disruption, VCD, neutrophils, lymphocytes, monocytes, hemoglobin, platelets, albumin, creatinine, lactate dehydrogenase, affected globulin, β2 microglobulin, and preventive medicine. The predictive performance of the XGBoost model was significantly better than that of the other models (AUC: 0.8664), and it also performed well in the expected dataset (accuracy: 68.64%).
CONCLUSION: A machine learning algorithm was used to establish an infection prediction model for newly diagnosed multiple myeloma patients. The model is simple, convenient, and validated, and it performed well; it is intended to help reduce the incidence of infection and improve patient prognosis.
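
The abstract reports three evaluation metrics (AUC, accuracy, and Youden's index) without defining them. The following is a minimal sketch of how these quantities are commonly computed for a binary classifier; it is not the authors' code. The function names, the toy labels, and the toy scores are all assumptions made for illustration, and the numbers have no relation to the study data.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion(y_true: Sequence[int], scores: Sequence[float],
              threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), predicting positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(y_true: Sequence[int], scores: Sequence[float],
             threshold: float) -> float:
    """Share of cases classified correctly at the given threshold."""
    tp, fp, tn, fn = confusion(y_true, scores, threshold)
    return (tp + tn) / (tp + fp + tn + fn)


def youden_index(y_true: Sequence[int], scores: Sequence[float],
                 threshold: float) -> float:
    """Youden's J = sensitivity + specificity - 1 at the given threshold."""
    tp, fp, tn, fn = confusion(y_true, scores, threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def best_threshold(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Pick the observed score that maximizes Youden's J (lowest on ties)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: youden_index(y_true, scores, t))


# Toy data (hypothetical): 1 = infection, 0 = no infection.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

cutoff = best_threshold(toy_labels, toy_scores)
print("AUC:", roc_auc(toy_labels, toy_scores))
print("cutoff:", cutoff)
print("accuracy:", accuracy(toy_labels, toy_scores, cutoff))
print("Youden's J:", youden_index(toy_labels, toy_scores, cutoff))
```

AUC is independent of any cutoff, whereas accuracy and Youden's index depend on the threshold chosen, which is one reason a model can show a high AUC alongside a more modest accuracy on a separate dataset.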