AUTHOR=Yin Minyue , Zhang Rufa , Zhou Zhirun , Liu Lu , Gao Jingwen , Xu Wei , Yu Chenyan , Lin Jiaxi , Liu Xiaolin , Xu Chunfang , Zhu Jinzhou TITLE=Automated Machine Learning for the Early Prediction of the Severity of Acute Pancreatitis in Hospitals JOURNAL=Frontiers in Cellular and Infection Microbiology VOLUME=Volume 12 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/cellular-and-infection-microbiology/articles/10.3389/fcimb.2022.886935 DOI=10.3389/fcimb.2022.886935 ISSN=2235-2988 ABSTRACT=Background Machine learning (ML) algorithms are widely applied in building medical models because of their powerful learning and generalization abilities. This study aims to explore different ML models for the early identification of severe acute pancreatitis (SAP) among patients hospitalized for acute pancreatitis (AP). Methods This retrospective study enrolled patients with AP from multiple centers. Data from the First Affiliated Hospital and Changshu No.1 Hospital of Soochow University were used for training and internal validation, and data from the Second Affiliated Hospital of Soochow University were used for external validation, from Jan 2017 to Dec 2021. AP and SAP were diagnosed according to the 2012 revised Atlanta classification of acute pancreatitis. Models were built using traditional logistic regression (LR) and automated machine learning (AutoML) analysis with five types of algorithms. Model performance was evaluated using the receiver operating characteristic (ROC) curve, the calibration curve, and decision curve analysis (DCA) for the LR model, and using feature importance, SHapley Additive exPlanations (SHAP) plots, and local interpretable model-agnostic explanations (LIME) for the AutoML models. Results A total of 1012 patients were included in the training/validation dataset to develop the AutoML models. An independent dataset of 212 patients was used to test the models. 
The model developed with the gradient boosting machine (GBM) outperformed the other models, with an area under the ROC curve (AUC) of 0.937 in the validation set and 0.945 in the test set. Furthermore, the GBM model achieved the highest sensitivity (0.583) among the AutoML models. The model developed with eXtreme Gradient Boosting (XGBoost) achieved the highest specificity (0.980) and the highest accuracy (0.958) in the test set. Conclusions The AutoML model based on the GBM algorithm showed clear clinical practicability for the early prediction of SAP.
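
The metrics reported above (AUC, sensitivity, specificity, accuracy) can be illustrated with a minimal sketch. It is not the authors' pipeline. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers come from the study.

```python
# Illustrative only: hypothetical helper names and made-up toy data.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = SAP, 0 = non-SAP."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed data, not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
threshold = 0.5  # assumed cut-off; the abstract does not state one
y_pred = [1 if s >= threshold else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, while AUC does not. This is one way a model can have a high AUC and still show modest sensitivity at a particular cut-off.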