AUTHOR=Lin Ming-Yen, Li Chi-Chun, Lin Pin-Hsiu, Wang Jiun-Long, Chan Ming-Cheng, Wu Chieh-Liang, Chao Wen-Cheng TITLE=Explainable Machine Learning to Predict Successful Weaning Among Patients Requiring Prolonged Mechanical Ventilation: A Retrospective Cohort Study in Central Taiwan JOURNAL=Frontiers in Medicine VOLUME=Volume 8 - 2021 YEAR=2021 URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2021.663739 DOI=10.3389/fmed.2021.663739 ISSN=2296-858X ABSTRACT=Objective: The number of patients requiring prolonged mechanical ventilation (PMV) is increasing worldwide, but a model for predicting weaning outcomes in these patients is still lacking. We hence aimed to develop an explainable machine learning (ML) model to predict successful weaning in patients requiring PMV using a real-world dataset. Methods: This retrospective study used the electronic medical records of patients admitted to a 12-bed respiratory care centre in central Taiwan between 2013 and 2018. We used 3 ML models, namely extreme gradient boosting (XGBoost), random forest (RF) and logistic regression (LR), to establish the prediction model. We further illustrated the feature importance categorised by clinical domains and provided visualised interpretation using SHapley Additive exPlanations (SHAP) as well as Local Interpretable Model-agnostic Explanations (LIME). Results: The dataset contained data from 963 patients requiring PMV, of whom 56.0% (539/963) were successfully weaned from mechanical ventilation. The XGBoost model (area under the curve (AUC): 0.908; 95% confidence interval (CI): 0.864–0.943) and the RF model (AUC: 0.888; 95% CI: 0.844–0.934) outperformed the LR model (AUC: 0.762; 95% CI: 0.687–0.830) in predicting successful weaning in patients requiring PMV. To give physicians an intuitive understanding of the model, we stratified the feature importance by clinical domains.
The cumulative feature importance in the ventilation domain, fluid domain, physiology domain and laboratory data domain was 0.310, 0.201, 0.265, and 0.182, respectively. We further used SHAP plots and partial dependence plots to illustrate associations between features and the weaning outcome at the feature level. Moreover, we used LIME plots to illustrate the prediction model at the individual level. Additionally, we assessed the weekly performance of the 3 ML models and found that the accuracy of XGBoost and RF was approximately 0.7 between week 4 and week 7 and declined slightly to 0.6 in week 8 and week 9. Conclusion: We used an ML approach, mainly XGBoost with SHAP and LIME plots, to establish an explainable weaning prediction model for patients requiring PMV. We believe these approaches should largely mitigate concerns about the black-box nature of artificial intelligence, and future studies are warranted to bring the proposed model into clinical practice.
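The domain-level stratification described in the Results can be illustrated with a minimal Python sketch. It assumes each feature has a single importance score and belongs to exactly one clinical domain; the feature names, their scores, and the helper `cumulative_importance_by_domain` are hypothetical and are not taken from the study. Only the four domain totals in `reported` come from the abstract.

```python
from collections import defaultdict


def cumulative_importance_by_domain(importances, domain_of):
    """Sum per-feature importance scores within each clinical domain."""
    totals = defaultdict(float)
    for feature, value in importances.items():
        totals[domain_of[feature]] += value
    return dict(totals)


# Hypothetical features and scores, for illustration only (not study data).
importances = {
    "peak_airway_pressure": 0.20,
    "fio2": 0.10,
    "urine_output": 0.25,
    "heart_rate": 0.30,
    "albumin": 0.15,
}
domain_of = {
    "peak_airway_pressure": "ventilation",
    "fio2": "ventilation",
    "urine_output": "fluid",
    "heart_rate": "physiology",
    "albumin": "laboratory",
}

by_domain = cumulative_importance_by_domain(importances, domain_of)
for domain, total in sorted(by_domain.items(), key=lambda kv: -kv[1]):
    print(f"{domain}: {total:.3f}")

# Domain totals as reported in the abstract.
reported = {
    "ventilation": 0.310,
    "fluid": 0.201,
    "physiology": 0.265,
    "laboratory": 0.182,
}
covered = round(sum(reported.values()), 3)
print(f"sum of the four reported domain totals: {covered}")
```

Grouping at the domain level trades per-feature detail for a summary that is easier to read clinically. The four reported totals sum to 0.958.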