Early prediction of acute kidney injury in patients with gastrointestinal bleeding admitted to the intensive care unit based on extreme gradient boosting

Background: Acute kidney injury (AKI) is a common and important complication in patients with gastrointestinal bleeding who are admitted to the intensive care unit. The present study proposes an artificial intelligence solution for acute kidney injury prediction in patients with gastrointestinal bleeding admitted to the intensive care unit. Methods: Data were collected from the eICU Collaborative Research Database (eICU-CRD) and the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database. The prediction model was developed using the extreme gradient boosting (XGBoost) model. The area under the receiver operating characteristic curve (AUC), accuracy, precision, area under the precision–recall curve (AUC-PR), and F1 score were used to evaluate the predictive performance of each model. Results: Logistic regression, XGBoost, and XGBoost with severity scores were used to predict acute kidney injury risk using all features. The XGBoost-based predictive models (XGBoost and XGBoost+severity scores) showed greater accuracy, recall, precision, AUC, AUC-PR, and F1 score than logistic regression. Conclusion: The XGBoost model provided better risk prediction for acute kidney injury in patients with gastrointestinal bleeding admitted to the intensive care unit than the traditional logistic regression model, suggesting that machine learning (ML) techniques have the potential to improve the development and validation of predictive models in patients with gastrointestinal bleeding admitted to the intensive care unit.


Introduction
Acute kidney injury (AKI) is a common morbidity with a high incidence in patients admitted to the intensive care unit (ICU). It is associated with significant mortality, and a considerable proportion of patients develop AKI that progresses to chronic kidney disease (1-3). AKI has often been reported to occur in patients with gastrointestinal bleeding (GIB), especially those admitted to the ICU due to massive blood loss, which leads to renal hypoperfusion secondary to intravascular volume depletion and eventually to AKI (4,5). AKI has been reported to develop in 1-11.4% of patients with acute GIB (6,7). A systematic review that explored the incidence and mortality of renal dysfunction in cirrhotic patients with acute GIB found that the pooled incidence of AKI was 25% (8).
For critically ill patients with GIB concomitant with AKI, hospitalization times may be prolonged and costs greatly increased, placing a heavy burden on the medical system (9-11). Approximately 20% of patients with severe GIB and new-onset AKI can recover normal renal function if appropriate and effective interventions are performed in a timely manner (12). However, the lack of early prediction tools for AKI is a major challenge for ICU clinicians. Early recognition, risk assessment, and care for AKI can improve clinical outcomes and reduce the high healthcare costs of these patients. To assist physicians with risk assessment of AKI, prediction models have been developed across a range of patient populations, with varying degrees of predictive accuracy.
Models built using machine learning (ML), which makes decisions and predictions based on datasets, have become popular, and ML techniques have been widely used clinically for prognosis prediction, including for AKI (13). ML has shown better performance and lower error rates in predicting clinical outcomes than traditional prediction tools such as logistic regression and Cox regression analysis. Moreover, ML has been widely used clinically to predict survival (14). Extreme gradient boosting (XGBoost) is regarded as an advanced ML algorithm with high prediction accuracy and computational efficiency and has been widely applied for diagnosis and prognostic prediction (15). Recently, the use of ML models for AKI prediction has been growing rapidly in different clinical settings. Yue et al. reported that the XGBoost model had the best predictive performance for AKI in critically ill patients with sepsis (16). Zhang et al. evaluated five machine learning methods, including XGBoost, adaptive boosting, random forest, logistic regression, and multilayer perceptron, to develop AKI risk prediction models in critical care patients with acute cerebrovascular disease and found that the XGBoost model was better at predicting AKI risk in these patients than the other models (17). However, the efficacy of XGBoost in predicting AKI in critically ill patients with GIB remains unclear.
This study aimed to use XGBoost to construct a predictive model to evaluate AKI risk in critically ill patients with GIB, using the publicly available eICU Collaborative Research Database (eICU-CRD) as the data source for the training cohort and the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database as the data source for the validation cohort. This study explored the accuracy of XGBoost for the construction of AKI prediction models and the extraction of important features. Furthermore, a Shapley additive explanations (SHAP) analysis was used to reveal the influence of the major factors and provide comprehensive explanations of their quantitative impacts on the output. In addition, the XGBoost model was compared with the traditional logistic model and score systems commonly used in the ICU, including the Oxford Acute Severity of Illness Score (OASIS), sequential organ failure assessment (SOFA), and acute physiology score III (APS III). The present study may provide a reference for an XGBoost-based clinical decision support system to aid the early prediction of AKI in patients with GIB admitted to the ICU.

Data source
All data were extracted from the eICU-CRD (18) and MIMIC-IV version 1.0 databases (19). MIMIC-IV contains comprehensive and high-quality data on 524,520 admissions (including 257,366 patients) to intensive care units (ICUs) at the Beth Israel Deaconess Medical Center during 2008-2019. The eICU-CRD covers 200,859 ICU admissions (including 139,367 patients) between 2014 and 2015 at 208 hospitals in the United States. The research use of these databases was approved by the institutional review board of the Massachusetts Institute of Technology. All procedures were performed in accordance with the ethical standards of the Declaration of Helsinki and its later amendments or comparable ethical standards. We obtained permission to extract data from the MIMIC-IV database and eICU-CRD database.

Cohort selection
GIB was defined according to the European Society of Gastrointestinal Endoscopy guideline (20,21); AKI was diagnosed according to the KDIGO-AKI criteria based on serum creatinine in the first 48 h of ICU admission (22).
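As a minimal sketch of the creatinine-based outcome definition, the function below flags AKI when a serum creatinine value in the 48 h window rises by at least 0.3 mg/dL over baseline or reaches at least 1.5 times baseline, which are the KDIGO serum creatinine thresholds. How the baseline value was chosen in this study is not stated, so the `baseline_mg_dl` argument and the function name are assumptions made for illustration only.

```python
from typing import Sequence


def kdigo_aki_by_creatinine(baseline_mg_dl: float,
                            creatinine_48h_mg_dl: Sequence[float]) -> bool:
    """Return True if the peak creatinine in the window meets a KDIGO criterion.

    Only serum creatinine is used (urine output is ignored):
      * absolute rise of at least 0.3 mg/dL over baseline, or
      * peak value of at least 1.5 times baseline.
    The baseline definition is an assumption of this sketch.
    """
    if baseline_mg_dl <= 0:
        raise ValueError("baseline creatinine must be positive")
    if not creatinine_48h_mg_dl:
        return False  # no measurements in the window, so no evidence of AKI
    peak = max(creatinine_48h_mg_dl)
    absolute_rise = (peak - baseline_mg_dl) >= 0.3
    relative_rise = (peak / baseline_mg_dl) >= 1.5
    return absolute_rise or relative_rise
```

For example, a patient with a baseline of 1.0 mg/dL and values of 1.1 and 1.4 mg/dL in the first 48 h would be labeled as AKI under this rule through the absolute-rise criterion.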
Patients with one of the following conditions were excluded: (1) an age of <18 years at first admission to the ICU, (2) a hospital stay of <48 h, (3) >70% of personal data missing, (4) repeated ICU admissions, and (5) a history of end-stage renal disease (ESRD). Finally, 6,679 patients from the eICU-CRD and 2,968 patients from MIMIC-IV were included in this study. Patients from the eICU-CRD were randomly divided into training (n = 5,679) and internal validation (n = 1,000) cohorts at a ratio of 7:3. Patients from MIMIC-IV (n = 2,968) were used as an external validation set. A detailed flowchart is shown in Figure 1.
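The exclusion criteria and the random split can be sketched as follows. The record type `IcuStay`, its field names, and the reading of criterion (4) as "keep only the first ICU admission" are hypothetical choices for illustration; they are not the column names of eICU-CRD or MIMIC-IV, and the split fraction is left as a parameter.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class IcuStay:
    """Hypothetical per-stay record; field names are illustrative only."""
    patient_id: int
    age_years: float
    hospital_stay_hours: float
    missing_fraction: float   # share of this patient's variables that are missing
    icu_admission_index: int  # 1 = first ICU admission
    has_esrd: bool


def is_eligible(stay: IcuStay) -> bool:
    """Apply the five exclusion criteria listed above."""
    return (stay.age_years >= 18
            and stay.hospital_stay_hours >= 48
            and stay.missing_fraction <= 0.70
            and stay.icu_admission_index == 1   # assumption: first admission only
            and not stay.has_esrd)


def split_cohort(stays: Sequence[IcuStay], train_fraction: float,
                 seed: int = 0) -> Tuple[List[IcuStay], List[IcuStay]]:
    """Shuffle reproducibly and split into training and internal validation."""
    shuffled = list(stays)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]
```

A fixed seed keeps the split reproducible, which matters when the internal validation results are compared across models.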

Data collection and outcomes
Baseline characteristics and admission information, including age, sex, body mass index, bleeding site, comorbidities, severity score, and drug usage, were recorded. Initial vital signs and laboratory results were also measured during the first 24 h of ICU admission (19). The primary outcome was AKI based on the KDIGO guidelines for serum creatinine within 48 h.

Statistical analysis
For all continuous covariates, the mean values and standard deviations were reported, and categorical data were expressed as frequency (percentage). The chi-square test or Fisher's exact test was performed to compare differences between groups. Baseline characteristics were reported separately for the training and validation cohorts and were compared using R software version 4.1.0. A P-value of <0.05 was considered statistically significant. Modeling was performed using Python 3.6.4.
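To illustrate the group comparison for categorical variables, the sketch below computes Pearson's chi-square statistic for a 2x2 table and its P-value with one degree of freedom. It is written in Python for consistency with the modeling code, whereas the comparisons in this study were run in R; the function name is hypothetical, and no continuity correction is applied, so results can differ from software that applies one by default.

```python
import math
from typing import Sequence, Tuple


def chi_square_2x2(table: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Pearson chi-square test for a 2x2 contingency table (no continuity correction).

    Returns (chi-square statistic, two-sided P-value with 1 degree of freedom).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row_total in enumerate(row_totals):
        for j, col_total in enumerate(col_totals):
            expected = row_total * col_total / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

When any expected cell count is small, Fisher's exact test, as mentioned above, is the usual alternative.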

AKI prediction model
Logistic regression, XGBoost, and XGBoost+severity scores (SOFA, OASIS, and APS III) were applied to build the prediction models. The XGBoost model was used as previously reported (23,24). Moreover, all the machine-learning algorithms were implemented using the "sklearn" machine-learning library of Python programming software. The detailed XGBoost parameters are shown in Supplementary Table 1. The framework of the prediction models is illustrated in Figure 2.
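The idea underlying XGBoost, second-order gradient boosting of decision trees on the logistic loss, can be sketched in pure Python with depth-1 trees (stumps). This is a teaching sketch under stated assumptions, not the implementation or the parameters used in this study: the function names, the toy hyperparameters (`n_rounds`, `eta`, `lam`), and the restriction to stumps are all illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def best_stump(rows, grad, hess, lam):
    """Pick the single-feature threshold split with the largest second-order gain."""
    g_total, h_total = sum(grad), sum(hess)
    best = None
    for j in range(len(rows[0])):
        for thr in sorted({r[j] for r in rows}):
            g_left = sum(g for r, g in zip(rows, grad) if r[j] <= thr)
            h_left = sum(h for r, h in zip(rows, hess) if r[j] <= thr)
            g_right, h_right = g_total - g_left, h_total - h_left
            gain = (g_left ** 2 / (h_left + lam)
                    + g_right ** 2 / (h_right + lam)
                    - g_total ** 2 / (h_total + lam))
            if best is None or gain > best[0]:
                # Leaf weight -G / (H + lambda) is the regularized Newton step.
                best = (gain, j, thr,
                        -g_left / (h_left + lam), -g_right / (h_right + lam))
    return best[1:]


def fit_boosted_stumps(rows, labels, n_rounds=20, eta=0.3, lam=1.0):
    """Fit an additive model of stumps on the log-odds scale."""
    margins = [0.0] * len(rows)
    stumps = []
    for _ in range(n_rounds):
        probs = [sigmoid(m) for m in margins]
        grad = [p - y for p, y in zip(probs, labels)]  # first derivative of log loss
        hess = [p * (1.0 - p) for p in probs]          # second derivative of log loss
        j, thr, w_left, w_right = best_stump(rows, grad, hess, lam)
        stumps.append((j, thr, eta * w_left, eta * w_right))
        margins = [m + (eta * w_left if r[j] <= thr else eta * w_right)
                   for m, r in zip(margins, rows)]
    return stumps


def predict_proba(stumps, row):
    """Sum the stump outputs (log-odds) and map to a probability."""
    margin = sum(wl if row[j] <= thr else wr for j, thr, wl, wr in stumps)
    return sigmoid(margin)
```

Each round fits a new stump to the gradient and curvature of the current loss, so later stumps correct the errors of earlier ones; the shrinkage factor `eta` and the penalty `lam` limit how far any single stump can move the prediction.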

Performance evaluation
To assess and compare the predictive accuracy of the XGBoost, XGBoost+severity scores, and logistic regression models, each model was assessed according to precision, recall, accuracy, F1 score, area under the receiver operating characteristic curve (AUC), and area under the precision-recall curve (AUC-PR) (25).
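The metrics listed above can be written out explicitly. The following is a minimal pure-Python sketch; the 0.5 classification threshold and the use of average precision as the estimate of AUC-PR are assumptions, since the study does not state how these were set, and the function names are illustrative.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Precision, recall, accuracy, and F1 at a fixed probability threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(y_true)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


def roc_auc(y_true, y_score):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (ties are not grouped)."""
    order = sorted(range(len(y_true)), key=lambda i: -y_score[i])
    tp, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank  # precision at the rank of each positive
    return total / tp if tp else 0.0
```

Because AKI is the minority class in the training cohort, AUC-PR and F1 are more sensitive than accuracy to how well the positive class is identified.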

Shapley additive explanations (SHAP) analysis
To further analyze the positive and negative effects of the important features identified for AKI prediction and investigate the relationship between them, a SHAP analysis was performed using Python 3.7.0. The SHAP value is the contribution assigned to each feature for a given prediction (26).
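The quantity that SHAP estimates can be illustrated with an exact Shapley value computation on a tiny model. This sketch enumerates all feature subsets and replaces absent features with a single baseline value, which is a simplification: SHAP implementations average over background data and use tree-specific algorithms rather than brute-force enumeration. The function name and the toy models in the usage note are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a single baseline point.

    Features outside a coalition take their baseline value (an assumption of
    this sketch). The cost grows as 2**n, so this is only usable for small n.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Weight of coalitions of size k that do not contain feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                members = set(subset)
                with_i = [x[j] if (j in members or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in members else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi
```

For a linear toy model such as `lambda v: 2 * v[0] + 3 * v[1]`, each feature's value is its coefficient times its distance from the baseline, and in general the values sum to the difference between the prediction at `x` and the prediction at the baseline.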

Baseline characteristics
The incidence rate of AKI was 15.0% in the training cohort, 13.7% in the internal validation cohort, and 30.7% in the external validation cohort. Table 1 shows the baseline characteristics of all patients in the training, internal validation, and external validation cohorts, classified as non-AKI (NAKI) or AKI.

Variable selection
The importance matrix plot for the XGBoost model is shown in Figure 3, revealing the top 15 most important variables that contribute to the model. Bilirubin (max) was the most important predictor variable for all prediction horizons, followed closely by bicarbonate (min), renal replacement therapy (RRT), mechanical ventilation, and bilirubin (first time).

Model performance
Three models, logistic regression, XGBoost, and XGBoost+severity scores, were used to predict AKI risk using all features. The accuracy, recall, precision, F1 score, AUC-PR, and AUC of XGBoost were higher than those of the logistic regression model. The XGBoost+severity scores (SOFA, OASIS, and APS III) model exhibited the best predictive ability, with higher accuracy, recall, precision, F1 score, AUC-PR, and AUC in the training cohort than both the XGBoost-only model and the logistic regression model. The results in the internal and external validation cohorts were similar to the results in the training cohort (Table 2). Furthermore, ROC analysis was performed to further check the performance of the three models, as shown in Figures 4A-C. The XGBoost+severity score model exhibited the largest AUC, followed by the XGBoost model, in the training, internal validation, and external validation cohorts. The calibration curves for the predictive models (logistic regression model, XGBoost model, and XGBoost+severity scores model) all showed high agreement between the actual probability and predicted probability in the training, internal validation, and external validation sets (Figures 5A-C). Subsequently, a decision curve analysis (DCA) was performed to determine the net benefit and clinical utility of the predictive models. The DCA curves indicated that the three predictive models were all clinically useful and that the benefit of using the XGBoost+severity score model was superior to that of using the XGBoost and logistic regression models in all sets (Figures 5D-F).
Frontiers in Medicine frontiersin.org
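The net benefit plotted in a decision curve can be sketched as follows, using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The function names are illustrative, and the threshold values at which the curves in this study were evaluated are not stated, so any thresholds passed in are assumptions.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    odds = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * odds
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero.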

SHAP analysis
To examine the influence of characteristics on the prediction results in more samples and analyze the similarities and differences in the important characteristics of patients with varying severities of AKI, a SHAP summary chart was used. As shown in Figure 6, bilirubin (max) ranked first in importance; the larger the bilirubin (max) in patients, the higher the probability of AKI development, suggesting that this indicator should be observed first in early prediction.
Using the XGBoost model with all features, which has excellent performance for predicting AKI, together with the SHAP analysis method, representative non-AKI and AKI patients were selected to illustrate the effect of features on prediction ability. As shown in Figure 7, for the non-AKI patient, mechanical ventilation (mechvent) played a major positive role in the prediction result, sodium (min) played a major negative role, and the SHAP value of the final model prediction for this patient was −0.25, which is <0 and is therefore considered to have successfully predicted the absence of AKI. For the AKI patient, bicarbonate played a major positive role in the prediction result, bilirubin (max) played a major negative role, and the SHAP value of the final model prediction for this patient was 1.23, which is considered to have successfully predicted AKI.
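If the model outputs reported above are on the log-odds scale, which the decision rule of comparing the value with 0 suggests but the text does not state explicitly, they can be mapped to probabilities with the logistic function. The helper below is a hypothetical illustration of that mapping.

```python
import math


def log_odds_to_probability(margin: float) -> float:
    """Map a log-odds model output to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-margin))


# Assumption: the two force-plot outputs are log-odds values.
p_non_aki = log_odds_to_probability(-0.25)  # below 0.5, i.e., predicted non-AKI
p_aki = log_odds_to_probability(1.23)       # above 0.5, i.e., predicted AKI
```

Under this assumption, an output of 0 corresponds to a predicted probability of 0.5, so the sign of the output gives the predicted class.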

Discussion
Few studies have explored AKI prediction models based on machine-learning techniques in critically ill patients with GIB. The present study compared the accuracy of AKI prediction in patients with GIB admitted to the ICU using the machine learning technique XGBoost, the traditional statistical approach of logistic regression analysis, and previous risk scoring models (SOFA, OASIS, and APS III). The results showed that the XGBoost model had the largest AUC, accuracy, precision, and recall among all the techniques and risk scores. Moreover, the XGBoost+severity scores (SOFA, OASIS, and APS III) model exhibited better AKI prediction performance than XGBoost alone. XGBoost model-based prediction may substantially improve the prediction of AKI in patients with GIB admitted to the ICU. A risk estimator based on the XGBoost model was developed to determine the risk of AKI in high-risk patients with GIB.
Acute GIB is very common in patients in the ICU (27). Mortality in patients with acute GIB is very high, approaching 48.5-65% (28,29). According to previous reports, AKI occurs in ∼25% of patients with acute GIB. Although AKI accounts for a small number of complications in critically ill patients with GIB, the mortality rate of critically ill patients with AKI is higher than that of patients with severe GIB without AKI. Xie et al. found that AKI occurred in 30% of patients with cirrhosis and that patients with cirrhosis and AKI had a worse prognosis (37 vs. 3%) (30). Moreover, a study by Kim et al. showed that the 6-week mortality rate of cirrhotic patients with new-onset AKI was significantly higher than that of patients without AKI (31). The early identification of AKI can effectively prevent disease progression. However, there is currently a lack of reliable and effective predictive models for such patients, which warrants the development of a reliable AKI predictive model to identify high-risk critically ill patients with GIB.
With the advent of big data, ML has great potential in the field of AKI research owing to its strong data-processing ability. Therefore, machine learning models may be powerful tools for AKI risk stratification and prediction (32). Several ML techniques have been used to predict AKI in different disease settings (33-35). However, the use of ML techniques to predict AKI in critically ill patients with GIB has not been investigated. As an ML technique, XGBoost is a highly efficient boosting ensemble learning model that originated in the decision tree model, using a tree classifier for better prediction results and higher operation efficiency (36). Several studies have found that XGBoost is superior to other machine learning techniques. Liu et al. reported that XGBoost exhibited the best performance in predicting mortality in patients with AKI in the ICU, with the highest AUC, F1 score, and accuracy compared with logistic regression, support vector machines, and random forest (37). Yue et al. aimed to establish and validate predictive models based on novel machine learning algorithms for AKI in critically ill patients with sepsis and found that the XGBoost model had the best predictive performance in terms of discrimination, calibration, and clinical application among all models, including logistic regression, SOFA, and the customized Simplified Acute Physiology Score (SAPS) II model (16). Qu et al. used support vector machine, random forest, classification and regression tree, and XGBoost models to predict AKI and compared their predictive performance with that of a classic logistic regression model; XGBoost performed best in predicting AKI among the machine learning models (34). Hence, the XGBoost algorithm was applied to structured and unstructured patient data from electronic medical records to develop an AKI prediction model in the present study. Consistent with previous reports (16,34,37), the XGBoost model was better than the traditional logistic regression model and previous risk scoring models (SOFA, OASIS, and APS III).
A SHAP analysis was used to determine the quantitative impact of each feature on AKI prediction based on SHAP values. The results of our study demonstrated that bilirubin and albumin were the most influential features among the physiological measurements. A UK-wide study in acute medical units that investigated patients at risk of developing AKI in hospital found that elevated serum bilirubin was independently associated with AKI development (44). Moreover, Wang et al. indicated that lower serum albumin levels were independently associated with a greater risk of contrast-induced AKI among patients who underwent percutaneous coronary intervention (45). In addition, mechanical ventilation (mechvent), bicarbonate, and RRT displayed strong predictive power, reflecting their roles in AKI prediction in critically ill patients with GIB.
Nevertheless, this study has some limitations. First, the present study extracted data from two large public databases, and additional external clinical datasets may be needed to verify the results. Second, we collected data during the first 24 h of ICU stay, and more dynamic time-point data are needed in future studies. Moreover, the practical use of the predictive model is constrained because its variables are measured at different time points (e.g., patients' first creatinine and highest bilirubin); in practice, the model could not be used to predict AKI risk until data from all of these time points had been collected. Finally, the dataset was imbalanced, so the predictive model developed using the machine learning algorithms could be biased and inaccurate. The results of this study should be further validated in the future.

Conclusion
This study utilized an XGBoost-based model to predict AKI in patients with GIB admitted to the ICU. The results demonstrated that it is feasible to apply XGBoost-based prediction models to the management of critically ill patients with GIB and that this model has better predictive performance than classic logistic regression methods and severity score models. The XGBoost-based model in this study has not been verified in clinical cohorts beyond the two public databases, and further prospective, large-sample studies are needed to determine its clinical application and verify our conclusion.

FIGURE 1 The flow chart of this study.

FIGURE 2 Framework of the prediction models.

FIGURE 4 ROC curves of the prediction models using all features as well as three common severity scores for predicting AKI in the training set (A), in the internal validation set (B), and in the external validation set (C).

FIGURE 5 The performance of the logistic, XGBoost, and XGBoost+severity scores models for AKI. The calibration curves of the logistic, XGBoost, and XGBoost+severity scores models for AKI in the training set (A), in the internal validation set (B), and in the external validation set (C). The decision curve analysis of the logistic, XGBoost, and XGBoost+severity scores models for AKI in the training set (D), in the internal validation set (E), and in the external validation set (F).

FIGURE 6 SHAP summary plot of the features of the XGBoost model. The higher the SHAP value of a feature, the higher the probability of AKI development. A dot is created for each feature attribution value for the model of each patient, and thus one patient is allocated one dot on the line for each feature. Dots are colored according to the values of features for the respective patient and accumulate vertically to depict density. Red represents higher feature values, and blue represents lower feature values. Creatinine_max, maximum serum creatinine; first_time_creatinine, the first measurement of serum creatinine after ICU admission; bilirubin_min, minimum bilirubin; mechvent, mechanical ventilation.

FIGURE 7 The two representative SHAP force plots of non-AKI and AKI patients in the training set.
TABLE 1 Comparisons of baseline characteristics in all cohorts.
SOFA, OASIS, and APS III). The XGBoost+severity score model exhibited the highest accuracy, recall, precision, AUC, AUC-PR, and F1 score. There are several studies in this respect. Although the studies have been conducted on different data and are not comparable, studies employed traditional ML techniques to
TABLE 2 Performance of the prediction models using all features.