A Machine Learning-Based Prediction of Hospital Mortality in Patients With Postoperative Sepsis

Introduction: The incidence of postoperative sepsis is continually increasing, yet few studies have specifically focused on the risk factors and clinical outcomes associated with the development of sepsis after surgical procedures. The present study aimed to develop a mathematical model for predicting in-hospital mortality among patients with postoperative sepsis. Materials and Methods: Surgical patients in the Medical Information Mart for Intensive Care (MIMIC-III) database who simultaneously fulfilled the Sepsis 3.0 and Agency for Healthcare Research and Quality (AHRQ) criteria at ICU admission were included. We employed both extreme gradient boosting (XGBoost) and a stepwise logistic regression model to predict in-hospital mortality among patients with postoperative sepsis. Model performance was then assessed in terms of discrimination and calibration. Results: We included 3,713 patients who fulfilled our inclusion criteria, of whom 397 (10.7%) died during hospitalization and 3,316 (89.3%) survived to discharge. Fluid-electrolyte disturbance, coagulopathy, renal replacement therapy (RRT), urine output, and cardiovascular surgery were important features related to in-hospital mortality. The XGBoost model performed better than the stepwise logistic regression model and the baseline model in both discriminatory ability (c-statistics, 0.835 vs. 0.737 and 0.621, respectively; AUPRC, 0.418 vs. 0.280 and 0.237, respectively) and goodness of fit (visualized by calibration curve). Conclusion: The XGBoost model performs better than the stepwise logistic regression model in predicting hospital mortality among patients with postoperative sepsis. Machine learning-based algorithms might have significant application in the development of early warning systems for septic patients following major operations.


INTRODUCTION
Sepsis is a severe complication following major surgery and is responsible for poor outcomes in postoperative patients by inducing multiple organ dysfunction and increasing in-hospital mortality. Although great progress has been made in early recognition and therapeutic strategies, the incidence and mortality of septic complications remain unacceptably high (1)(2)(3). It has been documented that ∼30% of septic patients develop sepsis after surgical procedures, and the number of patients who develop postoperative sepsis increases annually (4)(5)(6). Given the high incidence and poor prognosis, the Agency for Healthcare Research and Quality (AHRQ) defined "postoperative sepsis" as a critical patient safety indicator, which mainly focuses on preventable surgical complications and iatrogenic events after surgical procedures (7,8).
Various lines of evidence have demonstrated that an immunocompromised state is strongly associated with the pathogenesis of postoperative sepsis (9). For example, impaired antigen-presenting capacity of monocytes and dominant differentiation of type 2 helper T cells have both been characterized in animal models of postoperative sepsis (10)(11)(12). Meanwhile, researchers identified disparate gene expression profiles of whole blood cells from surgical patients with or without postoperative sepsis, and found that the expression patterns of interleukin (IL) 1β (IL-1β), tumor necrosis factor (TNF) superfamily, member 2, and CD3D were significantly different (13). However, the "Surviving Sepsis Campaign" (SSC) guidelines did not provide distinctive treatments for postoperative sepsis (14). Moreover, there have been insufficient clinical trials that specifically tested the guidelines in the postoperative sepsis cohort. Most studies examined short-term mortality in septic patients admitted to the emergency department or intensive care unit (ICU), covering multiple types of sepsis (8,15,16). In contrast, few studies have specifically characterized the clinical outcomes of patients with postoperative sepsis.
In the present study, we aimed to establish a predictive model of in-hospital mortality among patients with postoperative sepsis. Given the limitations of conventional statistical methods in processing retrospective data that contain highly correlated covariates and inevitable missing values, we employed an advanced machine learning algorithm, extreme gradient boosting (XGBoost), to identify the important clinical features for predicting in-hospital mortality.

Database
Medical Information Mart for Intensive Care-III (MIMIC-III), a large online critical care database, was used for the current study (17). MIMIC-III is a comprehensive dataset containing clinical data of all patients admitted to the ICU of Beth Israel Deaconess Medical Center (BIDMC) in Boston, Massachusetts, from 2001 to 2012. In brief, it includes more than fifty thousand distinct adult (aged >16 years) ICU patients and approximately eight thousand neonate cases. We obtained permission to access the database after completing "Protecting Human Research Participants," an online training course launched by the National Institutes of Health (NIH) (certification number: 32450965). We conducted this study in accordance with the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) recommendation (18).

Study Population
The selection of patients was based on the "postoperative sepsis" criteria proposed by AHRQ combined with the Sepsis 3.0 criteria, in which sepsis is diagnosed by a sequential organ failure assessment (SOFA) score ≥2 plus documented or suspected infection (7,19). Infection was confirmed in accordance with ICD-9 codes in the MIMIC-III database. In this study, we included all patients (aged >18 years) who underwent surgical procedures prior to ICU admission and fulfilled the Sepsis 3.0 criteria within 24 h after ICU admission. Patients who met the AHRQ selection criteria were nevertheless excluded if they: (1) had a principal or secondary diagnosis of sepsis or infection on admission; (2) were diagnosed with cancer or had another immunocompromised state, including hematologic malignancies, HIV, prolonged use of corticosteroids, and organ transplantation; (3) were admitted to the ICU with pregnancy, childbirth, or puerperium; (4) stayed in hospital <4 days; or (5) had incomplete or unobtainable medical records on admission.

Variables Extraction and Outcome Measurement
Clinical and laboratory variables were collected within the first 24 h after ICU admission. Demographic data were obtained, including age, gender, body mass index (BMI), and elective surgical type. Laboratory findings, including white blood cell (WBC) counts, hematocrit, platelet counts, glucose, lactate, creatinine, blood urea nitrogen (BUN), coagulation profile, chloride, potassium, sodium, bicarbonate, albumin, bilirubin, partial pressure of arterial oxygen (PaO2), partial pressure of arterial carbon dioxide (PaCO2), total CO2, and pH, were incorporated. In addition, vital signs, including blood pressure, respiratory rate, heart rate, and body temperature, were included. Comorbidities, such as congestive heart failure, cardiac arrhythmia, neurological disorders, diabetes, anemia, and obesity, were also recorded. Prognostic scoring systems, including the SOFA score, Oxford Acute Severity of Illness Score (OASIS), Simplified Acute Physiology Score II (SAPSII), and Glasgow Coma Scale (GCS), were calculated and analyzed using variables obtained in the first 24 h of admission. Notably, for indicators with multiple measurements, both the maximum and minimum values were collected and analyzed.
Because extensive missing data might introduce bias, all eligible predictors were screened, and variables with more than 30% missing values were excluded from subsequent model development. We conducted multivariate imputation for variables with <30% missing values.
We chose in-hospital mortality as our primary endpoint, which was defined as survival status at hospital discharge.
Patients without outcome information were excluded from the final cohort.
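The 30% missingness screen described above can be illustrated with a minimal Python sketch. The function names, the toy table, and the decision to keep a variable sitting at exactly 30% are assumptions made for illustration; the text does not specify that boundary case, and the multivariate imputation step is not shown.

```python
from typing import Dict, List, Optional

Column = List[Optional[float]]  # None marks a missing measurement


def missing_fraction(values: Column) -> float:
    """Return the share of entries that are missing (None)."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def screen_predictors(table: Dict[str, Column],
                      threshold: float = 0.30) -> Dict[str, Column]:
    """Keep only columns whose missing fraction does not exceed the threshold."""
    return {name: col for name, col in table.items()
            if missing_fraction(col) <= threshold}


# Toy values invented purely for illustration (not MIMIC-III data).
toy = {
    "lactate": [1.2, None, None, None, 2.5],       # 60% missing, dropped
    "sodium": [138.0, 140.0, None, 135.0, 141.0],  # 20% missing, kept
}
kept = screen_predictors(toy)
print(sorted(kept))  # ['sodium']
```

Variables that survive the screen would then be passed to an imputation routine of the analyst's choice.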

Statistical Analysis
Baseline characteristics of enrolled participants were presented and compared between survivors and non-survivors using the Student t-test, Chi-square test, or Mann-Whitney U-test as appropriate. Continuous variables were characterized as mean (standard deviation [SD]) or median (interquartile range [IQR]), while categorical or ranked data were reported as count and proportion.
We employed a stepwise logistic regression model to select predictors of in-hospital mortality. Both forward and backward directions were used in the variable selection process, with the Akaike Information Criterion (AIC) as the selection criterion for the optimal model. Furthermore, we applied the extreme gradient boosting (XGBoost) model to predict in-hospital death among patients with postoperative sepsis. XGBoost is a machine learning algorithm that iteratively fits weak classifiers to the residuals of the previous models, meaning that each new weak classifier is built on the preceding ones in order to improve predictive performance (20,21). In each round of iteration, it focuses more on misclassified observations. Once the eligible variables are entered into the model, it outputs an importance score for each variable. XGBoost can also process missing data automatically by assigning a default direction to null values. To reach the optimal performance of XGBoost, we assessed and tuned the hyperparameters, including learning rate, maximum tree depth, number of estimators, alpha, and lambda. In this study, the original dataset was randomly divided into five subsets. One subset was used as the testing subset, while the other four were used to tune the hyperparameters: 25% of these data were reserved for calibration, and four-fold cross-validation with grid search was conducted on the remaining 75%. The hyperparameters with the highest area under the receiver operating characteristic curve (AUROC) were selected. The tuned XGBoost hyperparameters were subsequently used to train and calibrate the model, which was further validated in the held-out testing subset (22). The detailed process for tuning hyperparameters is provided in Supplemental Figure S1.
Model performance of both models was assessed in multiple dimensions. To test discriminatory ability, we used the receiver operating characteristic (ROC) curve and c-statistic. The calibration plot, which shows the agreement between observed and predicted risk, was used to evaluate goodness of fit. The area under the precision-recall curve (AUPRC) provides a robust metric for unbalanced datasets and has become a critical measure for assessing model performance. Accordingly, the precision-recall curve with AUPRC, accuracy, and recall were also used to evaluate the models. Of note, the SOFA score, which is commonly used to evaluate severity in septic patients, served as a baseline model and was compared with the stepwise logistic regression and XGBoost models. The aforementioned statistical analyses were performed using IBM SPSS Statistics software (version 23.0), Python software (version 3.4.3), and R software (version 3.6.1). A two-tailed P < 0.05 was considered statistically significant.
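The nested data partition described in this paragraph can be hard to follow in prose, so a minimal index-level sketch is given here. It assumes simple random shuffling without stratification and an arbitrary seed; the function name `partition` and the rounding of subset sizes are hypothetical, since the text does not report how the authors implemented the split.

```python
import random
from typing import Dict, List


def partition(n: int, seed: int = 0) -> Dict[str, list]:
    """Split n patient indices into test, calibration, and CV tuning sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)

    # Step 1: five roughly equal subsets; one is the held-out testing subset.
    folds = [idx[i::5] for i in range(5)]
    test = folds[0]
    development = [i for fold in folds[1:] for i in fold]

    # Step 2: 25% of the remaining data is reserved for calibration.
    n_cal = len(development) // 4
    calibration = development[:n_cal]
    tuning = development[n_cal:]

    # Step 3: four-fold cross-validation splits on the remaining 75%.
    cv_folds = [tuning[i::4] for i in range(4)]
    cv_splits = []
    for k in range(4):
        valid = cv_folds[k]
        train = [i for j, fold in enumerate(cv_folds) if j != k for i in fold]
        cv_splits.append((train, valid))

    return {"test": test, "calibration": calibration,
            "tuning": tuning, "cv_splits": cv_splits}


parts = partition(3713)  # cohort size reported in this study
print(len(parts["test"]), len(parts["calibration"]), len(parts["tuning"]))
# 743 742 2228
```

In a grid search, each candidate hyperparameter set would be scored by its mean AUROC over the four `cv_splits`, and the testing subset would stay untouched until the final evaluation.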
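For readers less familiar with the two discrimination metrics named above, the following sketch computes them from first principles. It is not the authors' code: the c-statistic is written as the pairwise concordance probability, and the AUPRC is approximated by average precision, which is one common estimator among several. The labels and scores are toy values invented for illustration.

```python
from typing import Sequence


def c_statistic(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random death is scored above a random survivor.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (tied scores not grouped)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive case is required")
    order = sorted(range(len(y_true)), key=lambda i: y_score[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at each recalled positive
    return total / n_pos


# Toy example: 1 = died in hospital, 0 = survived.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
print(round(c_statistic(labels, scores), 3))        # 0.667
print(round(average_precision(labels, scores), 3))  # 0.756
```

A useful reference point when reading AUPRC values is that a classifier with no skill has an expected AUPRC close to the outcome prevalence, whereas its expected c-statistic is 0.5 regardless of prevalence.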

Participants
Among 46,520 patients in the MIMIC-III database, 15,302 met the Sepsis 3.0 criteria. There were 4,653 potentially eligible adult patients (aged ≥18 years) who underwent surgical procedures prior to ICU admission. After excluding 940 patients in accordance with the AHRQ exclusion criteria, 3,713 patients were deemed to have developed postoperative sepsis and were eventually incorporated into the study cohort, of whom 397 (10.7%) died during hospitalization and 3,316 (89.3%) survived to discharge. Detailed information on the enrollment and selection process is presented in Figure 1.
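As a quick consistency check on the cohort flow reported above, the arithmetic can be reproduced in a few lines. The variable names are illustrative; all counts are taken directly from the paragraph.

```python
# Counts as reported in the Participants paragraph.
eligible = 4653   # surgical patients meeting Sepsis 3.0 criteria
excluded = 940    # removed under the AHRQ exclusion criteria
deaths = 397
survivors = 3316

cohort = eligible - excluded
mortality_pct = round(100 * deaths / cohort, 1)
survival_pct = round(100 * survivors / cohort, 1)

print(cohort)         # 3713
print(mortality_pct)  # 10.7
print(survival_pct)   # 89.3
```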

Stepwise Logistic Regression Model
We performed stepwise logistic regression analysis with both forward and backward methods, in which the classifier incorporated 36 variables into the final model. As shown in

XGBoost Model
After tuning and grid search, the hyperparameters applied in the current XGBoost model were as follows: learning rate = 0.01, number of estimators = 1,000, maximum depth of a tree = 5, alpha = 0, and lambda = 0. Feature importance was measured by weight, calculated as the number of times that a feature was used to split the data across all trees. Feature importance revealed the relative contribution of each variable to predicting in-hospital mortality. As shown in Figure 2, fluid-electrolyte disturbance and coagulopathy were the top-ranked variables correlated with in-hospital death among patients with postoperative sepsis, followed by renal replacement therapy (RRT), urine output, cardiovascular surgery, and digestive disorders.
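The weight-based importance described above simply counts split nodes per feature. The sketch below illustrates the idea on hand-written trees stored as nested dictionaries; this tree representation, the function names, and the two toy trees are assumptions for illustration and do not reflect the fitted model. In the XGBoost Python package, the corresponding quantity is what `Booster.get_score` reports with `importance_type="weight"`.

```python
from collections import Counter
from typing import Any, Dict, List

Node = Dict[str, Any]  # either {"leaf": value} or a split node


def count_splits(node: Node, counts: Counter) -> None:
    """Recursively add one count for every split that uses a feature."""
    if "leaf" in node:
        return
    counts[node["feature"]] += 1
    count_splits(node["left"], counts)
    count_splits(node["right"], counts)


def weight_importance(trees: List[Node]) -> Dict[str, int]:
    """Number of times each feature is used to split, summed over all trees."""
    counts: Counter = Counter()
    for tree in trees:
        count_splits(tree, counts)
    return dict(counts.most_common())


# Two hypothetical trees; structure and leaf values are invented.
toy_trees = [
    {"feature": "coagulopathy",
     "left": {"leaf": -0.2},
     "right": {"feature": "urine_output",
               "left": {"leaf": 0.4},
               "right": {"leaf": 0.1}}},
    {"feature": "coagulopathy",
     "left": {"leaf": -0.1},
     "right": {"leaf": 0.3}},
]
importance = weight_importance(toy_trees)
print(importance)  # {'coagulopathy': 2, 'urine_output': 1}
```

Because weight counts splits rather than the improvement each split brings, continuous variables with many candidate thresholds can rank higher than binary variables of similar predictive value.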

Evaluation of Model Performance
The discriminatory power of both the stepwise logistic regression and XGBoost models was evaluated using ROC analysis and c-statistics (calculated by AUROC) in the testing subset. The XGBoost model had a significantly higher c-statistic than the stepwise logistic regression and baseline models (c-statistics, 0.835 vs. 0.737 and 0.621, respectively), suggesting a better discriminative capacity (Figure 3A). As presented in Figure 3B, the XGBoost model also performed better in terms of the precision-recall curve when compared to the stepwise logistic regression and baseline models (AUPRC, 0.418 vs. 0.280 and 0.237, respectively). In addition, the accuracy of the XGBoost and stepwise logistic regression models was 0.88 and 0.76, respectively, and the recall was 0.10 and 0.75, respectively. Meanwhile, as shown in Figure 4, the calibration curves showed that XGBoost had a better goodness of fit than the logistic regression model and SOFA score.
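The combination of high accuracy (0.88) and low recall (0.10) reported for XGBoost is easier to interpret against the class imbalance. The sketch below computes both metrics from confusion-matrix counts and applies them to a trivial rule that predicts survival for every patient. Using the full-cohort counts here is an assumption made for illustration, since the confusion matrix and decision threshold for the testing subset are not reported.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all patients whose outcome was predicted correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp: int, fn: int) -> float:
    """Share of actual deaths that the model flagged (sensitivity)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Trivial rule: predict "survived" for everyone in the 3,713-patient cohort.
deaths, survivors = 397, 3316
trivial_accuracy = accuracy(tp=0, fp=0, tn=survivors, fn=deaths)
trivial_recall = recall(tp=0, fn=deaths)

print(round(trivial_accuracy, 3))  # 0.893
print(trivial_recall)              # 0.0
```

With roughly one death per nine survivors, accuracy near 0.89 is attainable without identifying any deaths, so recall and the precision-recall curve are the more informative measures for an early warning application.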

Major Findings
In the current study, we identified various clinical indicators that were associated with increased in-hospital mortality among ICU patients with postoperative sepsis. By applying a machine learning algorithm, we found that fluid-electrolyte disturbance, coagulopathy, RRT, urine output, and cardiovascular surgery were important features for predicting in-hospital death. In addition, the XGBoost model showed better discrimination and calibration than the conventional stepwise logistic regression model.

Relation to Other Works
Plenty of evidence has indicated that the development of sepsis is critically involved in short-term and long-term mortality of postsurgical patients (8,15,23,24). A large nationwide epidemiological study of patients undergoing elective surgery revealed an increased incidence of postoperative sepsis, from 0.3% in 1997 to 0.9% in 2006, while in-hospital mortality significantly decreased over the same period (from 44.4 to 30%) (23). Recently, Ou et al. conducted a population-based analysis of patients who underwent coronary artery bypass grafting (CABG) surgery, and they found that the incidence of postoperative sepsis was ∼2% and that the mortality of those patients admitted to public and private hospitals was 11.9% and 18.3%, respectively (15). In a retrospective analysis, Mørch et al. focused on the clinical outcomes of patients who developed postoperative sepsis after hip fracture surgery. They documented a 30-day mortality of 15.8% among those patients, which was significantly higher than in patients without postoperative sepsis (24). In our study, we identified an in-hospital mortality of 10.7% among ICU patients who developed postoperative sepsis. We observed an evident decline in mortality rates among surgical patients with sepsis over the past decades, while the morbidity rates showed a sustained increase. The reduction in overall mortality rates might be attributed to progress in perioperative care and the extensive use of antibiotics. Meanwhile, the mortality of patients with postoperative sepsis differed from that of other types of septic patients, which could be explained by different clinical settings and comorbidities.

Clinical Implications
The XGBoost model is capable of accurately predicting in-hospital death among patients with postoperative sepsis.
Although several studies have identified risk factors for short-term or long-term mortality of septic patients following major operations, few of them have established feasible models to predict the clinical outcomes of those patients. Unlike other types of sepsis, postoperative sepsis has some unique characteristics in both etiology and pathophysiology, which make it a specific subset (5). Therefore, it is of great importance to recognize early those patients with postoperative sepsis who are at high risk of death and to identify preventable indicators. With recent advances in machine learning techniques, the number of variables and indicators that can be processed has increased substantially. Taken together, advanced machine learning algorithms allow us to establish a model that performs better than conventional generalized linear models. By applying such models, physicians and caregivers could be alerted when ICU patients develop postoperative sepsis, and could thereby employ efficient yet personalized therapeutic strategies. Although the effectiveness of the XGBoost model was validated in our study, the model was based on a single-center retrospective database. Thus, further prospective cohort studies are required to evaluate the generalizability of this model. Our results revealed that coagulopathy and the coagulation profile at ICU admission, including platelet counts, PT, APTT, and INR, were associated with increased in-hospital death among patients with postoperative sepsis. Coagulopathy is commonly seen in septic patients and is closely related to organ dysfunction and poor outcomes (25,26). The activation of monocytes and endothelial cells is mainly characterized in the early phase of sepsis and results in massive exposure to tissue factor, thereby contributing to the overactivation of coagulation and subsequent thrombin generation (27).
Concomitantly, anticoagulant pathways, such as the protein C system, are impaired by overexpression of proinflammatory cytokines (27). The imbalance between coagulation and anticoagulant pathways can be further augmented by surgical insults, and it leads to the upregulation of plasminogen activator inhibitor and subsequent hyperfibrinolysis (28,29). Coagulation abnormalities have been reported to induce formation of microvascular clots and disseminated intravascular coagulation (DIC), further resulting in tissue ischemia and organ dysfunction (30,31). Of note, the majority of patients in our study had undergone cardiovascular surgery. Some of those patients might frequently receive anticoagulant agents, which could add to coagulation abnormalities. Early implementation of rotational thromboelastometry (ROTEM) and thrombelastography (TEG) appears to be beneficial for patients with postoperative sepsis who are at high risk of death (32,33). As documented in large randomized controlled trials (RCTs), the administration of either antithrombin III or human recombinant thrombomodulin could improve short-term mortality among septic patients, but no trials specifically targeted patients with postoperative sepsis (34,35). The results of our study suggest that secondary analyses of previously published RCTs and future large trials would both be valuable for better recognition and treatment of septic patients following major operations. In addition, our models identified that fluid-electrolyte disturbance and sodium and chloride levels were associated with in-hospital mortality, which could be explained by the deleterious effects of acidosis on fibrin polymerization and clot integrity (28,36). In the present study, we noticed that ICU patients who underwent neurosurgery showed the highest in-hospital mortality compared to those with other types of surgery.
Meanwhile, the analysis indicated that neurosurgery was a robust predictor of in-hospital death among patients with postoperative sepsis. Neurosurgical procedures might bring about severe complications, including intracerebral hemorrhage, brain edema, and cerebral ischemia, which have serious impacts on the clinical outcomes of neurosurgical patients (37). Furthermore, neurosurgical insults could affect the hypothalamic-pituitary-adrenal axis and hormone production, resulting in intractable immunosuppression (38). Therefore, well-performed neurocritical care is warranted for neurosurgical patients, especially for those with postoperative sepsis (39).

Limitations
There are some limitations to our study. Firstly, the current study was a single-center retrospective analysis using a publicly available database, which prevented us from identifying causal relationships between variables and endpoints. Thus, prospective cohorts are needed for further validation. Secondly, several potential confounding variables could not be assessed because of extensive missing data and other reasons, although some of the excluded variables might have predictive value for clinical outcomes. Thirdly, we employed the XGBoost model, a machine learning-based algorithm that is not widely applied in clinical research. Although XGBoost had a significantly higher accuracy in predicting outcomes compared to generalized linear models, overfitting remained a concern. Given that, external validation is required to test its utility. Meanwhile, algorithms using other boosting strategies, such as adaptive boosting (AdaBoost), were not tested in the current study, which might have prevented us from developing a more efficient model for predicting our endpoint. Finally, our study focused only on in-hospital mortality of patients with postoperative sepsis, while other outcomes, such as long-term mortality and readmission rates, are also important and need further investigation.

CONCLUSIONS
In summary, these results suggest that several important features are potentially related to in-hospital mortality among ICU patients with postoperative sepsis. The XGBoost model is capable of processing a large number of variables and capturing the complicated relationships among them, and it performs better in mortality prediction than the stepwise logistic regression model. Further validation of our model in external datasets may help clinicians to recognize early those patients with postoperative sepsis who are at high risk of death during hospitalization, and to implement timely and efficient treatments.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: https://mimic.physionet.org.