Machine Learning for Prediction of Sepsis-Associated Encephalopathy

Objective: The study aims to develop a machine learning model to predict the occurrence of sepsis-associated encephalopathy (SAE). Materials and Methods: The prediction model was developed in a primary cohort of 2,028 sepsis patients from June 2001 to October 2012, retrieved from the Medical Information Mart for Intensive Care (MIMIC III) database. A least absolute shrinkage and selection operator (LASSO) regression model was used for data dimension reduction and feature selection. The model was developed using multivariable logistic regression analysis and presented as a nomogram. The performance of the nomogram was evaluated in terms of calibration, discrimination, and clinical utility. Results: Nine features in septic patients were significantly associated with SAE. Predictors included in the individualized prediction nomogram were age, quick sequential organ failure assessment (qSOFA) score, and drugs including carbapenem antibiotics, quinolone antibiotics, steroids, midazolam, H2-antagonist, diphenhydramine hydrochloride, and heparin sodium injection. The area under the curve (AUC) was 0.743, indicating good discrimination. The calibration curves of the prediction model showed minor deviations from the ideal predictions. Decision curve analysis (DCA) suggested that the nomogram was clinically useful. Conclusion: We propose a nomogram for the individualized prediction of SAE with satisfactory performance and clinical utility, which could aid the clinician in the early detection and management of SAE.


INTRODUCTION
Sepsis-associated encephalopathy (SAE) is defined as diffuse brain dysfunction caused by a dysregulated host response without central nervous system (CNS) infection (Gofton and Young, 2012). Symptoms and signs range from mild inattentiveness or disorientation, agitation, and hypersomnolence to more severe disturbance of consciousness and coma (Chung et al., 2020).
Approximately 70% of the patients with bacteremia display neurological symptoms or signs ranging from lethargy to coma (Peidaee et al., 2018). SAE is associated with increased mortality, prolonged hospitalizations, and higher inpatient costs. It is also associated with worse scores on the Glasgow Coma Scale (GCS), the sequential organ failure assessment (SOFA), the quick sequential organ failure assessment (qSOFA), and the Acute Physiology and Chronic Health Evaluation II (APACHE II), and is followed by persistent cognitive and functional impairments (Iwashyna et al., 2010; Sonneville et al., 2017). Iwashyna et al. (2010) found that up to 70% of sepsis survivors exhibited lasting neurological impairment, including alterations in mood, cognition, and motor function, and up to 45% had neurocognitive impairments 1 year later. With a mortality rate of up to 63% (Eidelman et al., 1996) and the morbidities mentioned above, SAE can have a major effect on the healthcare system, the economy, and society.
Sepsis-associated encephalopathy is a diagnosis of exclusion. It is diagnosed in the absence of direct infection of the central nervous system, multi-organ failure, traumatic brain injury, fat embolism, and ingestion of illicit drugs (Iacobone et al., 2009). Due to sepsis complications and the lack of an early diagnosis system, diagnosis and management of SAE are often delayed, leading to significant morbidity and mortality. Early diagnosis and treatment for brain injury are crucial for the survival and prognosis of sepsis patients. Sonneville et al. (2017) reported that acute renal failure, hypoglycemia (< 3 mmol/L), hyperglycemia (> 10 mmol/L), hypercapnia (> 45 mmHg), hypernatremia (> 145 mmol/L), and Staphylococcus aureus infection were associated with the development of SAE. Recently, Yang et al. (2020) developed a nomogram to forecast mortality in patients with known SAE. However, to the best of our knowledge, there is currently no prediction model for the diagnosis of SAE. This study is the first attempt to establish a predictive nomogram for SAE, based on the sociodemographic and clinical data of 2,535 sepsis patients, to allow for individualized screening for SAE among septic patients.
The study aims to identify early and potential risk factors for SAE through a retrospective analysis of a large clinical database, and to establish a comprehensive prediction model for SAE. The proposed nomogram can assist in clinical decision-making and identify sepsis patients at high risk for SAE, who should undergo further investigative tests, thus promoting early diagnosis and management of SAE.

Data Source
Data for this study were drawn from the Medical Information Mart for Intensive Care (MIMIC III) database, covering June 2001 to October 2012. MIMIC III was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center and Massachusetts Institute of Technology. There was no requirement for individual patient consent because anonymized health information was used. MIMIC III is a publicly accessible single-center critical care database containing longitudinal data on 46,520 patients admitted to the ICU. The raw data were extracted using Structured Query Language (SQL) with Navicat and further processed with R software.
Abbreviations: SAE, sepsis-associated encephalopathy; SAPS II, simplified acute physiology score; SOFA, sequential organ failure assessment; qSOFA, quick sequential organ failure assessment; GCS, Glasgow coma scale; ICU, intensive care unit; MIMIC III, Medical Information Mart for Intensive Care III; ICD-9, International Classification of Diseases, Ninth Revision; ROC, receiver operating characteristic curve.

Sepsis-Associated Encephalopathy
We defined SAE in the study as sepsis with a GCS < 15 on the first day of ICU admission, delirium, cognitive impairment, altered mental status according to the ICD-9 code, and treatment with haloperidol. Altered consciousness due to other causes was excluded. GCS has been established as a clinically effective tool for characterizing SAE and distinguishing it from sepsis (Iwashyna et al., 2010). For patients who were sedated, postoperative, intubated, or receiving ventilator-assisted breathing, GCS scores were taken from before sedation.

Data Extraction and Management
R statistical software (R Foundation for Statistical Computing, Vienna, Austria) was used to retrieve patient information from the MIMIC III database. The following basic data were collected for each patient: age, sex, admission type, marital status, and the mean values of vital signs during the first 24 h of ICU stay (heart rate, systolic blood pressure, diastolic blood pressure, respiratory rate, and temperature). The first laboratory values after ICU admission were also collected, including alanine aminotransferase, aspartate aminotransferase, partial thromboplastin time, white blood cell count, lymphocytes, neutrophils, monocytes, eosinophils, hemoglobin, platelets, blood urea nitrogen, creatinine, and glucose. SAPS II, qSOFA, and SOFA scores, which assess the severity of illness and organ failure on the first day of ICU admission, and GCS were also recorded. Comorbidities as coded and defined by ICD-9 codes (Supplementary Material 9), site of infection (Supplementary Material 10), organ failure (Supplementary Material 11), ICU length of stay, and hospital mortality were also collected.

Statistical Analysis
Data distribution was analyzed using the Shapiro-Wilk test. Continuous variables were expressed as the mean ± standard deviation (SD) or the median (interquartile range, IQR); categorical variables were expressed as frequency and percentage. A non-parametric test (Mann-Whitney U test or Kruskal-Wallis test) was used for data with non-normal distribution or heterogeneity of variances. Categorical data were compared using the Pearson Chi-squared test.
Least absolute shrinkage and selection operator (LASSO) regression model was used for data dimension reduction and feature selection (Training set). A nomogram was constructed according to the multivariate logistic regression analysis results (Training set), and it was internally validated using a 1,000 bootstrap resampling procedure (Validation set). The performance of the nomogram was assessed using discrimination and calibration (Validation set). The proposed nomogram's discrimination ability was quantified with a receiver operating characteristic (ROC) curve analysis and the AUC. The calibration was carried out by plotting the calibration curve to analyze the association between the observed incidence and the predicted probability. Decision curve analysis was performed to assess the clinical utility of the nomogram (Training set and Validation set). Statistical analysis was conducted with R software (version 3.4.3). Statistical significance was defined as p < 0.05.
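For reference, LASSO-penalized logistic regression is usually written as the minimization below. This is the standard textbook form, not a formula taken from the study: the symbols are notation introduced here, with y_i the SAE indicator (0 or 1) for patient i, x_i the vector of p candidate features, n the number of patients in the training set, and lambda >= 0 the penalty weight. How lambda was tuned is not stated in this section.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\, \beta}
  \left\{ -\frac{1}{n} \sum_{i=1}^{n}
    \left[ y_i \left( \beta_0 + x_i^{\top}\beta \right)
      - \log\left( 1 + e^{\beta_0 + x_i^{\top}\beta} \right) \right]
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}
```

Larger values of lambda shrink more coefficients exactly to zero, which is why the method serves for feature selection as well as dimension reduction.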

Comparison of Baseline Patient Characteristics Between SAE and Non-SAE Groups and Between Primary and Validation Cohorts
A total of 2,535 patients met the inclusion criteria for the study. About 80% of the patients were randomly assigned to the primary cohort and 20% to the validation cohort. SAE was identified in about 41.6% of the patients. A matrix diagram of missing data is shown in the Data Profiling Report (Supplementary Material 12). We replaced any missing values of the included variables with their mean values. The recruitment process is shown in Figure 1.
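The mean imputation and the random 80/20 assignment described above can be sketched as follows. This is a minimal illustration, not the authors' code: the field names, the toy records, the random seed, and the order of the two steps (imputation before splitting) are all assumptions.

```python
import random
from statistics import mean


def impute_mean(rows, columns):
    """Replace None in each listed column with the mean of that column's observed values."""
    filled = [dict(r) for r in rows]  # copy so the input rows are left untouched
    for col in columns:
        observed = [r[col] for r in rows if r[col] is not None]
        col_mean = mean(observed)
        for r in filled:
            if r[col] is None:
                r[col] = col_mean
    return filled


def split_cohorts(rows, primary_fraction=0.8, seed=0):
    """Shuffle the rows and split them into primary and validation cohorts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * primary_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy records with hypothetical field names (not the study's data).
patients = [
    {"age": 70, "glucose": 8.0},
    {"age": 55, "glucose": None},
    {"age": None, "glucose": 6.0},
    {"age": 61, "glucose": 7.0},
    {"age": 80, "glucose": 9.0},
]
complete = impute_mean(patients, ["age", "glucose"])
primary, validation = split_cohorts(complete)
```

Applied to 2,535 records, the same split yields 2,028 primary and 507 validation records, which matches the primary cohort size reported in the study.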
Patient characteristics in the primary and validation cohorts are given in Table 1. There were no significant differences between the two cohorts in the proportion of SAE (P = 0.763), which was 41.5 and 42.2% in the primary and validation cohorts, respectively, and there were no significant differences in clinical characteristics between the cohorts, which justified their use as training and validation cohorts. Patients in the SAE group were older than those in the non-SAE group in both the primary and validation cohorts. More SAE patients used statins, beta-blockers, H2-antagonist, proton pump inhibitors, steroids, non-steroidal anti-inflammatory drugs (NSAIDs), aspirin, clopidogrel, and sodium bicarbonate compared to the non-SAE group. There were no significant differences in gender, admission type, marital status, or comorbidity between the SAE and non-SAE patients.

Feature Selection and SAE Signature Building
In the primary cohort of 2,028 patients (Figure 2), 89 features were reduced to nine potential predictors. These features are shown in the SAE-score calculation formula (Supplementary Material 13; Data supplement).

Development of an Individualized Prediction Model
A LASSO logistic regression analysis identified age, qSOFA, quinolone antibiotics, carbapenem antibiotics, midazolam, diphenhydramine hydrochloride, heparin sodium injection, steroids, and H2-antagonist as independent predictors (Supplementary Material 13). The nomogram included all the significant independent factors of the logistic regression model in the training cohort. Scoring criteria were established according to the odds ratio (OR) values of the risk factors, giving a score for each level of each factor. By summing the scores associated with each variable and projecting the total score onto the bottom scale, the probability of SAE could be estimated according to the individual characteristics of the patient. The diagnostic nomogram for SAE is shown in Figure 3.
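The way a nomogram turns logistic regression coefficients into points and then into a probability can be sketched as below. This is an illustration under stated assumptions, not the study's model: the intercept, coefficients, and value ranges are hypothetical, only three of the nine predictors are shown, and the 0 to 100 scaling (the predictor with the widest effect spans 100 points) is the common convention, assumed here with positive coefficients.

```python
import math

# Hypothetical intercept, coefficients, and value ranges for illustration only;
# these are NOT the fitted values from the study.
INTERCEPT = -2.0
COEFS = {"age": 0.02, "qsofa": 0.40, "midazolam": 0.60}
RANGES = {"age": (18, 98), "qsofa": (0, 3), "midazolam": (0, 1)}


def effect_span(name):
    """Change in log-odds as the predictor moves across its whole range."""
    low, high = RANGES[name]
    return COEFS[name] * (high - low)


# The predictor with the widest effect defines the 0-100 points axis.
MAX_SPAN = max(effect_span(name) for name in COEFS)


def points(name, value):
    """Points for one predictor value (assumes a positive coefficient)."""
    low, _ = RANGES[name]
    return 100.0 * COEFS[name] * (value - low) / MAX_SPAN


def nomogram(patient):
    """Return (total points, predicted probability) for one patient."""
    total = sum(points(name, patient[name]) for name in COEFS)
    # Log-odds when every predictor sits at the low end of its range (0 points).
    baseline = INTERCEPT + sum(COEFS[name] * RANGES[name][0] for name in COEFS)
    log_odds = baseline + total * MAX_SPAN / 100.0
    return total, 1.0 / (1.0 + math.exp(-log_odds))


total_points, probability = nomogram({"age": 78, "qsofa": 2, "midazolam": 1})
```

The total points are only a rescaled linear predictor, so reading the bottom scale of a nomogram gives the same probability as evaluating the logistic model directly.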

Discrimination and Calibration
To evaluate the calibration of the model, the study used internal validation with 1,000 bootstrap resamples, as shown in Figure 4. The calibration plot of SAE rates suggests good agreement between the observed and predicted values. We used the ROC curve to evaluate the discrimination capability of the model. The area under the curve (AUC) of the nomogram was 0.743 (95% CI: 0.720-0.766). At a cut-off of 0.435, determined by Youden's index, the model's sensitivity was 0.585 and its specificity was 0.879 (Figure 5).
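The AUC and the Youden cut-off reported above can be computed from predicted probabilities and observed outcomes as in the following sketch. The labels and scores are toy values chosen for illustration, not study data, and the function names are ours; the rule that a score at or above the cut-off counts as a positive prediction is an assumed convention.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(labels, scores):
    """Return (cutoff, sensitivity, specificity) maximizing sensitivity + specificity - 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        # A score at or above the cut-off is treated as a positive prediction.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= c)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1:]


# Toy outcomes (1 = SAE) and predicted probabilities, for illustration only.
labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.60, 0.65, 0.80]

area = auc(labels, scores)
cutoff, sensitivity, specificity = youden_cutoff(labels, scores)
```

Youden's index weights sensitivity and specificity equally, so the cut-off it selects is a statistical optimum rather than one tuned to the clinical cost of a missed case.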

Clinical Utility
The decision curve analysis (DCA) for the SAE nomogram is shown in Figure 6. In the primary cohort, the model provided a net benefit at threshold probabilities ranging from 10 to 90%: within this range, using the SAE nomogram to predict SAE was of more benefit than not using it. In the validation set, the model provided a net benefit at threshold probabilities ranging from 10 to 89%, and thus the model was beneficial in its prediction of SAE.
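The net benefit plotted in a decision curve follows the standard definition: true positives per patient minus false positives per patient, with the latter weighted by the odds of the threshold probability. The sketch below computes it for a model and for the treat-all strategy. The data are toy values, not study data, and the function names are ours.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted probability is at or above the threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every patient regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy outcomes (1 = SAE) and predicted probabilities, for illustration only.
labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.50, 0.60, 0.65, 0.80]

model_nb = net_benefit(labels, probs, threshold=0.5)
treat_all_nb = net_benefit_treat_all(labels, threshold=0.5)
treat_none_nb = 0.0  # treating nobody yields no true or false positives
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all and the treat-none values, which is the comparison a decision curve displays across the range of thresholds.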

DISCUSSION
In this study, SAE was identified in 41.6% of sepsis patients on admission to the ICU. A hospital mortality rate of up to 64.7% was observed in patients with SAE. We identified clinically relevant risk factors with a significant impact on the occurrence of SAE, including age, qSOFA score, and medications such as carbapenem antibiotics, quinolone antibiotics, steroids, midazolam, H2-antagonist, diphenhydramine hydrochloride, and heparin sodium injection. The study has established a comprehensive visual prediction model that can provide a probabilistic estimate of SAE at an early stage in individual sepsis patients. Furthermore, the nomogram showed satisfactory validity, discrimination, and clinical utility.
In this study, 41.6% of sepsis patients suffered from SAE. Previous studies have reported rates of SAE in patients with sepsis ranging from 8 to 70% (Bartynski et al., 2006). This wide range could be the result of different diagnostic criteria. Feng et al. (2019) reported a 42.3% incidence of SAE in septic patients, whereas Yang et al. (2020) reported a 50% incidence. Our result is consistent with these findings.
Our cohort study showed that SAE patients had higher SOFA, qSOFA, and APACHE II when compared to non-SAE patients and also a high hospital mortality rate of 64.7%. It shows that SAE patients with more severe organ dysfunction are associated with an increased risk of mortality and the related adverse clinical outcomes. The result is consistent with Yang et al.'s study. SAE patients presented significantly high APACHE II, SOFA scores, and 30-day mortality in a recent retrospective analysis involving more than 2,400 SAE patients (Yang et al., 2020). Feng et al. (2019) demonstrated that the incidence of 28-day mortality was 45.95% and 180-day mortality was 55.41%, and the multivariate stepwise regression analysis demonstrated that the risk of death in the SAE group was significantly higher than in the non-SAE group and that SAE was a risk factor for sepsis-related death (OR = 2.868). These results are consistent with our findings.
We identified clinical and potential risk factors for SAE, confirming that SAE patients were older and more often had urinary system infection than non-SAE patients. Sonneville et al. (2017) showed that, compared to the non-SAE group, patients in the SAE group were significantly older. The presence of comorbid urinary system infection in patients with delirium has been confirmed by many studies (Chae and Miller, 2015; Carson et al., 2017). Previous studies have also reported that urinary system infection is a risk factor for delirium (Gau et al., 2009; Dahl et al., 2010), that urinary tract infections (UTI) increase a subject's risk of developing delirium, or that urinary system infection is a "common cause" of delirium (Lerner et al., 1997; Kamel, 2005). One potential mechanism for the association between urinary system infection and neuropsychiatric disorders is the antibiotic treatment given for the infection. The antibiotics most frequently implicated were macrolides and fluoroquinolones. Mostafa et al. reported 15 cases of antibiotic-induced psychosis during treatment of a urinary system infection, with 60% of the cases determined to be "highly suggestive" of a causal relationship between antibiotic usage and psychosis (Mostafa and Miller, 2014).
FIGURE 3 | Nomograms for the prediction of the incidence of SAE in patients with sepsis. To use the model, firstly, the position of each variable on the corresponding axis should be determined. Secondly, draw a line to the points axis for the number of points, and then add the points from all the variables. Thirdly, draw a line from the total points axis to determine the incidence of SAE at the lower line of the nomogram. The total points projected to the bottom scale indicate the percentage probability of the incidence of SAE. *** < 0.001, ** < 0.05.
Contrary to previously published data (Zhang et al., 2012; Sonneville et al., 2017), we did not find any microorganism to be a pathogen associated with SAE. This may be attributable to differences in data sources.
The results of our cohort study show that medications were a critical risk factor for SAE. The most significant impact came from the use of antibiotics, followed by analgesics, sedative drugs, and other drugs. More SAE patients used antivirals, cephalosporins, penicillin, antifungals, macrolides, aminoglycosides, quinolones, and carbapenems; of these antibiotics, quinolones and carbapenems remained associated with SAE after variable filtering with the LASSO method. This method surpasses the approach of choosing predictors based on the strength of their univariable association with the outcome. In the study, the use of quinolones and carbapenems tended to be associated with SAE, in line with the study by Sonneville et al. (2017) and recent reviews highlighting their neurotoxicity (Bhattacharyya et al., 2016). Previous studies have suggested that the pathophysiology of quinolone- and carbapenem-associated encephalopathy involves a disturbance of gamma-aminobutyric acid-ergic (GABAergic) interneurons. These drugs affect the central nervous system mainly by inhibiting GABA receptors, interfering with inhibitory neurotransmission and enhancing bursts of excitatory neurons, in a concentration-dependent manner (Hori and Shimada, 1993; Munoz-Gomez et al., 2015). Our results also suggested that propofol, midazolam, and opioids were more likely to be used in SAE patients. Further analysis found midazolam to be a risk factor for SAE. It is widely accepted that midazolam is an independent risk factor for delirium in critically ill patients. In a large population-based cohort study, Zaal et al. found that the risk of delirium occurrence in critically ill adults is related to benzodiazepine use. A daily dose of only 5 mg of midazolam administered to a coma- and delirium-free patient increased the odds of this patient developing delirium the following day by 4%. This supports our results. A large number of SAE patients used vasopressors. On the one hand, SAE patients were found to be more critically ill and had a higher incidence of circulatory failure. On the other hand, the use of vasopressors was found to be a risk factor for SAE, although our further analysis has not confirmed this. Vasopressor use is a known risk factor for long-term cognitive impairment after critical illness. However, the specific mechanisms of these factors cannot be individually determined from our data, and this question requires further research.
FIGURE 4 | Calibration curves of a nomogram estimating the incidence of SAE in sepsis patients. Predicted and observed SAE rates are plotted as the logistic calibration. The y-axis represents the actual SAE occurrence rate. The x-axis represents the predicted SAE occurrence risk. The diagonal dotted line represents a perfect prediction by an ideal model. The blue solid line represents the performance of the nomogram, of which a closer fit to the diagonal dotted line represents a better prediction.
In addition, our cohort study demonstrated that the use of H2-antagonist, steroids, and heparin sodium injection were risk factors for SAE. Many studies have reported that H2-antagonist and steroids cause delirium (Nguyen et al., 2011; Mauran et al., 2016). Tawadrous et al. (2014) demonstrated that, compared to a lower dose, initiation of the current standard dose of histamine 2 receptor antagonists (H2RA) in older adults is associated with a small absolute increase in the 30-day risk of altered mental status. Yamada et al. (2018) found that steroid use was a determinant of progression to delirium in an intensive care unit, and research by Romain Sonneville et al. also found that steroid use was an independent risk factor for SAE (Yamada et al., 2018). These studies support our results. An interesting finding in the study was that heparin sodium injection and diphenhydramine hydrochloride were risk factors for SAE. The result is consistent with several other studies (Shobugawa et al., 2007; Rothberg et al., 2013; Bidaki et al., 2017). To our knowledge, however, this study is the first to demonstrate a strong association between heparin sodium injection, diphenhydramine hydrochloride, and SAE in a large cohort of sepsis patients. We therefore suggest monitoring patients for mental changes when these drugs are used.
The lack of validated predictive tools for early-stage SAE in sepsis patients and the equivocal efficacy of SAE interventions prompted us to develop a novel predictive modeling system using the nomogram methodology. For the construction of the clinical features and risk factors, 89 candidate features were reduced to nine potential predictors (age, qSOFA, carbapenem antibiotics, quinolone antibiotics, steroids, midazolam, H2-antagonist, diphenhydramine hydrochloride, and heparin sodium injection) by the LASSO method. These nine predictors were used to build a comprehensive visual nomogram for predicting SAE. The nomogram demonstrated adequate discrimination in the primary cohort (AUC, 0.743; 95% CI: 0.720-0.766), which surprisingly improved in the validation cohort (AUC, 0.762; 95% CI: 0.716-0.807). The nomogram developed and validated in this study assesses clinical variables only. Both physicians and patients could perform an individualized prediction of the risk of SAE with this easy-to-use scoring system, which is in line with the current trend toward personalized medicine. The most important reason for using the nomogram is the need to assess individual requirements for additional treatment and to improve patient outcomes. DCA was applied in this study. This novel method offers insight into clinical consequences based on the threshold probability, from which the net benefit can be derived. The decision curve showed that if the threshold probability of a patient or doctor was > 10%, then using the SAE nomogram to predict SAE added more benefit compared to not using the SAE nomogram.
FIGURE 6 | Decision curve analysis for the SAE nomogram. When no patient uses the predictive SAE model and none is treated for SAE, the net benefit is zero; Gray line: all patients used the predictive SAE model with effective treatment measures; Blue lines: if the probability from the SAE predictive model exceeds a threshold (ranging from approximately 10-90%), the patient needs to be treated immediately. For example, a patient would be treated for SAE if the probability was greater than 10%.
There were two limitations in the study. Firstly, the study was based on the MIMIC-III electronic database, whose data were generated during routine clinical practice. Thus, it is possible that the cohort selection is not exactly consistent with the guideline-based definition of sepsis, and neuroimaging data were not included in the database. Secondly, the study only conducted internal validation, and external validation is still needed. The current nomogram can therefore provide only a reference for predicting SAE, and further modifications may be required once diagnostic methods are developed.

CONCLUSION
A nomogram was established for individualized prediction of SAE in sepsis patients and it showed satisfactory performance. It can be conveniently used in the clinical setting and may help physicians to identify SAE patients on time. It can also help physicians to take timely intervention measures to reduce the incidence of SAE and improve patient prognosis.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.