A clinical prediction model based on interpretable machine learning algorithms for prolonged hospital stay in acute ischemic stroke patients: a real-world study

Objective: Acute ischemic stroke (AIS) imposes an increasingly heavy economic burden. Prolonged length of stay (LOS) is a vital factor in healthcare expenditures. The aim of this study was to predict prolonged LOS in AIS patients based on an interpretable machine learning algorithm.
Methods: We enrolled AIS patients in our hospital from August 2017 to July 2019 and divided them into the “prolonged LOS” group and the “no prolonged LOS” group. Prolonged LOS was defined as hospitalization for more than 7 days. The least absolute shrinkage and selection operator (LASSO) regression was applied to reduce the dimensionality of the data. We compared the capacity of eight different machine learning algorithms to predict prolonged LOS. SHapley Additive exPlanations (SHAP) values were used to interpret the outcome, and the optimal model was assessed by discrimination, calibration, and clinical utility.
Results: Prolonged LOS developed in 149 (22.0%) of the 677 eligible patients. Among the eight machine learning algorithms, prolonged LOS was best predicted by the Gaussian naive Bayes (GNB) model, which had an area under the curve (AUC) of 0.878 ± 0.007 in the training set and 0.857 ± 0.039 in the validation set. The variables sorted by the gap values showed that the strongest predictors were pneumonia, dysphagia, thrombectomy, and stroke severity. High net benefits were observed at 0%–76% threshold probabilities, while good agreement was found between the observed and predicted probabilities.
Conclusions: The model using the GNB algorithm proved excellent for predicting prolonged LOS in AIS patients. This simple model of prolonged hospitalization could help adjust policies and better utilize resources.


Introduction
With acute ischemic stroke (AIS) being the leading cause of disability and the second leading cause of mortality worldwide, economic burden remains a prominent issue in clinical practice (1). Length of stay (LOS) is a vital driver of healthcare expenditures. Pellico-Lopez et al. (2) found that 15.8% of the total cost of stroke cases depended on the cost of prolonged stay. Reducing unnecessary hospital stays is important to relieve insurance stress, especially under the policy of diagnosis-related groups (DRGs) payment. Therefore, it is essential to develop and analyze a risk model of prolonged LOS to relieve the economic burden and optimize the discharge plan for patients with AIS.
The average LOS following stroke onset varies by period and country. In the United States, the LOS for stroke hospitalizations decreased from 2004 to 2018, according to survey data from 8 million stroke patients (unadjusted: 6.3 days in 2004 vs. 5.6 days in 2018; adjusted: 7.6 days in 2004 vs. 5.4 days in 2018) (3). A post-hoc analysis (4) based on information from multiple sources in China found that the median (IQR) LOS for AIS was 10.0 (7.0-13.0) days. Hao et al. (5) found that malnutrition estimated by the CONUT score on admission could increase LOS in elderly AIS patients. Moreover, Neale et al. (6) found that stroke patients receiving an early supported discharge model of care spent fewer days in hospital and incurred less cost. In addition, the mode of treatment could also be related to the LOS after a stroke. Intravenous tissue plasminogen activator (IV-tPA) was associated with an increase in LOS in stroke patients treated with endovascular treatment within 4.5 h (7).
Only a few articles have established risk models for predicting the length of hospital stay in stroke patients. Koton et al. (8) evaluated the performance of the prolonged length of stay (PLOS) score in a stroke cohort and concluded that the PLOS score could be clinically useful in different healthcare systems. However, they only included patients from 2002 to 2007, and the treatments for stroke have developed dramatically in recent years. Nowadays, artificial intelligence is able to learn from voluminous datasets and to incorporate nonlinear interactions among a large set of predictors (9–11). Regarding machine learning for predicting prolonged LOS in AIS, Kurtz et al. (12) accurately predicted the LOS of patients admitted to the ICU with stroke through machine learning methods, but they did not include stroke-specific data, such as the National Institutes of Health Stroke Scale (NIHSS) score or neuroimaging findings. Yang et al. (13) found that the artificial neural network model achieved adequate discriminative power for predicting prolonged LOS after AIS and identified crucial factors associated with a prolonged hospital stay. However, they did not include pneumonia or other important onset symptoms of stroke, which have proved to be strong influencing factors of LOS in AIS patients.
As a result, we set out to gather extensive stroke-specific data and create a scientific risk model based on an interpretable machine learning algorithm to predict prolonged hospital LOS in AIS patients. This simple model of prolonged hospitalization could help adjust policies and better utilize resources.

Participant selection
This study consecutively enrolled AIS patients who were admitted to the Department of Neurology at the Second Affiliated Hospital of Xuzhou Medical University between August 2017 and July 2019 (Figure 1). The inclusion criteria were as follows: (1) age ≥ 18 years; (2) a diagnosis of AIS (14,15) within 24 h of onset (16,17). The exclusion criteria were as follows: (1) patients who needed to be transferred from one department (or hospital) to another; (2) patients who had in-hospital strokes; (3) patients who had a transient ischemic attack; and (4) patients for whom complete data could not be extracted. The flowchart shows that our hospital managed a total of 1,354 patients from August 2017 to July 2019, of whom 745 (55%) AIS participants had complete data (Figure 1). Of these 745 patients, 68 were excluded because they needed to be transferred from one department (or hospital) to another or had in-hospital strokes, leaving a final cohort of 677 patients. The retrospective review of medical health records for this study was approved by our Institutional Review Board. Owing to the retrospective nature of this study, written informed consent was waived (Number: 2020081603). Moreover, the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement was followed for all data analysis and reporting (18).

Data collection and definitions
The primary outcome was prolonged LOS in AIS patients, defined as more than 7 days of hospitalization. The LOS was measured from the day of admission to the day of death or discharge. This definition was similar to those of previous studies on LOS in stroke patients (8,19,20). The main clinical data included the following categories: baseline demographics, clinical features, and laboratory data. For baseline demographics, systolic blood pressure (SBP) and diastolic blood pressure (DBP) were measured on the right hand and extracted from the nursing record sheet on admission. For clinical features, stroke severity was divided into "mild" (NIHSS score < 8) and "moderate to severe" (NIHSS score ≥ 9), which was similar to previous clinical trials (21–23). Sato et al. (22) found that the optimal cutoff score of the baseline NIHSS for a favorable outcome was 8 for patients with anterior circulation stroke (sensitivity, 80%; specificity, 82%). Pneumonia in our study referred to pneumonia that developed within 72 h after hospitalization (24). We diagnosed pneumonia by the CDC criteria because they were the most commonly used (25). Dysphagia was defined as abnormal swallowing physiology of the upper aerodigestive tract, as detected by clinician testing, including screening, clinical bedside, or instrumental tests (26). Data on thrombolysis, thrombectomy, antiplatelets, anticoagulation, statins, and proton pump inhibitors were also collected from medical records. Treatment of AIS followed the 2019 American Heart Association/American Stroke Association (AHA/ASA) guideline (27). Laboratory data were extracted from blood test results on admission.
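As an illustration of the outcome definition above, the following minimal Python sketch labels a stay as prolonged when it exceeds 7 days. The function names, and the convention of counting LOS as the difference in calendar days between admission and discharge (or death), are assumptions made for this example rather than details reported by the study.

```python
from datetime import date

# Threshold stated in the text: prolonged LOS means more than 7 days.
PROLONGED_LOS_DAYS = 7


def length_of_stay(admission: date, end: date) -> int:
    """Days from admission to discharge or death.

    Assumption: LOS is the calendar-day difference; the study does not
    state how partial days were counted.
    """
    if end < admission:
        raise ValueError("end date precedes admission date")
    return (end - admission).days


def is_prolonged(los_days: int) -> bool:
    """True when the stay is strictly longer than the 7-day threshold."""
    return los_days > PROLONGED_LOS_DAYS


# Hypothetical dates, not patient data.
los = length_of_stay(date(2018, 3, 1), date(2018, 3, 12))
print(los, is_prolonged(los))
```

Under this convention a stay of exactly 7 days is not counted as prolonged.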

Machine learning algorithm and data analysis
Continuous data were presented as median and interquartile range (IQR), and the Mann-Whitney U-test was used for statistical comparison between the two groups. Categorical data were described as proportions, and the chi-squared or Fisher's exact test was used for comparison between the two groups. The least absolute shrinkage and selection operator (LASSO) regression was applied to reduce the dimensionality of the data. In total, we utilized eight different machine learning algorithms: the extreme gradient boosting (XGB) classifier, logistic regression, the light gradient boosting machine (LGBM) classifier, the AdaBoost classifier, Gaussian naive Bayes (GNB), complement naive Bayes (Complement NB), the multilayered perceptron (MLP) classifier, and the support vector classifier (SVC). The hyperparameter settings for the eight algorithms are listed in Supplementary Table 1. For the XGB classifier, the learning rate was set to 0.001 and reg lambda to 0.01; max depth and min child weight were both set to 2.
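Of the eight algorithms, Gaussian naive Bayes is simple enough to sketch in a few lines. The following is a minimal pure-Python illustration of the technique, not the implementation used in the study: the class name, the additive variance floor `var_smoothing`, and the toy data are all assumptions introduced for this example.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Minimal Gaussian naive Bayes: one normal density per feature and class."""

    def __init__(self, var_smoothing=1e-9):
        # Assumption: a small constant added to each variance for stability.
        self.var_smoothing = var_smoothing

    def fit(self, X, y):
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)
        self.classes_ = sorted(by_class)
        self.log_prior_, self.mean_, self.var_ = {}, {}, {}
        for c, rows in by_class.items():
            self.log_prior_[c] = math.log(len(rows) / len(y))
            cols = list(zip(*rows))  # transpose rows into feature columns
            self.mean_[c] = [sum(col) / len(col) for col in cols]
            self.var_[c] = [
                sum((v - m) ** 2 for v in col) / len(col) + self.var_smoothing
                for col, m in zip(cols, self.mean_[c])
            ]
        return self

    def _joint_log_likelihood(self, row, c):
        # log P(c) + sum over features of log N(x_j | mean_cj, var_cj)
        total = self.log_prior_[c]
        for v, m, s2 in zip(row, self.mean_[c], self.var_[c]):
            total += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return total

    def predict_proba(self, row):
        scores = {c: self._joint_log_likelihood(row, c) for c in self.classes_}
        top = max(scores.values())  # subtract the maximum before exponentiating
        expd = {c: math.exp(s - top) for c, s in scores.items()}
        z = sum(expd.values())
        return {c: e / z for c, e in expd.items()}


# Toy data with two made-up features; 1 = prolonged LOS, 0 = not prolonged.
X = [[1.0, 20.0], [1.2, 22.0], [0.9, 19.0], [3.0, 30.0], [3.2, 33.0], [2.9, 31.0]]
y = [0, 0, 0, 1, 1, 1]
model = GaussianNaiveBayes().fit(X, y)
print(model.predict_proba([3.1, 32.0]))
```

Because each feature contributes an independent term to the log-likelihood, fitting requires only per-class means and variances, which is why the method is fast on small clinical datasets.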
FIGURE 1 Flowchart of inclusion and exclusion of study patients. Abbreviation: LOS, length of stay.
The area under the receiver operating characteristic (ROC) curve of the model was calculated by 10 bootstrapping resamples. For each bootstrap resample, the validation set (135 cases) accounted for 20% of the total sample, and the training set (542 cases) accounted for 80% of the total sample. After selecting the best model classifiers for this dataset, we exploited SHapley Additive exPlanations (SHAP) values to interpret the outcomes of the classifiers; SHAP is a unified approach that connects cooperative game theory with local explanations to explain the output of any machine learning model. In addition, the decision curve analysis (DCA) was applied to present the net benefits at various threshold probabilities. A calibration plot was used to investigate the degree of agreement between the observed and predicted probabilities.
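The resampling scheme described in this section can be sketched as repeated random 80/20 splits with the AUROC summarized as mean ± standard deviation. This is a hedged illustration only: the function names, the `fit` callback interface, and the unstratified shuffling are assumptions, and the study's actual resampling code is not given in the text.

```python
import random
import statistics


def auroc(y_true, scores):
    """AUROC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUROC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _mean_sd(values):
    return statistics.mean(values), statistics.stdev(values)


def resampled_auroc(X, y, fit, n_resamples=10, val_fraction=0.2, seed=0):
    """Repeat a random train/validation split and summarize AUROC.

    `fit(X_train, y_train)` is a hypothetical callback that must return a
    function mapping one feature row to a risk score.
    Assumption: splits are not stratified, so a split lacking one class
    raises ValueError from `auroc`.
    """
    rng = random.Random(seed)
    indices = list(range(len(y)))
    n_val = round(val_fraction * len(y))  # 677 cases -> 135 validation cases
    train_aucs, val_aucs = [], []
    for _ in range(n_resamples):
        rng.shuffle(indices)
        val_idx, train_idx = indices[:n_val], indices[n_val:]
        score = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        train_aucs.append(
            auroc([y[i] for i in train_idx], [score(X[i]) for i in train_idx])
        )
        val_aucs.append(
            auroc([y[i] for i in val_idx], [score(X[i]) for i in val_idx])
        )
    return _mean_sd(train_aucs), _mean_sd(val_aucs)
```

With 677 patients, `round(0.2 * 677)` gives 135 validation cases and leaves 542 for training, matching the split sizes reported above.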

Patient characteristics
A total of 677 patients remained for evaluation of the machine learning algorithms to predict prolonged LOS in AIS patients, among whom prolonged LOS was detected in 22.0% (n = 149). The average LOS in all 677 participants was 10.78 ± 4.69 days. The baseline and clinical characteristics of the two groups are compared in Table 1. Longer LOS was linked to elevated levels of brain natriuretic peptide (BNP), S100-b, and neuron-specific enolase (NSE). Moreover, the prolonged LOS group was more likely to suffer from dysphagia, pneumonia, and a moderate-to-severe stroke. As for treatment, the prolonged LOS group had more frequent use of thrombolysis, thrombectomy, anticoagulation, and proton pump inhibitors (PPIs). Then, least absolute shrinkage and selection operator (LASSO) regression was used to reduce the number of factors, with an optimal λ of 0.002. The candidate characteristics were narrowed down to the following 28 features with nonzero coefficients: age, gender, diastolic blood pressure, anterior or posterior stroke, side of hemisphere, stroke lesion, single or multiple lesions, cholesterol, triglyceride, low-density lipoprotein (LDL), glycosylated hemoglobin (HbA1c), homocysteine (HCY), uric acid (UA), myoglobin (MB), and fibrinogen. The coefficients of characteristics selected by LASSO regression are illustrated in Figure 2.
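For reference, the role of λ in LASSO selection can be written out explicitly. The objective below assumes a logistic loss, which is a natural choice for a binary outcome but is not stated in the text; the symbols n (patients), p (candidate features), x_i (feature vector), y_i (1 for prolonged LOS, 0 otherwise), β_0, and β are notation introduced here for illustration.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\, \beta}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \left[
          y_i \left( \beta_0 + x_i^{\top} \beta \right)
          - \log\!\left( 1 + e^{\beta_0 + x_i^{\top} \beta} \right)
        \right]
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\}
```

The L1 penalty shrinks some coefficients exactly to zero, so the features retained are those with nonzero coefficients at the chosen λ (0.002 in this study).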

Development and validation of models
As shown in Table 2, the GNB model with all characteristics had an AUROC of 0.878 ± 0.007 in the training set and 0.857 ± 0.039 in the validation set, while the best of the other seven models had an AUROC of 0.875 ± 0.014 in the training set and 0.837 ± 0.031 in the validation set. For the GNB model, the sensitivities were 0.818 (training sets) and 0.804 (validation sets), while the specificities were 0.814 (training sets) and 0.816 (validation sets). The cross-reference between the full names and abbreviations in our manuscript is shown in Supplementary Table 2. The forest plot of the AUROC of each of the eight models is depicted in Figure 3. Figures 4A, B present the comparison of AUROC between the GNB model and the other seven models in the training and validation sets, respectively. The learning curve of the GNB model is displayed in Figure 5. The GNB model outperformed the other seven models in both the training and validation sets. Despite the narrow gap, DeLong's test showed that the difference between the GNB and XGB models remained significant (p = 0.04).

SHAP values depending on variables
The SHAP values for the GNB model and the importance of the variables sorted by the gap values are shown in Figures 6A, B. Red bars indicated an increase in the probability of prolonged LOS, whereas blue bars indicated a decrease in the probability of prolonged LOS for AIS patients. As Figure 6B shows, pneumonia, dysphagia, thrombectomy, and stroke severity all substantially increased the probability of prolonged LOS. In addition, we performed a decision curve analysis (Figure 7A) and a calibration plot (Figure 7B) to illustrate the performance of the GNB model. High net benefits could be observed at 0%-76% threshold probabilities, while good agreement could be found between the observed and predicted probabilities of prolonged LOS.
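The net benefit plotted in a decision curve can be computed with the standard formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The sketch below is illustrative only: the function names and toy values are assumptions, and it is not the code used to produce Figure 7A.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    Implements TP/n - FP/n * pt / (1 - pt) for a threshold probability pt.
    """
    if not 0 <= threshold < 1:
        raise ValueError("threshold must be in [0, 1)")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy in which every patient is treated as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy outcomes and predicted risks, not study data.
y = [1, 1, 0, 0, 0]
p = [0.9, 0.6, 0.7, 0.2, 0.1]
print(net_benefit(y, p, 0.5), treat_all_net_benefit(y, 0.5))
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none strategy).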

Discussion
This study generated a simple clinical risk model that can be used to identify patients at increased risk of prolonged LOS. Our risk model had a promising AUC of 0.878 and 0.857 in the training and validation sets, respectively. The main findings of the current study were that pneumonia, dysphagia, thrombectomy, and stroke severity were the strongest clinical parameters for prolonged LOS following AIS after recursive feature elimination. Moreover, the artificial intelligence algorithms developed from these parameters showed excellent model performance on discrimination, calibration, and decision curve analysis. The strengths of our clinical risk score included the use of simple demographic and common biochemical parameters, and we collected enough candidate variables to develop this model. To our knowledge, this is the first study to predict prolonged LOS for common AIS patients based on an interpretable machine learning algorithm. The difference from previous studies was that we developed an integrated machine learning model with high performance, which could help adjust policies to better utilize resources, especially under the DRG payment policy and the increasingly serious problem of population aging worldwide.
Su et al. (28) included 129,444 patients with AIS and found that the inpatient cost was $1,020 ($742-$1,545) in China. In an attempt to decrease patients' risk of prolonged LOS following AIS, previous retrospective studies have identified some factors. Many studies define prolonged LOS as more than 7 days (8,19,20). However, when it comes to patients with severe strokes or those admitted to an intensive care unit, some studies define it as more than 30 days (12,29). Common factors affecting stroke hospitalization duration included quality of care, hospital-acquired infection, stroke severity and type, level of consciousness, history of heart failure and atrial fibrillation, and receiving reperfusion therapy (19,29–33). Interestingly, during adolescence, low stress resilience, underweight, and higher systolic blood pressure were associated with longer hospital stays in AIS, with adjusted relative hazard ratios of 1.46, 1.41, and 1.01, respectively (34). However, these prior studies did not show the weight of each parameter on the probability of prolonged LOS.
An interpretable machine learning algorithm has the ability to analyze big datasets with high accuracy through automated analysis of non-linear relationships between numerous variables (35). Machine learning algorithms apply various statistical methods to past experience to identify useful patterns in large and complex datasets; examples include the extreme gradient boosting (XGB) classifier, GNB, the SVC classifier, and so on (36). Raizada et al. (37) summarized the advantages and limitations of different algorithms and found that GNB produced results that were statistically robust and replicated across two independent datasets. An additional advantage of GNB classifiers was that GNB produced an accuracy similar to more sophisticated classifiers but with a substantial gain in speed (38). Therefore, we selected the GNB model from eight different machine learning algorithms that showed excellent performance in predicting prolonged LOS in AIS patients.
FIGURE 3 The forest plot of the AUROC of each of the eight models. AUROC, area under the receiver operating characteristic curve; XGB, extreme gradient boosting; LGBM, light gradient boosting machine; GNB, Gaussian naive Bayes; CNB, complement naive Bayes; MLP, multilayered perceptron; SVM, support vector machine.
FIGURE 4 The comparison of AUROC between the GNB model and the other seven models.
In this study, pneumonia, dysphagia, thrombectomy, and stroke severity were the leading clinical parameters in our interpretable machine learning algorithm. Pneumonia is an early complication of stroke and usually leads to prolonged LOS. The prevalence of pneumonia in patients with dysphagia after stroke was reported to range from 7% to 33%, and the prevalence of dysphagia has been reported as between 28% and 65% (39,40). Aspiration without a cough, known as "silent aspiration," further increased the incidence of pneumonia to 54% (40). A systematic review of stroke-associated pneumonia reported that the overall incidence of pneumonia ranged from 0% to 23.6% (41), which was slightly lower than the incidence of 24.37% among all participants in our study. This difference may reflect the varied definitions and diagnostic criteria of stroke-associated pneumonia. The Centers for Disease Control and Prevention (CDC) criteria (25), the PISCES SAP diagnostic criteria (42), and criteria combining clinical symptoms with auxiliary examination results were all used to diagnose stroke-associated pneumonia in previous studies (41). In our study, we diagnosed pneumonia by the CDC criteria because they were the most commonly used; these combine clinical (lung auscultation and percussion, presence of fever, and purulent tracheal secretion), microbiological (tracheal specimens and blood cultures), and chest radiography findings.
For dysphagia, the incidence in all participants was 22.45%, while in the "prolonged LOS group" it was 59.73%, and in the "no prolonged LOS group" it was 11.93% (Table 1). The incidence of dysphagia varied greatly between studies (ranging from 20% to 80%), depending on the definition of dysphagia, which can range from failing a dysphagia screen, to prescribed diet modifications, to measures of physiology on an instrumented swallowing study (26,41,43). Ogawa et al. (40) found that patients who underwent a flexible endoscopic evaluation of swallowing and received optimal nutritional intervention were more likely to have a shorter hospital stay (p = 0.005). The complications of dysphagia include the consequences of modifications to dietary intake: compromised nutrition and hydration, prolonged LOS, and reduced quality of life. As a result, optimal treatments and measures for dysphagia should be implemented. Many studies have investigated a variety of interventions to treat dysphagia, including therapist-delivered and behavioral interventions, acupuncture, and electrical or magnetic stimulation (39).
Stroke severity was the most consistent of the factors contributing to LOS in AIS patients, and those who received reperfusion therapy were more likely to have prolonged LOS, which was similar to a previous study (29). Patients with more severe strokes may require more intensive medical care, including medication treatment and rehabilitation. Thrombectomy is a procedure used to remove a blood clot from a blood vessel and is typically used in the treatment of acute ischemic stroke. While thrombectomy can be effective in reducing the severity of stroke and improving patient outcomes, it is also a relatively invasive procedure that can carry some risks and complications. As a result, patients who undergo thrombectomy may require longer hospital stays than those who do not. In summary, both thrombectomy and stroke severity are independent risk factors for prolonged LOS following AIS.
FIGURE 5 The learning curve of the GNB model.
FIGURE 6 The SHAP values for the GNB model and the importance ranking of the variables.
Our study has several limitations. First, its retrospective design and the inclusion of patients from a single tertiary central hospital may limit the generalizability of the machine learning algorithm in clinical practice. Second, owing to the availability of the data, we were not able to consider more detailed factors, such as specific steps of reperfusion therapy, infarction or penumbra volume, and the collateral circulation status. More valuable and dynamic predictors could improve the performance. Third, some special reasons that might affect hospitalization time, such as economic stress or medical disputes, were not analyzed. Fourth, the sample size and certain biases limited the predictive ability of the model. We only internally validated our interpretable machine learning algorithms by bootstrap resampling, and multi-center, large-sample studies are warranted to verify this conclusion in the future.

Conclusion
We developed a model for predicting prolonged LOS in AIS patients using the GNB algorithm. This model included 20 potential clinical factors and performed well in terms of discrimination, calibration, and clinical utility, but it needs to be validated in larger multicenter cohorts. In this model, pneumonia, dysphagia, thrombectomy, and stroke severity might be strong predictors of prolonged LOS. We explained these main variables and analyzed the effects of their changing trends on prolonged LOS. Timely prevention and intervention for complications, as well as a high-quality standard of care, may be promising directions worthy of clinicians' efforts.
FIGURE 7 The decision curve analysis and calibration plot illustrating the performance of the GNB model.

FIGURE 1
FIGURE 1 the training and validation sets.Despite the narrow gap, De Long's test showed that the difference between the GNB and XGB model remained significant (p = 0.04).
(A) The comparison of AUROC between the GNB model and the other seven models in the training sets. (B) The comparison of AUROC between the GNB model and the other seven models in the validation sets. AUROC, area under the receiver operating characteristic curve; XGB, extreme gradient boosting; LGBM, light gradient boosting machine; GNB, Gaussian naive Bayes; CNB, complement naive Bayes; MLP, multilayered perceptron; SVM, support vector machine.

FIGURE 7
(A) The decision curve analysis for the GNB model. (B) Calibration plot for the GNB model. GNB, Gaussian naive Bayes.

TABLE 1
The baseline and clinical characteristics in prolonged LOS patients and no prolonged LOS patients.

TABLE 2
The predictive capacity of eight different machine learning algorithms.