Multiple Machine Learning Approaches Based on Postoperative Prediction of Pulmonary Complications in Patients With Emergency Cerebral Hemorrhage Surgery

Objective This study aimed to create a prediction model of postoperative pulmonary complications for patients undergoing emergency cerebral hemorrhage surgery. Methods Patients who underwent emergency cerebral hemorrhage surgery were included and divided into two groups: those with and those without pulmonary complications. Patient characteristics, previous history, laboratory tests, and interventions were collected. Univariate and multivariate logistic regressions were used to predict postoperative pulmonary infection. Multiple machine learning approaches were used to compare the importance of predictive factors, namely K-nearest neighbor (KNN), stochastic gradient descent (SGD), support vector classification (SVC), random forest (RF), and logistic regression (LR), as they are among the most successful and widely used models for clinical data. Results Three hundred and fifty-four patients who underwent emergency cerebral hemorrhage surgery between January 1, 2017 and December 31, 2020 were included in the study. Of these, 53.7% (190/354) developed postoperative pulmonary complications (PPC). Stepwise logistic regression analysis revealed four independent predictive factors associated with pulmonary complications: current smoker status, lymphocyte count, clotting time, and ASA score. In addition, the RF model showed good predictive performance. Conclusions According to our results, current smoker status, lymphocyte count, clotting time, and ASA score were independent risk factors for pulmonary complications. Machine learning approaches can also provide more evidence in the prediction of pulmonary complications.


INTRODUCTION
Complications after major surgery occur frequently and are an important cause of mortality and morbidity, especially when they affect the lungs (1). Indeed, one in every seven patients who develops a so-called postoperative pulmonary complication (PPC) dies before hospital discharge, and patients who survive often suffer from a sustained reduction in functional status (2). Early identification of patients at risk of developing PPCs could enable the use of preventive measures as well as timely treatment.
However, current predictive indicators are very limited in severe craniocerebral surgery, especially surgery for cerebral hemorrhage (3,4). Patients undergoing severe craniocerebral surgery often suffer from coma, lack spontaneous breathing for a period of time or require ventilator-assisted breathing, and often present with multiple severe multi-system symptoms. The incidence of PPC in emergency intracerebral hemorrhage (ICH) patients is much higher than that in conventional surgery, and the occurrence of complications often leads to poor prognosis and may even be directly related to patient death. Nevertheless, there is a paucity of literature investigating the deleterious effects of PPCs in neurosurgical patients, particularly in those requiring emergency ICH surgery, who may face the highest rate of surgical complications. Therefore, we believe that better prediction of PPC, combined with preventive measures, can greatly improve patient prognosis. In this study, prediction models of PPC in patients with ICH were established using multiple machine learning methods.

MATERIALS AND METHODS
The study was approved by our local institutional review board. The clinical data of patients who underwent emergency ICH surgery at a single institution during the 4-year period between January 1, 2017 and December 31, 2020 were reviewed and analyzed retrospectively. The patient characteristics included in this study were sex, age, education, medical history (coronary heart disease, stroke, hypertension, and diabetes), respiratory history, current smoker status, Glasgow coma scale (GCS), glucose, albumin (Alb), WBC, lymphocyte count, leukocyte, RBC, platelet, clotting time, early enteral nutrition, preventive tracheotomy, respirator use, operative time, anesthesia time, blood loss, ASA classification, and craniotomy.
The postoperative outcome measure was the presence of PPCs (defined as pulmonary edema, pneumonia, pneumothorax, pulmonary embolism, or ARDS). In accordance with past studies (1-7), these diagnoses were identified in critical care reports, radiographic reports, and/or the discharge summary. During the study period, Acute Respiratory Distress Syndrome (ARDS) was clinically diagnosed based on the criteria of the American-European Consensus Conference on ARDS reported in 1994 (8). Patients who developed PPCs during their hospital stay were compared to their non-PPC counterparts.
Statistical analysis using Student's t-test and one-way ANOVA was performed to determine characteristics that differed significantly between the two groups. Pearson correlation analysis was performed for the risk factors, and variables with P < 0.05 were deemed to have statistically significant associations. Variables with P < 0.05 in the univariate analysis were included in the multivariate analysis. Multivariate logistic regression analysis was employed to identify independent predictors of unfavorable outcomes.
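The univariate screening step described above can be illustrated with a minimal sketch. It assumes SciPy is available and uses synthetic data; the function name `univariate_screen` and the two feature names are hypothetical and do not come from the study.

```python
import numpy as np
from scipy import stats


def univariate_screen(X, y, names, alpha=0.05):
    """Return the names of columns whose group means differ (Student's t-test, P < alpha)."""
    selected = []
    for j, name in enumerate(names):
        # Compare the PPC group (y == 1) with the non-PPC group (y == 0).
        _, p_value = stats.ttest_ind(X[y == 1, j], X[y == 0, j])
        if p_value < alpha:
            selected.append(name)
    return selected


# Synthetic illustration: one pure-noise column and one column shifted in the positive group.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
noise = rng.normal(size=200)
shifted = rng.normal(size=200) + 1.5 * y
X = np.column_stack([noise, shifted])

candidates = univariate_screen(X, y, ["noise_feature", "shifted_feature"])
print(candidates)
```

Variables returned by such a screen would then be passed to the multivariate logistic regression; the screen itself makes no adjustment for multiple comparisons.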
A method combining automatic algorithms and manual selection, aimed at dimension reduction, was used for feature extraction from thousands of variables in this analysis. All features were first selected by clinicians based on their diagnostic experience before automatic analysis. The random forest algorithm was used for final extraction. Features were ranked in descending order of importance, and those with an importance score higher than 0.0005 were selected for final analysis. Multiple algorithms were chosen to improve the probability of good discrimination performance. This study used the following classifiers: K-nearest neighbor (KNN), stochastic gradient descent (SGD), support vector classification (SVC), random forest (RF), and logistic regression (LR).
The whole data sample was randomly split into training and test sets at a ratio of 7:3. Optimal feature and hyperparameter combinations for the model were determined on the training set. Furthermore, 5-fold cross-validation (23) was used in the process of feature selection and hyperparameter tuning (Figure 1).
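As a rough illustration of importance-based selection, the sketch below ranks features with a random forest and keeps those scoring above the 0.0005 cutoff stated in the text. It assumes scikit-learn, uses synthetic data from `make_classification`, and the forest settings (`n_estimators=200`) are assumptions rather than the study's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

IMPORTANCE_THRESHOLD = 0.0005  # cutoff stated in the text

# Synthetic stand-in for the clinical table (354 rows mirrors the cohort size only).
X, y = make_classification(n_samples=354, n_features=20, n_informative=5, random_state=0)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Rank feature indices by impurity-based importance, highest first.
ranked = sorted(enumerate(forest.feature_importances_), key=lambda pair: pair[1], reverse=True)

# Keep only the features whose importance exceeds the threshold.
selected = [index for index, score in ranked if score > IMPORTANCE_THRESHOLD]
X_selected = X[:, selected]
print(f"kept {len(selected)} of {X.shape[1]} features")
```

With a threshold this low, most features in a small table are likely to be retained, so the cutoff acts mainly as a filter for variables the forest never uses.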
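The split-and-tune procedure above might look like the following sketch. It assumes scikit-learn and synthetic data; the parameter grid, the stratified split, and the use of ROC AUC as the tuning metric are assumptions, since the text does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=354, n_features=10, random_state=0)

# 7:3 split; stratification is an assumption (the text only says "randomly split").
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical grid; the hyperparameters actually searched are not given in the text.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}

# 5-fold cross-validation on the training set only.
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5, scoring="roc_auc")
search.fit(X_train, y_train)

# The held-out test set is touched once, after tuning is finished.
test_auc = search.score(X_test, y_test)
print(search.best_params_, round(test_auc, 3))
```

Keeping the test set out of the cross-validation loop is what allows its score to serve as an estimate of performance on unseen patients.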
Important performance indicators of a machine learning model include precision and recall. Precision is the proportion of truly positive samples among all samples predicted to be positive: Precision = TP/(TP + FP). Recall is the proportion of samples predicted to be positive among all truly positive samples: Recall = TP/(TP + FN). To take both factors into account, the F1 score was calculated as F1 = 2 × Precision × Recall/(Precision + Recall).
To assess the discriminative performance of this risk score in both the development and validation subsamples, we used the c-statistic, which is equivalent to the area under the receiver operating characteristic (ROC) curve and was also displayed graphically as that curve. An area under the ROC curve (AUC) of 0.5 indicates no discrimination, whereas an AUC of 1 indicates perfect discrimination.
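The three metrics can be checked with a small worked example. The confusion-matrix counts below (TP = 40, FP = 10, FN = 20) are hypothetical and are not taken from the study's results.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 40 true positives, 10 false positives, 20 false negatives.
precision, recall, f1 = precision_recall_f1(tp=40, fp=10, fn=20)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.8 0.667 0.727
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, so a model cannot reach a high F1 by maximizing precision or recall alone.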
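The c-statistic can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition on a small hypothetical set of scores; the function name `auc_from_scores` is an illustration, not part of any library.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks for two non-PPC (0) and two PPC (1) patients.
print(auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A model that assigns every patient the same score ranks no pair correctly beyond chance and returns 0.5, which matches the "no discrimination" reference value.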
The model was subsequently tested on the independent test set, which had not been seen by the model during training, so that the reported performance would not be inflated by overfitting. To avoid bias due to the random split of the training and test sets, the above procedures were repeated 10 times, and the performance of the different models was compared. The comparison of the models' performance across the 10 repeats was examined by the Wilcoxon signed-rank test, as suggested by previous studies (9,10). All continuous variables were normalized to the range of 0 to 1. Categorical variables were transformed into binary variables using one-hot encoding. Besides commonly used metrics such as AUC, we also reported the area under the precision-recall curve, which is more informative on imbalanced datasets. The four machine learning models were also compared.
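The repeated-split comparison can be sketched as follows, assuming scikit-learn and SciPy. The data are synthetic, only two of the classifiers are compared, the model settings are assumptions, and one-hot encoding is omitted because the synthetic features are all continuous.

```python
from scipy.stats import wilcoxon
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

X, y = make_classification(n_samples=354, n_features=10, random_state=0)


def held_out_auc(model, seed):
    """Fit on a fresh 7:3 split and return the AUC on the held-out 30%."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=seed)
    model.fit(X_tr, y_tr)
    return roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])


auc_rf, auc_lr = [], []
for seed in range(10):  # 10 repeats, each with a different random split
    auc_rf.append(held_out_auc(RandomForestClassifier(n_estimators=100, random_state=seed), seed))
    # MinMaxScaler maps each continuous variable to [0, 1], fitted on the training part only.
    auc_lr.append(held_out_auc(make_pipeline(MinMaxScaler(), LogisticRegression(max_iter=1000)), seed))

# Paired test: both models are scored on the same 10 splits.
statistic, p_value = wilcoxon(auc_rf, auc_lr)
print(round(p_value, 4))
```

Using the same seed for both models in each repeat keeps the comparison paired, which is the setting the Wilcoxon signed-rank test assumes.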

RESULTS
The study included 354 patients who underwent emergency cerebral hemorrhage surgery between January 1, 2017, and December 31, 2020. Of these, 53.7% (190/354) developed PPC during hospitalization. The mean age was 55.79 ± 14.31 years and the sex ratio was 71.1% in the PPC group, while the mean age was 54.77 ± 18.49 years and the sex ratio was 79.9% in the non-PPC group (P > 0.05) (Table 1). Univariate analysis showed statistically significant differences between the two groups in current smoker status, ASA classification, hypertension, glucose, Alb (g/dL), WBC, leukocyte, RBC, clotting time, preventive tracheotomy, respirator use, operative time, anesthesia time, blood loss, and craniotomy (P < 0.05), as shown in Table 1.
The occurrence of PPC was taken as the dependent variable, and the statistically significant factors in the univariate analysis were taken as independent variables for logistic regression analysis. Variables were screened by the stepwise method (the model inclusion level was 0.05 and the exclusion level was 0.1). The likelihood-ratio chi-square test indicated that the regression model was statistically significant (P < 0.05). Current smoker status, lymphocyte count, clotting time, and ASA classification were all independent influencing factors for the occurrence of pulmonary complications (Table 2, Figure 2).
In the correlation analysis, glucose (0.225705), operative time (0.257506), leukocyte (0.264244), anesthesia time (0.291870), preventive tracheotomy (0.342191), and ASA (0.345156) were closely correlated with PPC (Figure 3). In the RF model, we examined feature importance, and the top five features were glucose, lymphocyte count, clotting time, anesthesia time, and Alb (Figure 4). The ROC curves of the five derived models are plotted in Figure 5. The model achieved the highest AUC of 0.653, followed by the LR model of 0.774194. The SGD model (0.712871) showed a relatively poor result in the ROC curve. In terms of the F1 score, RF also performed relatively well, with an F1 value of 0.69 in the test set (Table 3).

DISCUSSION
Postoperative pulmonary complications (PPC) are a well-described cause of detrimental post-surgical outcomes, including intensive care unit admission, prolonged admissions, perioperative mortality, and increased hospital expenditures. Moreover, the complication rate of neurosurgery is naturally high. Previous studies have shown that pulmonary complications occur in 1.3 to 22% of cases, depending on the type of neurosurgery (1-4, 11). Several predictors of pulmonary complications are now available, such as the "Assess Respiratory Risk in Surgical Patients in Catalonia" (ARISCAT) risk score, the "Surgical Lung Injury Prediction" (SLIP) model, and the LAS VEGAS risk score, which are well-established prediction scores used for the identification of patients at risk of developing PPC or ARDS (12,13). However, these indicators are inapplicable to neurosurgery in clinical practice.
In this study, the incidence of PPC reached 53.7%. Because all our patients were in emergent and comparatively severe condition, the incidence of PPC tended to be higher. Some studies have shown that the mortality rate of patients undergoing decompressive craniectomy is as high as 40.9%. In this study, traditional logistic regression identified several independent risk factors through univariate and multivariate analyses. In particular, current smoker status, lymphocyte count, clotting time, and ASA classification were independent risk factors for PPC. This was broadly consistent with the results of previous studies (14-20). However, we noticed that some important risk factors reported in the previous literature, such as blood glucose level and operation time, did not reach the multivariate regression inclusion criteria in our study. This may be due to an insufficient sample size or to our relatively broad definition of pulmonary complications.
In terms of preventing PPC, some of these risk factors are controllable and some are not. Careful timing of surgery, smoking reduction, regulation of blood glucose, and preventive use of antibiotics could minimize complications. Most thrombo-embolic events are unpredictable and unpreventable, although low molecular weight heparin in the context of thromboprophylaxis protocols could also be beneficial (21).
In recent years, machine learning has been used to predict the prognosis of various neurological diseases with remarkable results (22-27). RF is an ensemble learner composed of decision trees, which also highlights the importance of each indicator. In this study, RF cast light on the importance of blood glucose. In contrast, the P-value of blood glucose in the regression analysis was close to the 0.05 threshold, so it could be missed in that analysis. Comprehensively, the LR model performed better overall, in both the training set and the test set. In particular, its test set performance remained stable, far outperforming other machine learning methods. As one of the most popular machine learning algorithms, RF provides accurate results without exhaustive hyperparameter tuning and can be applied to both regression and classification problems, even when the number of potential explanatory variables far exceeds the number of observations. In addition, all the other five models showed moderate classification ability (AUC ranging from 0.6 to 0.71). The current study could be considered a novel exploration of a modified machine learning approach for PPC. In particular, a machine learning model can find potential risk factors, such as the blood glucose index in this paper, that could not be found in previous studies due to different learning models, especially in the case of limited sample size.
Although machine learning models are powerful, they are often more complex, which makes them difficult to understand, like a "black box" (28). Therefore, the interpretation of machine learning results depends particularly on the experience of clinicians, especially for the prediction of complications, where the goal is to identify high-risk patients and adapt treatment plans as early as possible so as to reduce the incidence of complications and improve patient prognosis. Depending on the situation, the recall rate can be raised to avoid missing high-risk patients while the requirement for precision is relaxed, because missing a patient with pulmonary complications may have serious consequences. For example, in this study, the RF model with the parameters C = 1, precision = 0.76, and recall rate = 0.82 is a relatively good prediction model.
This study has several limitations. First, the diagnosis of PPC relied on the attending physicians' evaluation in this retrospective study; therefore, underestimation or overestimation of the actual incidence of PPC could not be avoided. Our inclusion indicators were relatively loose, and although some patients were diagnosed with pulmonary edema, they did not need special intervention. Second, the definition of PPC was based on radiological evidence rather than etiological results. Another limitation is that the diagnosis of PPC was occasionally a clinical one without a clear source of infection. In addition, our study was a retrospective analysis, and the number of cases was relatively low considering the large amount of data required for machine learning analysis. Nevertheless, we adopted a variety of machine learning models to analyze and process the data in order to minimize the omission of important indicators.
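The trade-off between recall and precision mentioned in this discussion can be made concrete by lowering the decision threshold applied to predicted probabilities. The sketch below uses eight hypothetical patients; the labels, probabilities, and both thresholds are illustrative assumptions and are unrelated to the study's data.

```python
def precision_recall_at(labels, probabilities, threshold):
    """Precision and recall when every probability >= threshold is called positive."""
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    tp = sum(1 for y, y_hat in zip(labels, predicted) if y == 1 and y_hat == 1)
    fp = sum(1 for y, y_hat in zip(labels, predicted) if y == 0 and y_hat == 1)
    fn = sum(1 for y, y_hat in zip(labels, predicted) if y == 1 and y_hat == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical predicted PPC risks: four true PPC cases followed by four non-PPC cases.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probabilities = [0.9, 0.7, 0.45, 0.3, 0.6, 0.4, 0.35, 0.1]

default_cut = precision_recall_at(labels, probabilities, threshold=0.5)
lowered_cut = precision_recall_at(labels, probabilities, threshold=0.25)
print(default_cut)  # precision 2/3, recall 1/2
print(lowered_cut)  # precision 4/7, recall 1
```

In this toy example, lowering the threshold from 0.5 to 0.25 captures every true case at the cost of more false alarms, which is the direction of adjustment a clinician might prefer when a missed complication is costly.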
Finally, in this study, the prediction models of pulmonary complications in patients with severe emergency ICH were established. Compared with traditional statistical methods, the

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of the First Affiliated Hospital of University of Science and Technology of China (Hefei, China). Written informed consent from the patients/participants or patients/participants legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

FUNDING
The present study was supported by the Fundamental Research Funds for the Central Universities (Grant no. WK9110000126).