Machine Learning Models for Predicting In-Hospital Mortality in Acute Aortic Dissection Patients

Background: Acute aortic dissection is a potentially fatal cardiovascular disorder associated with high mortality. However, current predictive models show a limited ability to efficiently and flexibly detect this mortality risk, and have been unable to discover a relationship between the mortality rate and certain variables. Thus, this study takes an artificial intelligence approach, whereby clinical data-driven machine learning was utilized to predict the in-hospital mortality of acute aortic dissection. Methods: Patients diagnosed with acute aortic dissection between January 2015 and December 2018 were voluntarily enrolled in the study from the Second Xiangya Hospital of Central South University. The diagnosis was established by magnetic resonance angiography or computed tomography angiography, with symptom onset within 14 days. The analytical variables included demographic characteristics, physical examination, symptoms, clinical condition, laboratory results, and treatment strategies. The machine learning algorithms included logistic regression, decision tree, K nearest neighbor, Gaussian naive Bayes, and extreme gradient boost (XGBoost). The predictive performance of the models was mainly evaluated using the area under the receiver operating characteristic curve. SHapley Additive exPlanation was also implemented to interpret the final prediction model. Results: A total of 1,344 acute aortic dissection patients were recruited, including 1,071 (79.7%) patients in the survivor group and 273 (20.3%) patients in the non-survivor group. The extreme gradient boost model was the most effective model, with the greatest area under the receiver operating characteristic curve (0.927, 95% CI: 0.860–0.968). The three most important features in the extreme gradient boost importance matrix plot were treatment, type of acute aortic dissection, and ischemia-modified albumin levels.
In the SHapley Additive exPlanation summary plot, medical treatment, type A acute aortic dissection, and higher ischemia-modified albumin level were shown to increase the risk of hospital-based mortality.


INTRODUCTION
Acute aortic dissection (AAD) is amongst the most common cardiovascular disorders; moreover, it is renowned for its high mortality (1,2). Research has shown that ∼1-2% of all patients with AAD die every hour following the onset of symptoms (3). Despite considerable recent improvements in the treatment of AAD, the in-hospital mortality of AAD remains at a concerningly high level (4); nearly 20% of all patients with AAD die before hospital admission (5). Therefore, being able to predict the in-hospital mortality risk of this disorder precisely and efficiently in its early phase would improve the prognosis of patients diagnosed with AAD in the future.
Previous studies were unable to comprehensively detect the risk of in-hospital fatality or short-term death in patients with AAD (6)(7)(8). Tolenaar et al. (6) evaluated the death risk of this disorder and were able to produce a convenient bedside prediction tool for patients with acute type B aortic dissection. Similarly, Leontyev et al. (7) developed a scorecard to anticipate the short-term mortality of patients with type A AAD, which has proved to be very useful. In addition, Yang et al. (8) managed to identify potential predictors of in-hospital mortality, and subsequently constructed a predictive nomogram prototype to detect high-risk patients with AAD. However, these studies were analyzed using a conventional logistic regression (LR) method, which involves statistical assumptions about the independent linear relationship between the variables and the outcomes, or neglected the analysis of other valuable variables. Consequently, these findings are somewhat limited due to the complex process, inadequate predictive strength, and poor stability.
Machine learning (ML) is a specific form of artificial intelligence (AI) that automatically obtains valuable information and can recognize underlying patterns within large sets of data, subsequently generating an outcome prediction (9). ML techniques can outperform traditional prediction methods and have therefore been applied to an array of medical services, such as image identification, diagnosis, and treatment (10,11). Existing research has evidenced the capability of ML algorithms to improve patient outcomes in relation to sepsis, based on the development of diagnosis and risk prediction models (12)(13)(14). Martinez et al. (15) also developed an ML model to identify patients at high risk of acute kidney injury at an early stage. Another recent study focused on patients presenting with chest pain in the emergency department, whereby it was found that ML had a critical role as a decision support tool for early detection of myocardial infarction (16). Therefore, the implementation of ML can be seen as a major contributor to improving patients' quality of life.
In recent times, ML has emerged in the context of aortic dissection. Huo et al. (17) successfully demonstrated the use of ML models to identify patients with AAD from misdiagnosed cases. This is a beneficial finding as it would aid early classification of the disorder and would enable timely decision-making by physicians. However, there is little research regarding the use of ML algorithms to predict short-term outcomes of patients with AAD.
The purpose of this study was to construct and evaluate an ML model with the goal of predicting in-hospital mortality in patients with AAD. The results could aid efficient detection of high-risk patients and support the appropriate allocation of medical resources upon AAD diagnosis.

METHODS
Study Design and Setting
A retrospective single-center study was designed, in which the clinical information of patients with AAD admitted to the Second Xiangya Hospital of Central South University was investigated. The patients were admitted to the hospital between January 2015 and December 2018. Prior to study commencement, ethical approval was granted by the institutional review board; as this was a retrospective observational study, the requirement for informed consent was waived.
A total of 1,344 adult patients with AAD were enrolled in the study. AAD was classified according to the Stanford criteria and diagnosed by magnetic resonance angiography (MRA) or computed tomography angiography (CTA), based on the 2014 European Society of Cardiology (ESC) guidelines regarding the medical treatment and diagnosis of AAD (18). The exclusion criteria were: (1) age under 18 years; (2) the presence of intramural hematoma; (3) pregnancy; and (4) hospital admission ≥14 days after the onset of symptoms.
bilirubin, creatinine, blood urea nitrogen, uric acid, myoglobin, creatine kinase, creatine kinase-MB, troponin T, B-type natriuretic peptide, D-dimer, ischemia-modified albumin, C-reactive protein, erythrocyte sedimentation rate, procalcitonin, and lactate dehydrogenase. All variables were measured within the initial 24 h of patient admission; the main laboratory of the Second Xiangya Hospital was responsible for the analysis of each variable.

Clinical Outcome
In-hospital mortality was regarded as the clinical outcome, which referred to all causes of death during a period of hospitalization.

Data Preprocessing
For the development of the ML models, categorical features were preprocessed according to their nature. For instance, the treatment of patients was encoded as 0, 1, or 2 (0 = medical treatment, 1 = endovascular treatment, 2 = surgical treatment). Moreover, the type of AAD and gender were encoded as 1 or 2 (type of AAD: 1 = type A, 2 = type B; gender: 1 = male, 2 = female). The remaining features related to clinical conditions, such as smoking, drinking, medical history, and symptoms, are binary and were encoded as 0 or 1 (0 = absence, 1 = presence).
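The encoding scheme described above can be illustrated with a minimal Python sketch. The field names (`treatment`, `aad_type`, `gender`, and the binary features listed in `BINARY_FEATURES`) and the example record are hypothetical and are not taken from the study dataset; only the numeric codes follow the text.

```python
# Numeric codes as described in the text; the dictionary keys are hypothetical labels.
TREATMENT_CODES = {"medical": 0, "endovascular": 1, "surgical": 2}
AAD_TYPE_CODES = {"A": 1, "B": 2}
GENDER_CODES = {"male": 1, "female": 2}

# Hypothetical examples of binary clinical features (0 = absence, 1 = presence).
BINARY_FEATURES = ("smoking", "drinking", "hypertension", "chest_pain")


def encode_patient(record):
    """Return the numeric feature dictionary for one patient record."""
    encoded = {
        "treatment": TREATMENT_CODES[record["treatment"]],
        "aad_type": AAD_TYPE_CODES[record["aad_type"]],
        "gender": GENDER_CODES[record["gender"]],
    }
    for name in BINARY_FEATURES:
        encoded[name] = 1 if record[name] else 0
    return encoded


# Invented example record, for illustration only.
example = {
    "treatment": "surgical",
    "aad_type": "A",
    "gender": "female",
    "smoking": True,
    "drinking": False,
    "hypertension": True,
    "chest_pain": True,
}
encoded_example = encode_patient(example)
```

Integer codes of this kind are handled naturally by tree-based models; for models that treat inputs as numeric magnitudes (for example LR or KNN), a one-hot encoding of the three-level treatment feature might be a reasonable alternative, although the text does not state whether this was done.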

Model Construction
Once the features were entered, five ML algorithms were applied: LR, decision tree (DT), Gaussian naive Bayes (GaussianNB), K nearest neighbor (KNN), and extreme gradient boost (XGBoost). These algorithms were used to predict in-hospital mortality in the sample of patients with AAD. Python (version 3.6) was used to build the predictive models. Figure 1 displays the training flow chart. First, all the data were randomly split into training and test sets in a 9:1 ratio. Model parameters were then tuned within the training set. To guard against overfitting, the model was tested on an independent test set that was unseen during training.
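A random 9:1 division of the kind described above might be sketched as follows. The function name, the seed, and the use of a plain (non-stratified) shuffle are assumptions; the text does not state whether the split was stratified by outcome.

```python
import random


def train_test_split_9_1(records, seed=0):
    """Randomly split records into training (90%) and test (10%) sets."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    indices = list(range(len(records)))
    rng.shuffle(indices)
    n_test = round(len(records) * 0.1)
    test_idx = set(indices[:n_test])
    train = [r for i, r in enumerate(records) if i not in test_idx]
    test = [r for i, r in enumerate(records) if i in test_idx]
    return train, test


# Placeholder records standing in for the 1,344 patients.
records = list(range(1344))
train_set, test_set = train_test_split_9_1(records)
```

With 1,344 records this yields 1,210 training and 134 test records. Because only about 20% of patients were non-survivors, a stratified split that preserves this proportion in both sets may be preferable in practice.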

Model Training and Performance Evaluation
Model performance on the test set was evaluated by plotting receiver operating characteristic (ROC) curves and calculating the area under the ROC curve (AUROC) for each model. The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR) at various thresholds, and the AUROC was used as the main measure of model capability. For a comprehensive assessment of model performance, the sensitivity (TPR), accuracy, average precision, specificity (true negative rate, TNR), positive predictive value (PPV), and negative predictive value (NPV) were also calculated.
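The metrics named above can be computed from predicted risk scores as in the following sketch. The labels and scores are invented for illustration, the 0.5 threshold is an assumption, and average precision is omitted for brevity. The AUROC is computed here through its rank interpretation (the probability that a randomly chosen non-survivor receives a higher score than a randomly chosen survivor), which equals the area under the ROC curve.

```python
def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(y_true, scores, threshold=0.5):
    """Threshold-based metrics; assumes both classes and both predicted labels occur."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # TPR
        "specificity": tn / (tn + fp),   # TNR
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / len(y_true),
    }


# Invented toy data: 1 = in-hospital death, 0 = survival.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

toy_auroc = auroc(y_true, scores)
toy_metrics = confusion_metrics(y_true, scores)
```

The pairwise AUROC computation is quadratic in the number of patients, which is acceptable for a cohort of this size; library implementations typically use a sort-based method instead.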
Finally, 10-fold cross-validation was implemented in the aforementioned procedures; this reduced the variability in estimations of model performance and ensured that the estimated performance of a model would reflect its practical performance. The model that obtained the best average performance metrics across the 10 folds was then regarded as the optimal ML predictive model of in-hospital mortality in patients with AAD.
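The 10-fold procedure can be sketched as below. The helper names and the `fit_and_score` callback are hypothetical, and the sketch shuffles without stratification, which the text does not specify. Each sample is used for validation exactly once, and the fold scores are averaged.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k nearly equal-sized folds
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


def cross_validate(n_samples, fit_and_score, k=10, seed=0):
    """Average the score returned by fit_and_score(train_idx, val_idx) over all folds."""
    scores = [fit_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(1344))
```

In use, `fit_and_score` would train one candidate model on the training indices and return, for example, its AUROC on the validation indices; the candidate with the best average would then be selected.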

Model Interpretation
ML models have distinctive black-box characteristics, which can make their predictions difficult to interpret (19). Nonetheless, the reasons behind each predictive outcome should be ascertained. To achieve this, the importance of each feature in the ML model was evaluated by the feature importance score, which was determined based on the average reduction of loss when the feature was used as a partition attribute (20). The higher the feature score, the greater the influence of the feature on the prediction.
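As an illustration of the feature importance score described above, the following sketch averages the loss reduction ("gain") over all tree splits that use a given feature. The list of splits and all gain values are invented toy numbers, not results from the study, and the function name is hypothetical.

```python
from collections import defaultdict


def average_gain_importance(splits):
    """Average loss reduction per feature over all splits that use that feature."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for feature, gain in splits:
        totals[feature] += gain
        counts[feature] += 1
    scores = {f: totals[f] / counts[f] for f in totals}
    # Sort so that the most influential feature comes first.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


# Invented (feature, gain) pairs standing in for the splits of a trained tree ensemble.
splits = [
    ("treatment", 12.0),
    ("aad_type", 8.0),
    ("treatment", 10.0),
    ("ima", 6.0),
    ("ima", 5.0),
    ("age", 1.5),
]
importance = average_gain_importance(splits)
```

Other definitions of importance exist, such as the number of times a feature is used to split or the number of samples its splits affect, and these can produce different rankings.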
The SHapley Additive exPlanation (SHAP) method was used to extend the interpretation of the XGBoost model; it provides a visualization of the predictions made by the final model. The DT model was also interpreted through the SHAP TreeExplainer as a comparison. SHAP is grounded in cooperative game theory and calculates the individual contribution of each feature to the final prediction (21). It also shows whether each feature value has a positive or negative influence on the predicted result.
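The game-theoretic idea behind SHAP can be made concrete with a brute-force computation of Shapley values for a toy model. This is only a sketch of the definition: the toy risk function, its coefficients, and the all-zero baseline are invented, and the enumeration over all feature subsets is feasible only for a handful of features, whereas the tree-specific explainer used in practice relies on a much faster algorithm.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values; value_fn maps a frozenset of present features to a model output."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n_features - size - 1) / factorial(n_features)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_risk(x):
    """Invented toy model with one interaction term between features 0 and 2."""
    return 0.10 + 0.30 * x[0] + 0.20 * x[1] + 0.10 * x[0] * x[2]


instance = [1, 1, 1]   # the patient being explained (toy values)
baseline = [0, 0, 0]   # reference values used for "absent" features


def value_fn(present):
    x = [instance[j] if j in present else baseline[j] for j in range(3)]
    return toy_risk(x)


phi = shapley_values(value_fn, 3)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is shared equally between the two features involved; a positive value pushes the prediction toward higher risk.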
To explain individual predictions of the ML model, Local Interpretable Model-agnostic Explanations (LIME), a commonly used local explanation tool, was included in the model interpretation (22). LIME uses interpretable models, such as linear models and tree-based models, to locally approximate the prediction of the target black-box model. Although it does not examine the internals of the model, it detects how the output of the black-box model changes under slight perturbations of the input. It then trains an interpretable model around the point of interest (the original input) based on these changes. An important point to acknowledge is that the interpretable model is a local approximation of the black-box model, as opposed to a global approximation, which is also the origin of the method's name.
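The perturb-and-fit idea behind LIME can be sketched for a single continuous feature as follows. This is a simplified illustration rather than the LIME library itself: the sigmoid `black_box` function, the Gaussian perturbations, the kernel width, and all numeric settings are assumptions chosen for the example.

```python
import math
import random


def black_box(x):
    """Invented nonlinear model standing in for the black-box predictor."""
    return 1.0 / (1.0 + math.exp(-(x - 70.0) / 5.0))


def local_linear_explanation(f, x0, n_samples=500, scale=2.0, kernel_width=2.0, seed=0):
    """Fit a proximity-weighted linear surrogate to f around the point x0."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0.0, scale) for _ in range(n_samples)]   # perturbed inputs
    ys = [f(x) for x in xs]                                       # black-box outputs
    # Perturbations closer to x0 receive larger weights.
    ws = [math.exp(-((x - x0) ** 2) / (kernel_width ** 2)) for x in xs]
    w_sum = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(ws, ys)) / w_sum
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov / var                       # weighted least-squares slope
    intercept = y_mean - slope * x_mean
    return slope, intercept


slope_mid, _ = local_linear_explanation(black_box, 70.0)
slope_far, _ = local_linear_explanation(black_box, 100.0)
```

The surrogate slope is large where the black-box output is sensitive to the feature and close to zero where the output has saturated, which shows why such an explanation is valid only locally.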

Statistical Analysis
Patients were divided into two groups according to whether they died or survived during hospitalization, and variables were compared between the groups. Continuous variables are presented as mean ± standard deviation or median (IQR), and categorical variables as frequency or percentage. Student's t-test (normal distribution) or the Mann-Whitney U-test (skewed distribution) was used to compare continuous variables, and the Chi-square test or Fisher's exact test was used to compare categorical variables. Regarding missing data, any feature with >10% of the data missing was eliminated; for features with <10% of the data missing, the missing values were imputed, and the imputed values were combined using Rubin's rules. All statistical analyses were completed using R software. A two-sided P < 0.05 was considered statistically significant.
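The two steps of the missing-data handling can be sketched as below. The study performed these analyses in R; Python is used here only for consistency with the other examples. The feature names and all numbers are invented, and keeping a feature with exactly 10% missing values is an assumption, since the text does not cover that case. Rubin's rules pool an estimate from m imputed datasets as the mean of the estimates, with total variance equal to the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```python
from statistics import mean, variance


def drop_sparse_features(rows, max_missing=0.10):
    """Return the names of features whose fraction of missing values is at most max_missing."""
    keep = []
    for name in rows[0].keys():
        missing = sum(1 for r in rows if r[name] is None) / len(rows)
        if missing <= max_missing:   # assumption: exactly 10% missing is kept
            keep.append(name)
    return keep


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances using Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)                 # pooled point estimate
    within = mean(variances)                # average within-imputation variance
    between = variance(estimates)           # between-imputation variance (divides by m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Invented toy rows: "ima" has 5% missing values, "bnp" has 15%.
rows = [
    {"ima": None if i == 0 else 70.0, "bnp": None if i < 3 else 100.0}
    for i in range(20)
]
kept = drop_sparse_features(rows)

# Invented estimates from three imputed datasets.
pooled_estimate, pooled_variance = pool_rubin([1.0, 1.2, 1.4], [0.04, 0.05, 0.06])
```

The between-imputation term makes the pooled variance larger than the average within-imputation variance, reflecting the extra uncertainty introduced by the missing values.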

RESULTS
Demographic Information and Clinical Characteristics
This study involved the recruitment of 1,344 patients with AAD between January 2015 and December 2018 (Figure 2).

Comparison of Baseline Characteristics Between Survivor and Non-survivor Groups
No significant differences in age, gender, height, weight, or body mass index were found between the survivor and non-survivor groups (P > 0.05). However, the non-survivor group presented lower systolic (P < 0.001) and diastolic (P < 0.001) blood pressure at admission, compared with the survivor group. Furthermore, the non-survivor group showed significantly higher levels of creatinine (P < 0.001), uric acid (P < 0.001), myoglobin (P = 0.002), creatine kinase (P < 0.001), troponin T (P < 0.001), B-type natriuretic peptide (P < 0.001), D-dimer (P < 0.001), and ischemia-modified albumin (P < 0.001), compared with the survivor group. Meanwhile, the platelet count (P < 0.001) and C-reactive protein (P < 0.001) values were lower in the non-survivor group compared with the survivor group. In addition, greater frequencies of Stanford type A AAD (P < 0.001) and Marfan syndrome (P = 0.049) were detected in the non-survivor group compared with the survivor group. The non-survivor group was also more likely to have received medical treatment (P < 0.001) (Table 1).

Model Interpretation
For the XGBoost model, Figure 4 presents the feature importance matrix plot, which ranks the features by their importance to the final predictive model. The three most important features were treatment, type of AAD, and ischemia-modified albumin levels.
Figure 5 shows the mean absolute SHAP values of the 20 most important features in the XGBoost model (Figure 5A) and the DT model (Figure 5B). Consistent with the ranking of the feature importance scores, the top three features by mean absolute SHAP value in both models were treatment, type of AAD, and ischemia-modified albumin levels. The SHAP summary plots of the XGBoost model (Figure 6A) and the DT model (Figure 6B) show the relationship between the feature type or level and the SHAP values. In both the XGBoost and DT models, positive SHAP values were associated with medical treatment, type A AAD, and higher ischemia-modified albumin levels, denoting an increased risk of in-hospital mortality. In contrast, negative SHAP values were associated with endovascular or surgical treatment, type B AAD, and lower ischemia-modified albumin levels, implying a decreased risk of in-hospital mortality.
LIME was employed to explore the feature contributions to individual predictions. Two patients from the test dataset who were correctly predicted by the XGBoost model were examined. Figure 7A shows the correct prediction of in-hospital mortality for patient 1 from the "True Positive" group; this prediction was based on the patient receiving medical treatment and having type A AAD (TA-AAD) without Marfan syndrome. In the "True Negative" group, patient 2 was correctly predicted to survive (Figure 7B). The data for patient 2 showed a low ischemia-modified albumin level (<=72.50 µ/ml), surgical treatment, and no indication of diabetes, all of which contributed to the negative prediction.
It is also important to understand the reasons behind incorrect predictions. Therefore, patient 3 was included as a "False Positive" prediction: a survivor who was incorrectly predicted to have a high probability of in-hospital mortality. A diagnosis of TA-AAD, the occurrence of stroke, and low systolic blood pressure (<=127.0 mmHg) were the most influential features leading to the XGBoost model's prediction error (Figure 8A). In addition, patient 4 was included as a "False Negative" prediction: a patient who died in hospital but was incorrectly predicted to have a high likelihood of survival (Figure 8B). A low ischemia-modified albumin level (<=72.50 µ/ml), surgical treatment, and the presence of Marfan syndrome chiefly led to this prediction error in the XGBoost model.
In general, the interpretation results of SHAP and LIME are consistent. They show that the three most important characteristics that impact on the in-hospital mortality risk of patients with AAD are treatment strategy, type of AAD, and ischemia-modified albumin levels.

DISCUSSION
To our knowledge, this study is the first to apply ML to predict in-hospital mortality in patients with AAD. Using 41 relevant features, five ML models were trained to predict the in-hospital mortality risk of the cohort. Of the five models, the XGBoost model exhibited the best performance and the greatest AUROC for single-model prediction. Moreover, the predictions generated by the XGBoost model were more reliable and accurate than those of conventional LR. Furthermore, the treatment strategy, type of AAD, and ischemia-modified albumin levels were identified as the most important variables linked to the prognosis of patients with AAD.
Figure 6. SHAP summary plots of the XGBoost model (A) and the DT model (B). The higher the SHAP value of a feature, the higher the risk of in-hospital mortality. Each dot represents the attribution value of one feature for one patient, so each patient contributes a single dot per feature. Dots are colored based on the values of the features for the respective patient and accumulate vertically to depict density. Treatment was divided into three categories: medical treatment (blue), endovascular treatment (purple), and surgical treatment (red). Type of AAD (Stanford) was divided into two categories: Type A (blue) and Type B (red). For continuous features, red represents higher feature values, whilst blue represents lower feature values.
Recent evidence has identified several factors that could be used to determine the risk of poor outcomes of AAD; however, their use as predictive factors of in-hospital mortality remains controversial (6)(7)(8). In addition, traditional LR shows relatively weak predictive performance, as measured by the AUROC, or can produce a higher probability of error compared with ML (23,24).
As an optimized distributed gradient boosting library, XGBoost combines a set of weak learners (25) into a strong learner under the gradient boosting framework (26). In the present study, XGBoost produced outstanding performance in predicting in-hospital mortality in patients with AAD. A previous study found that the superior prediction performance of XGBoost can facilitate risk discrimination and early treatment of patients with acute kidney injury at risk of mortality (27). Additionally, the XGBoost model has been shown to be capable of processing large datasets and analyzing complex relationships between variables; in patients with sepsis, this model showed better prognosis prediction ability than the traditional LR model (28). Although several ML models were tested to predict the mortality risk of patients with AAD in this study, the XGBoost model showed the most promising performance, which corroborates the findings of previous studies. Thus, clinicians and other relevant medical staff could devise effective and individualized therapeutic strategies based on the predictive results of the XGBoost model, which would also facilitate a more reasonable distribution of medical resources and minimize the often excessive medical costs faced by patients.
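How boosting turns weak learners into a strong learner can be illustrated with a deliberately small sketch: one-split decision stumps are fitted, round after round, to the residuals of the current ensemble. This is plain gradient boosting with squared-error loss on a single invented feature, not XGBoost itself, which additionally uses regularization, second-order gradient information, and many engineering optimizations. All names and data below are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump; assumes at least two distinct x values."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lv, rv)
    _, threshold, lv, rv = best
    return lambda x: lv if x <= threshold else rv


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.3):
    """Boost stumps on the residuals (the negative gradient of squared-error loss)."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Invented toy data with a pattern that a single split cannot capture.
xs = list(range(10))
ys = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]

model = gradient_boost(xs, ys)
mse_constant = mse(ys, [sum(ys) / len(ys)] * len(ys))
mse_boosted = mse(ys, [model(x) for x in xs])
```

Each stump alone is a weak predictor, but every round reduces the training error of the ensemble; the learning rate controls how much each stump contributes and is one of the parameters typically tuned to limit overfitting.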
To the best of current knowledge, the primary treatment strategy for AAD depends on the type of AAD diagnosed in each patient. Approximately 1-2% of TA-AAD patients who do not receive any form of therapy will die every hour during the first 24 h of hospitalization, whilst almost 50% of patients will die within a week (4). Furthermore, the mortality rate can reach 20% within the first day due to severe complications, such as proximal or distal extension, valvular dysfunction, rupture, and pericardial tamponade; this can increase to 30% in the first 2 days (29). Data from the International Registry of Acute Aortic Dissections (IRAD) have shown that the in-hospital mortality rate of patients with TA-AAD who undergo ascending aortic repair surgery has decreased significantly over time, from 25 to 18%. However, patients who received medical treatment without surgery continued to face a high in-hospital mortality rate of 57% (30). For a long period, medical therapy alone was recommended as the main treatment strategy for uncomplicated TB-AAD, whilst thoracic endovascular aortic repair (TEVAR) was recommended for complicated TB-AAD (18). Yet, as clinical theory has continued to advance and medical techniques have become more innovative, the most appropriate treatment strategy for TB-AAD remains controversial, with endovascular techniques increasingly being used for initially uncomplicated cases of TB-AAD. A randomized investigation found an association between the endovascular stent-grafting technique and positive aortic remodeling, with a reduction in the 5-year mortality risk compared with traditional medical treatment (31). Also, a considerable number of long-term follow-up studies have corroborated the finding that endovascular treatment is a better option for uncomplicated TB-AAD, based on the ability of early TEVAR to prevent the occurrence of fatal cardiovascular complications (31)(32)(33).
Therefore, it is reasonable to suggest that patients with TA-AAD and patients with TB-AAD should receive immediate intervention upon diagnosis, including surgical repair or TEVAR, as this has been proven to improve patient prognosis drastically.
Under normal circumstances, the Stanford classification is used to determine the type of AAD: Type A indicates dissections involving the ascending aorta, as opposed to Type B, which indicates dissections of the descending aorta (18). Based on previous findings, the anatomical categorization of AAD is the main indicator of a patient's prognosis (34). Under most conditions, TA-AAD is associated with wider involvement of dissection and more complicated pathologic changes than TB-AAD. In a study conducted by Roselli et al. (35), it was found that over 40% of patients with AAD involving the proximal aorta died immediately, with an hourly mortality rate of 1-3%. Findings from an IRAD investigation showed that the overall in-hospital mortality was 22% in patients with TA-AAD and 12-14% in patients with TB-AAD (30). In the present study, Stanford Type A dissections also showed a strong association with a higher risk of in-hospital mortality compared with Stanford Type B dissections. Hence, patients diagnosed with TA-AAD were in more critical condition and had a greater likelihood of dying during hospitalization, possibly due to the involvement of the ascending aorta.
Ischemia-modified albumin is formed when the N-terminus of albumin is altered following exposure to ischemic tissue, which decreases its metal-binding ability (36). Although ischemia-modified albumin has mainly been used for diagnostic and prognostic purposes relating to acute coronary syndrome, elevated levels have also been linked to poor outcomes in patients experiencing acute chest pain and severe sepsis (37,38). Moreover, a recent study showed that raised ischemia-modified albumin levels corresponded with a high risk of in-hospital death in patients with AAD; this finding was consistent with the results of the current study (39). Elevated ischemia-modified albumin levels indicate ischemia-reperfusion and oxidative stress (40), the mechanism of which has been suggested by several studies. Firstly, the involvement of related arteries in the dissection of patients with AAD means that organ ischemia can raise ischemia-modified albumin levels. Secondly, unstable hemodynamic conditions caused by systemic tissue hypoxia could also be reflected in elevated ischemia-modified albumin. In general, these proposals imply that ischemia-modified albumin is a biomarker of an upstream process linked to the prognosis of patients with AAD, and therefore, elevated serum ischemia-modified albumin levels should be considered carefully and seriously.
Despite the promising results of this study, it has several limitations. The dataset was small and was collected from a single center, thereby risking bias in the results. However, the source of the data, the Second Xiangya Hospital, is a large medical center that provides care to the highest number of hospitalized patients with AAD in Hunan Province. Therefore, the data retrieved from this site and used in this study are considered to be representative and reliable. A further limitation was that the features were manually extracted from electronic medical records, which carries a high likelihood of introducing observational error. Consequently, future studies should consider creating a real-time electronic record system that can record information rapidly and accurately. This study is also limited in that the results may not apply to other nationalities or ethnicities, as the patients were all Chinese. Therefore, the generalizability of the model will need to undergo external testing before it can be deemed universally applicable. Nonetheless, the findings of this study will be useful for improving existing predictive models in future research; in particular, multi-center data should be included and the predictions should be rigorously validated externally. Finally, to address the black-box nature of the ML model, this study followed several previous studies (22,41,42) by using the SHAP method for global interpretation and LIME for local interpretation. Although the results of both types of interpretation of the XGBoost model were consistent and credible, improved robustness could be attained by using other interpretation methods, such as Shapley Lorenz, a novel global interpretation method that provides a normalized global measure of explainability. This would help to better explain our prediction model (43) and would improve the quality of future research.

CONCLUSIONS
Overall, an XGBoost model was successfully developed to predict in-hospital mortality in patients with AAD; this is a novel achievement. This model is clinically significant as it provides a reliable early-risk assessment tool for clinicians and other relevant health care professionals. The main outcome of the results is that selection of treatment strategies, the type of AAD, and ischemia-modified albumin levels are the most crucial factors to determine in-hospital mortality predictions of AAD.

DATA AVAILABILITY STATEMENT
The data presented in the study are deposited in the Figshare repository, accession number 16437960 (https://doi.org/10.6084/m9.figshare.16437960.v1).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of the Second Xiangya Hospital, Central South University (Changsha, China) (No. LYF2020044). Written informed consent was not required for this study, in accordance with the local legislation and institutional requirements.