A New Scoring System for Predicting In-hospital Death in Patients Having Liver Cirrhosis With Esophageal Varices

Introduction: Liver cirrhosis is caused by the development of various acute and chronic liver diseases. Esophageal varices are a common and serious complication of liver cirrhosis during decompensation. Despite the development of various treatments, the prognosis for liver cirrhosis with esophageal varices (LCEV) remains poor. We aimed to establish and validate a nomogram for predicting in-hospital death in LCEV patients. Methods: Data on LCEV patients were extracted from the Medical Information Mart for Intensive Care III and IV (MIMIC-III and MIMIC-IV) databases. The patients from MIMIC-III were randomly divided into training and validation cohorts. The training cohort was used to establish the model, and the validation and MIMIC-IV cohorts were used to validate it. The independent prognostic factors for LCEV patients were determined using the least absolute shrinkage and selection operator (LASSO) method and forward stepwise logistic regression. We then constructed a nomogram to predict the in-hospital death of LCEV patients. Multiple indicators were used to validate the nomogram, including the area under the receiver operating characteristic curve (AUC), calibration curve, Hosmer-Lemeshow test, integrated discrimination improvement (IDI), net reclassification index (NRI), and decision curve analysis (DCA). Results: Nine independent prognostic factors were identified using LASSO and stepwise regression: age, Elixhauser score, anion gap, sodium, albumin, bilirubin, international normalized ratio, vasopressor use, and bleeding. The nomogram was then constructed and validated. The AUC value of the nomogram was 0.867 (95% CI = 0.832–0.904) in the training cohort, 0.846 (95% CI = 0.790–0.896) in the validation cohort, and 0.840 (95% CI = 0.807–0.872) in the MIMIC-IV cohort. The high AUC values indicated the good discriminative ability of the nomogram, while the calibration curves and the Hosmer-Lemeshow test results demonstrated that the nomogram was well calibrated. Improvements in NRI and IDI values suggested that our nomogram was superior to the MELD-Na, CAGIB, and OASIS scoring systems. DCA curves indicated that the nomogram had good value in clinical applications. Conclusion: We have established the first prognostic nomogram for predicting the in-hospital death of LCEV patients. The nomogram is easy to use, performs well, and can be used to guide clinical practice, but further external prospective validation is still required.


INTRODUCTION
Liver cirrhosis is a chronic liver disease characterized by pseudolobule formation, hepatocyte necrosis, regenerated nodules, and diffuse fibrosis. It is caused by advanced liver disease with a complex clinical pathogenesis. Most scholars believe that it is related to liver damage caused by bile acid deposition, immune factors, alcohol, viruses, and other long-term ongoing effects (1). Portal hypertension and liver function injury are the main manifestations of advanced liver cirrhosis, and esophageal varices are one of the most serious complications of portal hypertension in liver cirrhosis. Reportedly 30-70% of liver cirrhosis patients develop esophageal varices, and 5-15% will experience variceal rupture and bleeding, with mortality occurring in up to 30% of cases of first hemorrhage (2,3). Patients having liver cirrhosis with esophageal varices (LCEV) are also prone to acute-on-chronic liver failure, hepatorenal syndrome, ascites (AC), hepatic encephalopathy (HE), and other complications (4,5).
Improvements in quality of life and an increasingly aging society are increasing the incidence of LCEV, and it is therefore urgent that effective clinical treatments be identified. Current first-line treatments include vasoactive drugs, prophylactic antibiotics, and endoscopic vein ligation. Despite improvements in diagnosis and treatment, mortality rates in LCEV patients remain high, at 13.4-22.7% at 6 weeks (6-9). It is therefore critical to develop a severity scoring system stratified by mortality risk to accurately and rapidly assess the prognosis and guide treatments in individual LCEV patients. Many existing scoring systems have been used to evaluate the prognosis of LCEV patients, but none of them specifically targets LCEV. These scoring systems can be divided into two types. One type focuses on assessing the prognosis of patients with liver cirrhosis, including the Model for End-Stage Liver Disease (MELD), the Child-Pugh score, and MELD-Na, which adds serum sodium to the MELD system (10,11). The other type evaluates acute upper gastrointestinal bleeding, including the Glasgow Blatchford, Rockall, and AIMS65 scores (12). Bai et al. recently proposed the cirrhosis acute gastrointestinal bleeding (CAGIB) system, which includes diabetes (DB), hepatocellular carcinoma (HCC), bilirubin, albumin, alanine aminotransferase (ALT), and creatinine (13). However, the prognostic value of these scoring systems for LCEV patients is very limited (2).
The main objective of this study was therefore to identify the significant prognostic factors for LCEV patients from a large database, and to establish and validate an easy-to-use prognostic nomogram that predicts their in-hospital death. The nomogram will help clinicians to stratify the risk of LCEV patients and develop treatment strategies, and also help the families of patients to understand their condition.
(17,18). The relevant records include demographic data, hourly vital signs, laboratory test results, microbial culture results, imaging data, treatment procedures, medication records, and survival information.
The use of the MIMIC-III and MIMIC-IV databases was approved by the Institutional Review Board of the Beth Israel Deaconess Medical Center and Massachusetts Institute of Technology, and all patient information in the database is anonymous, so informed consent was not required (19,20).
We completed the online course and examination to gain access to the database (Record ID: 38455175).

Patients and Variables
We used SQL (Structured Query Language) programming in Navicat Premium (version 11.2.7.0) to extract data. ICD-9 (ninth edition of the International Classification of Diseases) codes were used to identify LCEV patients: codes 5712, 5715, and 5716 for liver cirrhosis; and codes 4560, 4561, 45620, and 45621 for esophageal varices. The exclusion criteria were age <18 or >89 years, or death within 24 h of admission to an intensive care unit (ICU). For patients who had been admitted to the ICU multiple times, only the data for the first admission were used.
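The selection rules above can be illustrated with a short Python sketch. The study itself used SQL, and the record layout here (the `IcuStay` class and its field names) is hypothetical and does not correspond to actual MIMIC table columns; only the ICD-9 codes and the exclusion thresholds are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional, Set

# ICD-9 codes quoted in the text.
CIRRHOSIS_CODES = {"5712", "5715", "5716"}
VARICES_CODES = {"4560", "4561", "45620", "45621"}


@dataclass
class IcuStay:
    """Hypothetical record for one ICU stay (not a MIMIC schema)."""
    age: float
    icd9_codes: Set[str]
    hours_to_death: Optional[float]  # hours from ICU admission; None if no death
    stay_order: int                  # 1 = first ICU admission of the patient


def is_eligible(stay: IcuStay) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    has_cirrhosis = bool(stay.icd9_codes & CIRRHOSIS_CODES)
    has_varices = bool(stay.icd9_codes & VARICES_CODES)
    age_ok = 18 <= stay.age <= 89
    survived_24h = stay.hours_to_death is None or stay.hours_to_death >= 24
    first_stay = stay.stay_order == 1
    return has_cirrhosis and has_varices and age_ok and survived_24h and first_stay
```

A stay qualifies only when it carries at least one code from each list, which mirrors the requirement that patients have both cirrhosis and esophageal varices.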
After identifying eligible subjects, we used their hadm_id and icustay_id parameters to extract information from the corresponding tables, including age, gender, marital status, ethnicity, insurance, comorbidities, 24-h urine output, vital signs, laboratory parameters, renal replacement treatment (RRT) use, mechanical ventilation (Mechvent) use, vasopressor use, severity scoring system, and survival information. Comorbidities included HE, AC, HCC, DB, and the Elixhauser score. The vital signs used were the mean values during the first 24 h of the ICU stay, including heart rate, mean blood pressure (MBP), respiratory rate, temperature, and percutaneous oxygen saturation (SpO2). The laboratory parameters analyzed were those that were first obtained after the ICU admission. The study indexes were ALT, aspartate aminotransferase (AST), albumin, bilirubin, alkaline phosphatase (AP), anion gap (AG), bicarbonate, phosphate, chloride, calcium, magnesium, potassium, sodium, glucose, lactate dehydrogenase (LD), creatinine, blood urea nitrogen (BUN), hematocrit, hemoglobin, mean corpuscular hemoglobin (MCH), mean corpuscular volume (MCV), red blood cell distribution width (RDW), red blood cells (RBC), white blood cells (WBC), platelets, international normalized ratio (INR), prothrombin time (PT), and partial thromboplastin time (PTT). Severity scoring systems included the Glasgow Coma Scale (GCS) and Oxford Acute Severity of Illness Score (OASIS).
Marital status was classified into married, unmarried, and other (divorced, separated, or widowed). Race categories were white, black, and other. We also classified liver cirrhosis into two categories of etiology (cholestasis or alcoholic, and other) and classified esophageal varices into bleeding and non-bleeding categories.
The MELD-Na and CAGIB scores were calculated using relevant data in the following formula: MELD-Na = 3. The endpoint for our study was in-hospital death. Patients who were still alive at discharge were designated as alive.

Statistical Analysis
Missing data are common in the MIMIC database, and this study used multiple imputation to account for them. To avoid excessive bias, only variables with a missing proportion of <20% were studied. The multiple imputation technique involves creating multiple copies of the data and replacing the missing values by selecting a suitable random sample from the predicted distribution (14). We used the mice package of R software to obtain 10 imputed data sets. Predictive mean matching and logistic regression methods were used for continuous and categorical variables, respectively. The missing proportion of each variable before imputation is shown in Supplementary Figure 1.
We randomly assigned 70% of the patients in the MIMIC-III database to the training cohort and 30% to the validation cohort. The training cohort was used to establish the nomogram, while the validation cohort and MIMIC-IV cohort were used to perform validation. Frequencies and percentages were used to describe the categorical variables, and the chi-square test or Fisher's exact test was used to identify differences between groups. The Shapiro-Wilk test was applied to continuous variables to assess whether they conformed to a normal distribution. Those that did were described using mean and standard-deviation values, and Student's t-test was used to identify differences between groups. The other continuous variables were described using median and interquartile-range (IQR) values, and the Mann-Whitney U-test was used to identify differences between groups.
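As an illustration of the 70/30 random assignment, the following Python sketch shuffles a list of patient identifiers and cuts it at the 70% mark. The function name, the seed, and the rounding rule are assumptions made for the example; the text does not state how the split was implemented.

```python
import random


def split_train_validation(ids, train_fraction=0.7, seed=0):
    """Randomly split identifiers into training and validation lists.

    The seed and the use of round() are illustrative assumptions.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# 813 is the number of MIMIC-III patients reported in the Results.
train_ids, validation_ids = split_train_validation(range(813))
print(len(train_ids), len(validation_ids))
```

With 813 identifiers this rounding rule gives cohorts of 569 and 244, the same sizes as those reported under Baseline Characteristics.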
Logistic regression was used to identify risk factors that were independently associated with the in-hospital death of LCEV patients (the OASIS, MELD-Na, and CAGIB systems were not included in the analysis). Because of the large number of variables in our study, we screened for independent prognostic factors in two steps. We first used the least absolute shrinkage and selection operator (LASSO) method for preliminary screening, in order to address collinearity. The LASSO method shrinks the coefficients of irrelevant variables to zero while retaining important variables (22). We chose the largest value of lambda for which the cross-validation error was within one standard error of the minimum. The variables selected by LASSO were then further screened using the forward LN stepwise regression method. The probability threshold was 0.05 for entry and 0.10 for removal. All identified independent prognostic factors were used to establish a logistic regression model, and the results were presented as odds ratios (ORs) and 95% confidence intervals (CIs). Collinearity between continuous variables was tested using the variance inflation factor (VIF), and a square root of the VIF ≤ 2 was taken to indicate no collinearity (23). Finally, we established a nomogram that included all independent prognostic factors that predict in-hospital death in LCEV patients. We also constructed a dynamic nomogram using the DynNom package of R software to facilitate the application of the new model.
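The one-standard-error rule for choosing lambda can be written out in a few lines. The Python sketch below is only an illustration of the rule as described in the text (the study used the glmnet package in R); the function name and the example numbers are hypothetical.

```python
def lambda_one_se(lambdas, cv_mean, cv_se):
    """Return the largest lambda whose mean cross-validation error is
    within one standard error of the minimum error.

    lambdas, cv_mean, and cv_se are parallel lists: candidate penalties,
    mean cross-validated errors, and the standard errors of those means.
    """
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[i_min] + cv_se[i_min]
    candidates = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return max(candidates)


# Hypothetical cross-validation results, for illustration only.
lambdas = [1.0, 0.5, 0.1, 0.05, 0.01]
cv_mean = [0.90, 0.62, 0.58, 0.55, 0.56]
cv_se = [0.05, 0.04, 0.04, 0.04, 0.04]
print(lambda_one_se(lambdas, cv_mean, cv_se))
```

Choosing the largest qualifying lambda favors a sparser model than the error-minimizing lambda, at the cost of a cross-validation error that is statistically indistinguishable from the minimum.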
The nomogram was validated using multiple indicators. The area under the receiver operating characteristic curve (AUC) assessed the discriminative ability of the nomogram, which was compared with the AUC values of the OASIS, MELD-Na, and CAGIB systems. The receiver operating characteristic curve was used to determine the optimal cutoff value and its corresponding sensitivity and specificity according to Youden's index. The integrated discrimination improvement (IDI) and the net reclassification index (NRI) were also used to quantify how much the performance of the nomogram improves on the other scoring systems. We further plotted calibration curves and performed the Hosmer-Lemeshow test to evaluate the calibration of the nomogram. Decision curve analysis (DCA) was used to evaluate the net benefits of medical interventions under the guidance of the nomogram and the OASIS, MELD-Na, and CAGIB systems. We also performed a subgroup analysis to evaluate the application of the nomogram in the bleeding and non-bleeding cohorts via the AUC. P values < 0.05 were considered statistically significant. R software (version 4.0.3) and SPSS software (version 24.0) were used for all analyses. The R packages used included glmnet, lattice, MASS, nnet, mice, rms, foreign, regplot, pROC, nricens, PredictABEL, DynNom, survival, and reconnect.
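To make the discrimination measures concrete, the Python sketch below computes the AUC as the proportion of correctly ordered (death, survivor) pairs and finds the cutoff that maximizes Youden's index (sensitivity + specificity − 1). It is a minimal illustration with hypothetical function names and toy data; the study used the pROC package in R, whose tie handling and threshold conventions may differ.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(labels, scores):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's index.

    A case is called positive when its score is >= the cutoff; the first
    cutoff reaching the maximum is kept (an illustrative tie rule).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cut in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cut)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cut)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best[1], best[2], best[3]


# Toy data: 1 = in-hospital death, scores = predicted probabilities.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.7, 0.9]
print(auc(labels, scores), youden_cutoff(labels, scores))
```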

Baseline Characteristics
After applying the inclusion and exclusion criteria, 813 LCEV patients were identified from MIMIC-III database (569 and 244 in the training and validation cohorts, respectively) and 930 LCEV patients were identified from MIMIC-IV database. Among the causes of liver cirrhosis, the rates of alcohol or cholestasis were 54.7 and 54.9%, respectively, in the training and validation cohorts. Bleeding from esophageal varices (41.7 and 44.3% in the training and validation cohorts) was slightly less common than not bleeding. There were fewer patients with HE (22.1 and 23.0% in the training and validation cohorts, respectively), AC (29.9 and 32.4%), HCC (12.0 and 14.3%), and

Nomogram Construction
To construct the nomogram, the variables were first preliminarily screened using LASSO. Figure 1 shows the mean-squared error across the range of log(lambda). The largest lambda value for which the cross-validation error was within one standard error of the minimum was selected.  We established a nomogram based on the above results that included all of the identified independent prognostic factors to predict in-hospital death in LCEV patients (Figure 2). The nomogram indicates that INR has the greatest influence on the prognosis of LCEV, followed by albumin, bilirubin, AG, sodium, Elixhauser score, vasopressor use, age, and bleeding. We also established a dynamic nomogram (https://xufengshuo.shinyapps.io/LCEV/) to facilitate the application of the model.

Nomogram Validation
We compared the performance of our nomogram in predicting in-hospital death from LCEV with that of the MELD-Na, CAGIB, and OASIS systems; the results are listed in Table 3. The AUC value of the nomogram was 0.867 (95% CI = 0.832-0.904) in the training cohort, 0.846 (95% CI = 0.790-0.896) in the validation cohort, and 0.840 (95% CI = 0.807-0.872) in the MIMIC-IV cohort, all of which were significantly higher than those for the other scoring systems. The ROC curves are shown in Figure 3. In the training cohort, the optimal cutoff point was 0.250, for which the sensitivity and specificity were 0.884 and 0.731, respectively. In the validation cohort, the optimal cutoff point was 0.145, for which the sensitivity and specificity were 0.745 and 0.813, respectively. In the MIMIC-IV cohort, the optimal cutoff point was 0.139, for which the sensitivity and specificity were 0. . These values suggest that our nomogram has better discrimination ability and is superior to these commonly used scoring systems. Figure 4 shows the calibration curves for the nomogram. The calibration curves of the training and validation cohorts were close to the leading diagonal, and the results of the Hosmer-Lemeshow test were not statistically significant (chi-square = 7.403 and P = 0.595 for the training cohort, chi-square = 7.630 and P = 0.572 for the validation cohort, and chi-square = 6.497 and P = 0.689 for the MIMIC-IV cohort). These findings indicate that our nomogram provided a good fit to the available data. Finally, we plotted DCA curves to illustrate the clinical value of the nomogram and compared it with those of the OASIS, MELD-Na, and CAGIB systems (Figure 5). When the threshold probability was between 0.1 and 0.7 (in either cohort), clinical interventions guided by the nomogram had greater net benefits than those guided by the other scoring systems.
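For readers unfamiliar with DCA, the quantity plotted on a decision curve is the net benefit at a given threshold probability p_t, conventionally defined as TP/n − (FP/n) × p_t/(1 − p_t). The Python sketch below computes it for a model and for the "treat all" reference strategy. The function names and toy data are hypothetical, and this is an illustration of the standard definition rather than the code used in the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of intervening on patients whose predicted
    probability is >= threshold (0 < threshold < 1)."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= threshold)
    fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= threshold)
    weight = threshold / (1 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(labels, threshold):
    """Net benefit of intervening on every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data: 1 = in-hospital death, probs = predicted risks.
labels = [1, 1, 0, 0]
probs = [0.8, 0.3, 0.6, 0.1]
print(net_benefit(labels, probs, 0.2), net_benefit_treat_all(labels, 0.2))
```

A model adds clinical value at a threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy).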
Further, we generated ROC curves for each continuous variable among the independent prognostic factors, as shown in Supplementary Figure 2. The AUCs of all variables were higher than 0.5, supporting their inclusion in the predictive model. We also performed a subgroup analysis, as shown in Supplementary Figure 3. The AUCs of the nomogram for the non-bleeding and bleeding cohorts were 0.866 (95% CI = 0.835-0.894) and 0.847 (95% CI = 0.818-0.873), respectively, both higher than those of the other scoring systems. These results show that the nomogram has good predictive performance in both subgroups.

DISCUSSION
Liver cirrhosis results from the development of various acute and chronic liver diseases. Liver cirrhosis from any cause can lead to either obstruction of or increased blood flow in the portal vein, leading to portal hypertension and the opening of collateral circulation. The main cause of esophageal varices is portal hypertension. LCEV is a common critical complication of decompensated cirrhosis. The prognosis of LCEV remains poor despite the development of various treatment methods. It is therefore very important to develop a convenient and effective prognostic model that stratifies the risk of LCEV patients in order to guide treatments (9).
The MIMIC-III and MIMIC-IV databases contain a large number of clinical diagnoses and treatment data for critically ill patients, thereby providing effective samples for clinicians to conduct scientific research. This study used the MIMIC-III and MIMIC-IV databases to extensively explore independent predictors of in-hospital death in LCEV patients, which include age, vasopressor use, Elixhauser score, albumin, AG, bilirubin, sodium, INR, and bleeding. We applied these factors to a logistic regression model and generated a nomogram to display it. In addition, we created a Web-based dynamic nomogram to facilitate its clinical application. To the best of our knowledge, this is the first nomogram to be applied to LCEV patients. Notably, the vital signs used in this study were the mean values from the first 24 h of ICU admission, and laboratory test results used were the first obtained after an ICU admission. The nomogram was therefore not applicable to patients who died or were discharged within 24 h of ICU admission.
This study found that age, vasopressor use, Elixhauser score, albumin, AG, bilirubin, sodium, INR, and bleeding were important prognostic factors for LCEV, which is consistent with the findings of other studies. These factors are also commonly used indicators in many severity scoring systems for cirrhosis, such as MELD-Na and CAGIB.
Age has been proven to be a major factor in the poor prognosis of various diseases (24). The reason is that immunity inevitably decreases with age (25,26). Moreover, organ function declines. For example, elderly patients have reduced gastrointestinal digestive function and a limited ability to absorb nutrients, and are extremely prone to malnutrition, which adversely affects their prognosis. In addition, elderly patients have more comorbidities than younger patients, so their condition tends to be more serious. The impact of the Elixhauser score on the prognosis also illustrates this point. It is a comorbidity scoring system that quantifies a patient's comorbidities based on the number and severity of the diseases that the patient suffers from. As the number of comorbidities increases, the patient's prognosis becomes worse (27).
FIGURE 2 | Nomogram for predicting in-hospital death in LCEV patients. LCEV, liver cirrhosis with esophageal varices; GCS, Glasgow Coma Scale; BUN, blood urea nitrogen; AP, alkaline phosphatase; INR, international normalized ratio; SpO2, percutaneous oxygen saturation; HCC, hepatocellular carcinoma. *P < 0.05, **P < 0.01, ***P < 0.001.
It can be seen from the nomogram that the INR carries the greatest weight, and that the patient's prognosis becomes worse as the INR increases. INR is an indicator of blood coagulation function. It rises because, once the patient enters the decompensated phase of liver cirrhosis, liver function continues to deteriorate and prothrombin synthesis is impaired, which prolongs the PT. At the same time, the activation of mononuclear phagocytes caused by spleen enlargement increases platelet destruction, which further reduces blood coagulation function (1,28). Patients with liver cirrhosis therefore often have nasal and gum bleeding, skin and mucous membrane petechiae, and gastrointestinal bleeding, which are also related to the above-mentioned mechanisms, such as the reduction of hepatic coagulation factors and hypersplenism. These signs reflect that the patient is in decompensation, which leads to a poor prognosis (29).
In addition, the use of vasopressors is one of the factors associated with a poor prognosis. It means that the patient has already experienced a drop in blood pressure, and that drugs are needed to improve vascular function and microcirculatory blood perfusion. The reason may be that blood volume has decreased because of heavy bleeding in the digestive system, that the patient has spontaneous peritonitis (30), or that portal hypertension has impaired the intestinal mucosal barrier function, allowing bacteria in the intestinal cavity to enter the blood circulation and cause bacteremia. Under the action of inflammatory mediators, blood volume decreases and blood vessel elasticity decreases (31). Albumin and bilirubin are also important indicators that reflect liver function. Studies have shown that low serum albumin is common in liver cirrhosis and is related to reduced survival rates (32). Changes in bilirubin levels often indicate liver dysfunction in patients with liver cirrhosis, which is closely related to a poor prognosis (33).
In recent years, many studies have demonstrated the clinical value of the anion gap (AG) in assessing disease prognosis (34). For instance, among patients with acute myocardial infarction, those with a high AG had a higher hospitalization rate and a higher mortality rate within 1 week of admission than those with a normal AG (35). The most common condition with an elevated AG is metabolic acidosis, which reflects the overproduction of organic acids, such as the accumulation of lactic acid, the production of keto acids, and the metabolic acidosis caused by uremia. Patients with an elevated AG have severe electrolyte abnormalities, and this is related to the severity of the disease (36).
Serum sodium is an indicator in the MELD-Na system. Most scholars believe that hyponatremia is associated with portal hypertension, and that integrating this indicator in the MELD system improves its prediction accuracy. Our study similarly concluded that serum sodium is a protective factor in the prognosis of LCEV patients. On the one hand, sodium appears in the formula used to calculate the AG, and an increase in serum sodium also indirectly reflects an increase in the AG. On the other hand, sodium can also be an indicator of cirrhosis progression. The causes of hyponatremia in cirrhosis include obvious liver damage, Na+-K+-ATPase dysfunction, and reduced cellular release of Na+; aldosterone, antidiuretic hormone, atrial natriuretic peptide, and other hormones not being metabolized by the liver, resulting in water retention and dilution causing low sodium levels; and the rapid release of large amounts of AC, excessive diuresis, vomiting, diarrhea, and long-term low-salt diets, causing sodium loss (37).
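The dependence of the AG on sodium mentioned above follows directly from its definition. The Python sketch below uses the commonly cited formula AG = Na+ − (Cl− + HCO3−), with potassium optionally added to the cations. The text does not state which variant underlies the AG values in this study, so the formula choice and the function name are assumptions made for illustration.

```python
def anion_gap(sodium, chloride, bicarbonate, potassium=None):
    """Serum anion gap in mmol/L.

    Uses AG = Na - (Cl + HCO3); when potassium is supplied, the variant
    AG = (Na + K) - (Cl + HCO3) is used instead. Which variant the study
    data follow is not stated in the text.
    """
    cations = sodium if potassium is None else sodium + potassium
    return cations - (chloride + bicarbonate)


# Illustrative values in mmol/L, not patient data.
print(anion_gap(140, 104, 24))                # 12
print(anion_gap(140, 104, 24, potassium=4))   # 16
```

Holding chloride and bicarbonate fixed, each 1 mmol/L increase in sodium raises the computed AG by 1 mmol/L.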
A nomogram is a commonly used method for presenting a model that combines important prognostic factors and specific endpoints to quantitatively assess the prognostic risk of individual patients. Our nomogram contains a small number of effective and readily available prognostic factors for LCEV patients, making it easy to use. As shown in Figure 2, a score is assigned to each characteristic of a patient, and the scores are then summed to obtain an overall score, which corresponds to the risk of in-hospital death. We also generated a more user-friendly dynamic nomogram. To confirm the validity of our nomogram, we used multiple indicators in the training, validation, and MIMIC-IV cohorts to compare its performance with that of the MELD-Na, CAGIB, and OASIS systems in predicting the prognosis of LCEV patients. As is evident from the Results section, our nomogram is superior to these other scoring systems in terms of discrimination, calibration, and clinical application.
Our study has several limitations. First, because MIMIC is a single-center database, our study is subject to selection bias and restricted generalizability. Although the new model based on MIMIC-III achieved good validation results in MIMIC-IV, it still needs to be validated in datasets other than MIMIC. Second, many potential prognostic factors were not included in our model, which may have reduced the accuracy of the nomogram predictions. A nomogram cannot provide completely accurate prognosis predictions, and so should only be used as a reference by clinicians. Third, our study was based on a retrospective cohort, and so the nomogram needs further prospective validation before being considered for clinical application.

CONCLUSION
We established the first prognostic nomogram for predicting the in-hospital death of LCEV patients based on the MIMIC database. The nomogram is easy to use, performs well, and can be used to guide clinical practice; however, further external prospective validation is needed.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://physionet.org/content/mimiciii-demo/1.4/ and https://physionet.org/content/mimiciv/1.0/.

ETHICS STATEMENT
The use of the MIMIC database was approved by the Institutional Review Board of the Beth Israel Deaconess Medical Center and Massachusetts Institute of Technology, and all patient information in the database is anonymous, so informed consent was not required (29,30). We completed the online course and examination to gain access to the database (Record ID: 38455175).

AUTHOR CONTRIBUTIONS
FX and LZ analyzed the data and wrote the paper. ZW and DH collected the data. CL and SZ checked the integrity of the data and the accuracy of the data analysis. FX, HY, and JL designed the study and revised the paper. All authors read and approved the final manuscript.