CT-Based Radiomics Score Can Accurately Predict Esophageal Variceal Rebleeding in Cirrhotic Patients

Purpose: This study aimed to develop a radiomics score (Rad-score) extracted from liver and spleen CT images in cirrhotic patients to predict the probability of esophageal variceal rebleeding. Methods: In total, 173 cirrhotic patients were enrolled in this retrospective study. A total of 2,264 radiomics features of the liver and spleen were extracted from CT images. Least absolute shrinkage and selection operator (LASSO) Cox regression was used to select features and generate the Rad-score. Then, the Rad-score was evaluated by the concordance index (C-index), calibration curves, and decision curve analysis (DCA). Kaplan–Meier analysis was used to assess the risk stratification ability of the Rad-score. Results: Rad-scoreLiver, Rad-scoreSpleen, and Rad-scoreLiver−Spleen were independent risk factors for EV rebleeding. The Rad-scoreLiver−Spleen, which consisted of ten features, showed good discriminative performance, with C-indexes of 0.853 [95% confidence interval (CI), 0.776–0.904] and 0.822 (95% CI, 0.749–0.875) in the training and validation cohorts, respectively. The calibration curve showed that the predicted probability of rebleeding was very close to the actual probability. DCA verified the usefulness of the Rad-scoreLiver−Spleen in clinical practice. The Rad-scoreLiver−Spleen showed good performance in stratifying patients into high-, intermediate- and low-risk groups in both the training and validation cohorts. The C-index of the Rad-scoreLiver−Spleen in the hepatitis B virus (HBV) cohort was higher than that in the non-HBV cohort. Conclusion: The radiomics score extracted from liver and spleen CT images can predict the risk of esophageal variceal rebleeding and stratify cirrhotic patients accordingly.


INTRODUCTION
Esophageal variceal (EV) bleeding is one of the most serious complications in cirrhotic patients with portal hypertension (1). Although several recommended treatments are applied, patients who recover from the first episode of EV bleeding have a high risk of 1-year rebleeding (approximately 60%), with a mortality rate of up to 33% (2). EV rebleeding may lead to a series of complications, such as hepatic encephalopathy, spontaneous bacterial peritonitis, and liver failure, eventually making the patients lose opportunities for other remedial measures. Thus, prediction of rebleeding and the identification of patients at high risk of rebleeding after endoscopic therapy are urgent issues that could help improve the prognosis of cirrhotic patients (3).
In clinical practice, the most important predictor for variceal rebleeding is the size of the varices determined with endoscopy (4). However, patient compliance is limited because endoscopy is expensive and invasive. Hepatic venous pressure gradient (HVPG) has been widely proven to be a strong predictive factor for EV bleeding in patients with cirrhosis (5), but it is available only in specialized hepatology units, which restricts its widespread use (6). Some researchers have explored several non-invasive indicators, such as the portal vein diameter (7), Child-Pugh score (8) and model for end-stage liver disease (MELD) score (9), to predict esophageal variceal rebleeding in cirrhotic patients. However, the predictive performance of these non-invasive tools is still controversial.
Radiomics is an emerging field that converts medical images into high-dimensional data by extracting large numbers of quantitative features with image characterization algorithms (10,11). It has great diagnostic and prognostic value in many non-neoplastic liver conditions, such as liver fibrosis (12,13), portal hypertension and EV bleeding (14,15). Most radiomics-related studies on EV bleeding have focused mainly on the prediction of the severity of EV and the presence of EV bleeding. However, the prediction of esophageal variceal rebleeding based on radiomics has not yet been reported.
In this study, we constructed and validated a radiomics score (Rad-score) derived from radiomics features of the liver and spleen in cirrhotic patients to predict the risk of rebleeding. Moreover, the Rad-score was used to stratify patients into high-, intermediate-, and low-risk groups.

MATERIALS AND METHODS
This retrospective study was approved by the institutional review board, and the requirement for written informed consent was waived.

Patients
In this study, data from 173 patients diagnosed with cirrhosis between January 2011 and December 2019 were retrospectively analyzed. The patient inclusion criteria were as follows: (1) recovery from a first episode of EV bleeding, with no bleeding for at least 5 consecutive days; (2) an abdominal computed tomography (CT) scan and HVPG measurement performed within 2 weeks before endoscopic variceal ligation; and (3) at least 1 year of follow-up after endoscopic therapy. The patient exclusion criteria were as follows: (1) previous therapy including splenectomy, endoscopic variceal ligation, tissue adhesive injection, or use of a non-selective beta blocker to prevent rebleeding; (2) hepatocellular carcinoma confirmed by histologic examination of the liver; (3) non-sinusoidal portal hypertension (e.g., hepatic cavernoma, Budd-Chiari syndrome); and (4) no contrast-enhanced CT images or HVPG measurement. The recruitment process is shown in Supplementary Figure 1.

Definitions of Rebleeding and Therapy
The endpoint of the study was EV rebleeding during the 1-year follow-up. EV rebleeding is defined as the occurrence of new esophageal variceal bleeding after a period of 24 h or more from the 24-h point of stable vital signs and hematocrit/hemoglobin following the first episode of EV bleeding (4,16).
The first episode of EV bleeding was controlled by measures including fluid resuscitation and medication administration (somatostatin and proton pump inhibitors). After recovering from a first episode of EV bleeding, all patients received secondary prevention of EV bleeding according to the sixth Baveno Consensus (Baveno VI) (3), namely, the combination of endoscopic variceal ligation and a non-selective beta blocker. Moreover, all hepatitis B-related cirrhotic patients received antiviral therapy.
Endoscopic examination was performed by experienced endoscopists. During the examination, the form, location and bleeding signs of varices were noted, and the size of the varices was classified as small, medium or large corresponding to <30, 30-60, or >60%, respectively, of the maximum theoretical size (17).

CT Image Acquisition and Analysis
Contrast-enhanced CT scans were performed using a 320-detector CT scanner (Aquilion ONE, TOSHIBA) and a 64-detector CT scanner (Discovery, GE Healthcare). Non-enhanced CT scans were first acquired, followed by post-contrast CT scans in three phases: arterial, portal venous and delayed. Arterial phase scanning started ∼20-30 s after injection, portal phase scanning started 30-40 s after the beginning of the arterial phase, and delayed phase scanning started 40-60 s after the beginning of the portal phase. The following parameters were used: tube voltage, 120 kV; tube current, 150-600 mAs; detector collimation, 80 × 0.5 mm or 64 × 0.625 mm; matrix, 512 × 512; slice thickness, 5 mm; and pitch, 1.388 or 0.984. All patients received an intravenous, non-ionic contrast medium (iodine concentration, 370 mg/mL; volume, 1.5-2.0 mL/kg of body weight; Omnipaque 350, GE Healthcare, Shanghai, China) at a rate of 3-5 mL/s. Two imaging-based indexes, the diameters of the portal vein and the splenic vein, were assessed.

Image Segmentation and Radiomics Feature Extraction
Regions of interest (ROIs) were drawn around the whole liver and spleen slice by slice using 3D Slicer software version 4.10.2 (Boston, USA) by a radiologist (Reader 1, Z.X.Y.) with 12 years of working experience in abdominal imaging. Each ROI followed the organ margin as closely as possible while excluding large vascular structures and artifacts and avoiding adjacent organs such as the gallbladder, intestine, stomach, kidney and mesentery (Figure 1).
After image segmentation, we used Python 3.8.3 with pyradiomics (version 3.0; https://pyradiomics.readthedocs.io/en/latest/index.html) for feature extraction. A total of 2,264 features were extracted from the liver and spleen ROIs (1,132 from each organ). To evaluate reproducibility, the images of the 173 patients were also segmented by another radiologist (Reader 2, W.X.M.) who specialized in abdominal imaging and had 26 years of working experience. Reader 1 outlined the ROIs again after 1 month to minimize recall bias. The interobserver and intraobserver reproducibility of all extracted features were evaluated by inter- and intraclass correlation coefficients (ICCs). Features with ICCs > 0.75 were considered to have good reproducibility. In the reproducibility analysis, a total of 1,882 features (953 from liver images and 929 from spleen images) were found to be sufficiently reproducible and stable (ICCs > 0.75).
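The reproducibility filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which ICC form was used, so the sketch assumes a one-way random-effects ICC(1,1); the function names and the toy values are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_one_way(ratings):
    """One-way random-effects ICC(1,1) (assumed form; the paper does not specify one).

    `ratings` is a list with one tuple per patient; each tuple holds the value
    of a single feature obtained from each repeated segmentation.
    """
    n = len(ratings)      # number of patients
    k = len(ratings[0])   # number of segmentations per patient
    grand_mean = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    # Between-patient and within-patient mean squares from a one-way ANOVA
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(ratings, row_means) for v in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def reproducible_features(feature_ratings, threshold=0.75):
    """Keep the names of features whose ICC exceeds the threshold."""
    return [
        name for name, ratings in feature_ratings.items()
        if icc_one_way(ratings) > threshold
    ]


# Hypothetical toy data: two features measured twice in four patients
toy = {
    "stable_feature": [(1.0, 1.1), (2.0, 2.1), (3.0, 2.9), (4.0, 4.1)],
    "unstable_feature": [(1.0, 3.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)],
}
kept = reproducible_features(toy)
```

In this toy example only `stable_feature` passes the 0.75 threshold, because its two measurements agree closely relative to the spread between patients.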

Radiomics Feature Selection and Rad-Score Calculation
The radiomics workflow is presented in Figure 2. The training cohort was used for feature selection and model building, while the validation cohort was used to test model performance. To select the most predictive features from the training cohort while avoiding overfitting, we used the least absolute shrinkage and selection operator (LASSO) method and conducted 100 iterations of 10-fold cross-validation to develop a LASSO Cox regression model (18,19). The penalty term in LASSO Cox regression shrinks the coefficients of some features to exactly zero, and the features with non-zero coefficients were then selected. The optimal tuning parameter (λ) was the value that minimized the partial likelihood deviance. The selected features were weighted by their coefficients and summed to form the Rad-score (Rad-score = coefficient 1 × feature 1 + coefficient 2 × feature 2 + ...) (20). The Rad-score Liver, Rad-score Spleen, and Rad-score Liver−Spleen were calculated as linear combinations of the selected features from the liver, the spleen, and both organs combined, respectively, each weighted by its own coefficient in the LASSO Cox regression model.
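As a minimal sketch of the linear combination described above, the Rad-score for one patient can be computed as follows. The feature names and coefficient values are hypothetical placeholders chosen for illustration; they are not the coefficients reported by the study.

```python
def rad_score(features, coefficients):
    """Sum of feature values weighted by their LASSO Cox coefficients.

    Features whose coefficient was shrunk to zero contribute nothing and are
    skipped, which mirrors the selection of non-zero coefficients.
    """
    return sum(
        coef * features[name]
        for name, coef in coefficients.items()
        if coef != 0.0
    )


# Hypothetical coefficients (not from the paper); one feature was shrunk to zero
coefficients = {
    "liver_run_entropy": 0.5,
    "spleen_joint_energy": -0.25,
    "liver_dropped_feature": 0.0,
}
# Hypothetical feature values for a single patient
patient = {
    "liver_run_entropy": 2.0,
    "spleen_joint_energy": 2.0,
    "liver_dropped_feature": 100.0,
}
score = rad_score(patient, coefficients)  # 0.5 * 2.0 + (-0.25) * 2.0
```

Because the score is linear in the features, it behaves like the linear predictor of a Cox model: a higher score corresponds to a higher estimated hazard.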

Assessment and Performance of the Rad-Score
Harrell's concordance index (C-index) and the hazard ratio (HR) were calculated to evaluate the predictive accuracy of the Rad-score. Kaplan-Meier survival analysis and the log-rank test were used to evaluate the stratification ability of each model. In addition, calibration curves were generated to assess the calibration of the Rad-score. Decision curve analysis (DCA) was performed to analyze the clinical usefulness of the Rad-score by measuring the net benefit at different threshold probabilities.
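To make the C-index concrete, the following is a simplified sketch of Harrell's concordance for right-censored data. It assumes that a higher score means a higher predicted risk and that score ties count as half concordant; the study itself used the R package cited in the Statistical Analysis section, whose handling of ties may differ. All values are made up.

```python
def harrell_c_index(times, events, scores):
    """Fraction of comparable patient pairs ranked correctly by the score.

    times  : follow-up time of each patient
    events : 1 if rebleeding was observed, 0 if the patient was censored
    scores : predicted risk score (higher = higher risk, by assumption)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical toy cohort: months to rebleeding or censoring
times = [2, 5, 8, 12]
events = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
c_index = harrell_c_index(times, events, scores)
```

Here five pairs are comparable and four of them are ranked correctly, so the toy C-index is 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.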

Statistical Analysis
Statistical analysis was conducted with R software (version 3.6.1; http://www.r-project.org). The following R packages were used: glmnet, for running LASSO Cox regression; psych, for calculating ICCs; survival, for building the Cox proportional hazards model and drawing Kaplan-Meier curves; Hmisc, for calculating the C-index; rms, for generating calibration curves; stdca, for plotting DCA results; stats, for Mann-Whitney U and chi-square tests; survcomp, for comparison of different C-indexes; and SurvProb, for predicting EV rebleeding probabilities. All statistical tests were two-sided, and p-values < 0.05 were considered significant.

RESULTS
Study Patients
A total of 173 patients were divided into a training set and a validation set at a ratio of 7:3 with a random sampling method; 121 patients constituted the training cohort, and the other 52 constituted the validation cohort. There was no significant difference in clinical characteristics between the two cohorts (p = 0.212-0.868; Table 1). During the follow-up periods, rebleeding occurred in 39 of 173 patients (22.5%) within 1 year.
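The 7:3 random split can be illustrated with a short sketch. The seed, the use of simple (unstratified) shuffling, and the integer patient identifiers are assumptions made for the example; the text states only that random sampling was used.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Randomly split patients into training and validation cohorts.

    The seed is a hypothetical choice that makes the example repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 173 hypothetical patient identifiers
training, validation = split_cohort(range(1, 174))
```

With 173 patients and a training fraction of 0.7, rounding gives 121 training patients and 52 validation patients, which matches the cohort sizes reported above.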

Radiomics Feature Extraction, Selection, and Rad-Score Calculation
After feature extraction from the ROIs and selection with LASSO Cox regression, we obtained 7, 6, and 10 features with non-zero coefficients as the predictive radiomics features for the liver, spleen and both organs, respectively. The Rad-score formulas are as follows:

Univariate and Multivariate Cox Regression Analysis of Rad-Score and Clinical Characteristics
Predictive factors for EV rebleeding are summarized in Table 2.

Performance of the Rad-Score for EV Rebleeding
To compare the predictive performance of Rad-score Liver, Rad-score Spleen and Rad-score Liver−Spleen for EV rebleeding, the C-index was calculated. Rad-score Liver−Spleen showed significantly better performance than Rad-score Liver and Rad-score Spleen, yielding a C-index of 0.853 (95% CI = 0.776-0.904) in the training cohort and 0.822 (95% CI = 0.749-0.875) in the validation cohort (Table 3). The calibration curves of the Rad-score Liver−Spleen at 3, 6, 9, and 12 months showed that the predicted probability was very close to the actual probability (Figures 4A,B). DCA showed that the Rad-score Liver−Spleen yielded more clinical net benefit under almost all threshold probabilities, indicating that the Rad-score Liver−Spleen is more practical than the Rad-score Liver and Rad-score Spleen for predicting esophageal variceal rebleeding in cirrhotic patients (Figures 4C,D).
We also used clinical indexes, including HVPG, Child-Pugh score, MELD score and EV size, to predict the probability of rebleeding. Compared with the clinical indexes, the Rad-score Liver−Spleen exhibited a higher C-index in the training and validation cohorts (Table 3).
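The net benefit plotted in a decision curve can be sketched as below. This is a simplified binary-outcome version evaluated at a fixed time point; it ignores censoring, whereas the study used a survival-data implementation in R. The outcomes and predicted probabilities are invented for illustration.

```python
def net_benefit(outcomes, predicted_probs, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    outcomes        : 1 if the patient rebled, 0 otherwise
    predicted_probs : model-predicted probability of rebleeding
    threshold       : threshold probability (0 < threshold < 1)
    """
    n = len(outcomes)
    flagged = [y for y, p in zip(outcomes, predicted_probs) if p >= threshold]
    true_pos = sum(flagged)
    false_pos = len(flagged) - true_pos
    # False positives are penalized by the odds of the threshold probability
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


# Hypothetical cohort of five patients
outcomes = [1, 0, 1, 0, 0]
predicted = [0.8, 0.6, 0.3, 0.2, 0.1]
nb = net_benefit(outcomes, predicted, threshold=0.25)
```

At a threshold of 0.25, three patients are flagged (two true positives, one false positive), giving a net benefit of 2/5 − (1/5) × (0.25/0.75) = 1/3. A decision curve repeats this calculation over a range of thresholds for each model.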

Risk Stratification for Predicting EV Rebleeding According to Rad-Score Liver-Spleen
Based on the cutoff values of the Rad-score Liver−Spleen determined by X-tile software (21), patients were classified into low-, intermediate-, and high-risk groups in the training (Figures 5A,B) and validation (Figures 5D,E) cohorts. The 12-month rebleeding probabilities in the low-, intermediate-, and high-risk groups of the training cohort were 0.090, 0.202, and 0.407, respectively. Likewise, significant differences were observed in the validation cohort (12-month rebleeding probability: 0.097 for the low-risk group, 0.218 for the intermediate-risk group, and 0.436 for the high-risk group; Table 4). Kaplan-Meier curves showed that the cumulative incidences of rebleeding in the training and validation cohorts were accurately differentiated by the risk stratification system (Figures 5C,F).
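The stratification step and the 12-month rebleeding probability of a group can be sketched as follows. The cutoff values used here are hypothetical placeholders (the actual cutoffs are not reproduced in this text), the handling of scores that fall exactly on a cutoff is an assumption, and the follow-up data are made up.

```python
def risk_group(score, low_cutoff, high_cutoff):
    """Assign a risk group from two cutoffs (boundary handling is assumed)."""
    if score < low_cutoff:
        return "low"
    if score < high_cutoff:
        return "intermediate"
    return "high"


def km_event_probability(times, events, horizon):
    """Kaplan-Meier estimate of the probability of an event by `horizon`.

    times  : follow-up time of each patient (e.g., months)
    events : 1 if rebleeding was observed, 0 if censored
    """
    survival = 1.0
    event_times = sorted(
        {t for t, e in zip(times, events) if e == 1 and t <= horizon}
    )
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        n_events = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - n_events / at_risk
    return 1.0 - survival


# Hypothetical cutoffs and scores, for illustration only
groups = [risk_group(s, low_cutoff=0.0, high_cutoff=1.0) for s in (-0.5, 0.4, 1.7)]

# Hypothetical follow-up of six patients in one risk group
times = [3, 6, 6, 9, 12, 12]
events = [1, 0, 1, 0, 0, 0]
prob_12m = km_event_probability(times, events, horizon=12)
```

In the toy group, one of six patients rebleeds at 3 months and one of the five still at risk rebleeds at 6 months, so the estimated 12-month rebleeding probability is 1 − (5/6 × 4/5) = 1/3.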

Subgroup Analysis for Predicting EV Rebleeding in Hepatitis B Virus and Non-HBV Cohorts
For the subgroup analysis in the training cohort, the Rad-score Liver−Spleen in the HBV group had significantly better performance than that in the non-HBV group (C-index, 0.903 vs. C-index, 0.791; P < 0.001). Significant differences were also observed in the validation cohort (C-index, 0.884 vs. C-index, 0.781; P < 0.001; Table 5).

DISCUSSION
Non-invasive tools for predicting EV rebleeding and risk stratification have been highlighted in recent years. The present study developed a Rad-score extracted from features of both the liver and spleen to predict EV rebleeding. Our results showed that Rad-score Liver−Spleen was an independent significant predictive factor and achieved good predictive performance. In addition, Rad-score Liver−Spleen could stratify patients into low-, intermediate-, and high-risk groups for predicting rebleeding probability. Thus, the Rad-score Liver−Spleen might be a promising tool to predict EV rebleeding in cirrhotic patients.
Based on the results of the LASSO Cox regression analysis, a total of 10 potential radiomics features were selected to calculate the Rad-score Liver−Spleen. Among these, run entropy, run variance and high gray level run emphasis measure the randomness and variance in the distribution of run lengths or of higher gray-level values. Consistent with previous studies (14), a higher absolute value of high gray level run emphasis was associated with a higher likelihood of EV bleeding. Joint energy and inverse variance are measures of homogeneous patterns in the image; if the image texture is relatively uniform and changes slowly between different regions, the inverse variance increases. These features were weighted appropriately in the calculation of the Rad-score, which helped to avoid overfitting, and mainly reflected the texture complexity of the liver and spleen (15).
FIGURE 4 | Calibration curves and decision curve analysis of the Rad-score Liver−Spleen. Calibration curves of the Rad-score Liver−Spleen demonstrate its predictive performance for rebleeding at 3, 6, 9, and 12 months in the training cohort (A) and validation cohort (B). Decision curve analysis was performed to compare the performance of the Rad-score Liver−Spleen, Rad-score Liver and Rad-score Spleen in the training cohort (C) and validation cohort (D).
Our study revealed that Rad-score Liver, Rad-score Spleen, and Rad-score Liver−Spleen were independent risk factors for EV rebleeding, suggesting that radiomics features of the liver and spleen are closely related to variceal bleeding. This is consistent with previous studies reporting that radiomics has a potential role in diagnosing portal hypertension and EV bleeding (14,15,22,23). This finding could be explained by the fact that hepatic factors and splenomegaly contribute to the rise in portal pressure in cirrhotic patients (24,25). EV size and HVPG were not independent predictors in our study, which might be explained by the fact that non-selective beta blocker treatment can decrease portal blood flow and variceal pressure, leading to a change in hemodynamics. Endoscopy and HVPG measurement, which were reported to be predictors for EV rebleeding (4,5,26), are highly limited by their invasiveness and are therefore not suitable for dynamic monitoring. In contrast, Rad-score Liver−Spleen is non-invasive and reproducible; it is built from quantitative features that capture information about the complex three-dimensional structure of the organs that is invisible to the human eye. Clinical physicians need only to upload CT images and select the ROIs of the liver and spleen to perform the radiomics analysis and help assess the risk of rebleeding in cirrhotic patients.
Cirrhotic patients usually undergo endoscopy every 3-6 months (16) after successful eradication of the varices. To reduce or avoid endoscopic examinations, it is of great significance for physicians to determine appropriate candidates for endoscopy according to risk stratification. In this study, Rad-score Liver−Spleen divided all patients into low-, intermediate-, and high-risk groups (3). Patients in the low-risk group could avoid endoscopy, while for patients in the high-risk group, endoscopy should be performed as soon as possible to prevent rebleeding. For patients in the intermediate-risk group, regular follow-up should be carried out every 3-6 months until the Rad-score Liver−Spleen reaches the threshold of the high-risk group.
In our study, 60.1% of patients had been infected with HBV, which remains the primary cause of cirrhosis in most Asian nations (27). Our results showed that Rad-score Liver−Spleen performed significantly better in the HBV group than in the non-HBV group, indicating that Rad-score Liver−Spleen is more suitable for the HBV population than for the non-HBV population.
There are several limitations to this study. First, this study was a single-center, retrospective analysis and is subject to the inherent limitations of such investigations. A multicenter, prospective study with a larger data set is needed. Second, we lacked hemodynamic data for the left gastric, portal, and splenic veins, as well as liver and spleen stiffness measured by ultrasound and transient elastography, which have proven to be good predictors of the degree of cirrhosis and the development of EV bleeding (28,29). A future study comparing radiomics with other radiologic methods is needed.

CONCLUSIONS
Our findings demonstrated that the Rad-score Liver−Spleen could be used to predict the probability of EV rebleeding and stratify cirrhotic patients accordingly. The Rad-score Liver−Spleen might serve as a useful tool for clinicians involved in therapeutic decision-making and individualized patient counseling.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Biomedical Research Ethic Committee of Shandong Provincial Hospital. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements. Written informed consent was not obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
DM and XZ devised the experiment. DM and YW developed and organized the paper. XF and BK designed the tables and figures. XW and JQ performed the data analysis. XZ and QZ participated in the revision of the manuscript. DM and XZ wrote the original draft. All authors read and approved the final manuscript.

FUNDING
This work was financially supported by the National Natural Science Foundation of China (81770607).