Prediction of Post-hepatectomy Liver Failure in Patients With Hepatocellular Carcinoma Based on Radiomics Using Gd-EOB-DTPA-Enhanced MRI: The Liver Failure Model

Objectives: Preoperative prediction of post-hepatectomy liver failure (PHLF) in patients with hepatocellular carcinoma (HCC) is important for developing appropriate treatment strategies. We aimed to establish a radiomics-based clinical model for preoperative prediction of PHLF in HCC patients using gadolinium-ethoxybenzyl-diethylenetriamine pentaacetic acid (Gd-EOB-DTPA)-enhanced magnetic resonance imaging (MRI). Methods: A total of 144 HCC patients from two medical centers were included, with 111 patients as the training cohort and 33 patients as the test cohort. Radiomics features and clinical variables were selected to construct a radiomics model and a clinical model, respectively. A combined logistic regression model, the liver failure (LF) model, which incorporated the developed radiomics signature and clinical risk factors, was then constructed. The performance of these models was evaluated and compared by plotting receiver operating characteristic (ROC) curves and calculating the area under the curve (AUC) with 95% confidence interval (CI). Results: The radiomics model showed a higher AUC than the clinical model in both the training cohort and the test cohort for predicting PHLF in HCC patients. Moreover, the LF model had the highest AUCs in both cohorts [0.956 (95% CI: 0.955–0.962) and 0.844 (95% CI: 0.833–0.886), respectively] compared with the radiomics model and the clinical model. Conclusions: We evaluated quantitative radiomics features from MRI images and presented an externally validated radiomics-based clinical model, the LF model, for the prediction of PHLF in HCC patients, which could assist clinicians in planning treatment strategies before surgery.


INTRODUCTION
Hepatocellular carcinoma (HCC) is one of the most common malignancies and the fourth leading cause of cancer death worldwide (1). Surgical resection is an effective curative treatment for HCC patients and provides remarkable survival benefits (2). However, the postoperative mortality of hepatectomy is estimated to be about 3-14% (3), and it is even higher in patients with preexisting chronic liver disease (4). The main reason for the high mortality is postoperative complications, including bleeding, incisional infection, and liver failure. Among them, post-hepatectomy liver failure (PHLF) is a severe complication and the major cause of death (4,5), with an incidence of 12% (6, 7) and a mortality of up to 50% (3). PHLF can also result in prolonged hospitalization, increased costs, and poor long-term prognosis (3). Therefore, the risk of PHLF needs to be assessed accurately before hepatectomy.
Previous literature has reported some clinical factors that contribute to PHLF in HCC patients, including tumor size (8), preoperative platelet (PLT) count (4,9), and future liver remnant (FLR) (10). Currently, commonly used clinical methods for preoperative liver function assessment include routine blood tests, blood biochemistry tests, the indocyanine green (ICG) retention test, and clinical scores such as the Child-Pugh (CP) score, the model for end-stage liver disease (MELD) score, and the albumin-bilirubin (ALBI) score (11-13). However, all these methods show unsatisfactory performance to some extent, which is probably due to the limitations of clinical variables. Therefore, there is an urgent need to explore a more effective means of predicting PHLF.
In recent years, radiomics has evolved rapidly and has become increasingly promising in medical research. With radiomics, quantitative features can be extracted from digital medical images so that these high-dimensional data can be fully used to assist clinicians in disease diagnosis, treatment strategy development, and prognosis assessment. Several studies have shown that texture features are significantly associated with overall survival (OS) and recurrence-free survival (RFS) of HCC patients after hepatectomy (14,15). Radiomics features have also been found to be associated with liver fibrosis and other pathologic features of HCC (16,17). Therefore, it is promising to apply radiomics to further assess the liver function of HCC patients after hepatectomy. However, studies of the relationship between radiomics and PHLF are lacking.
In this study, we aimed to establish models for the prediction of PHLF in HCC patients using radiomics based on gadolinium-ethoxybenzyl-diethylenetriamine pentaacetic acid (Gd-EOB-DTPA)-enhanced magnetic resonance imaging (MRI), which could potentially assist doctors in clinical decision making.

Patients
This retrospective study was conducted in two medical centers: First Affiliated Hospital of Sun Yat-sen University (FAHSYSU) and Sun Yat-sen University Cancer Center (SYSUCC). Patients from FAHSYSU were used as the training cohort for model development, while patients from SYSUCC were used as the test cohort for model validation. Patients who were diagnosed with HCC from January 2016 to December 2019 were screened. Inclusion criteria were: (1) underwent hemihepatectomy; (2) pathologically diagnosed with HCC; (3) received Gd-EOB-DTPA-enhanced MRI scan of the liver and ICG retention test within 30 days prior to surgery. Exclusion criteria were: (1) received any anti-tumor therapy before the surgery; (2) incomplete clinical or pathological information. In accordance with the International Study Group of Liver Surgery (ISGLS), PHLF was defined as impaired liver function, characterized by hyperbilirubinemia and an increased international normalized ratio (INR) on or after postoperative day 5 (18).

MRI Image Acquisition and Evaluation
Details of MRI image acquisition are provided in Supplementary Materials and Supplementary Table 1.
Three independent radiologists with more than 20 years of experience reviewed the MRI images and evaluated the following features: transient hepatic parenchymal enhancement (THPE), tumor size, number of tumors, tumor boundary, tumor capsule, vascular invasion, bile duct invasion, bile duct dilatation, lymph node metastasis, adjacent tissue invasion, varicose veins, hemorrhage, tumor thrombus, liver cirrhosis, and splenomegaly.

Region-of-Interest Segmentation and Radiomics Feature Extraction
The radiomics workflow is shown in Figure 1. MRI images of patients were imported into the ITK-SNAP 3.6.0 software (open-source software; www.itksnap.org) (22). Using this software, the radiologists delineated a circular region of interest (ROI) of 1 cm in diameter in each non-tumor liver segment, as indicated in the hepatobiliary phase on the transverse slice.
We then extracted and analyzed the MRI image features by using the A.K. 2.0.0 software (in-house software; Analysis-Kit, GE Healthcare). In total, 1,044 MRI image features of five categories were extracted, including seven shape features, 44 first-order histogram features, 61 gray level size zone matrix (GLSZM) features, 446 gray level co-occurrence matrix (GLCM) features, and 486 gray level run-length matrix (GLRLM) features.

Development and Validation of the Radiomics Model
Feature analysis was performed using open-source software (https://github.com/salan668/FAE). The original values of the radiomics features were normalized by subtracting the mean of each feature and then dividing by the L2 norm. Pearson's correlation coefficients between the 1,044 radiomics features were then calculated. When features had a correlation coefficient higher than 0.86, they were considered highly correlated and were randomly eliminated, leaving only one. The recursive feature elimination (RFE) method was used for subsequent feature selection. It has been reported to be more robust to overfitting than other feature selection techniques and has shown its power in many fields, including radiomics, genomics, proteomics, and metabolomics (23). In RFE, a model is constructed repeatedly: in each round, features are ranked by the fitted model, one is set aside, and the procedure is repeated on the remaining features until all features have been traversed. The order in which features are eliminated gives the ranking of the features, so each feature is evaluated by its contribution to the model. After selection, the desired number of features was included in a logistic regression model. Finally, five-fold cross-validation was used to reduce overfitting.
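The selection procedure described above can be sketched in a few lines. This is a minimal illustration and not the FAE implementation used in the study: it assumes scikit-learn's `RFE` and `LogisticRegression` as stand-ins, assumes the L2 norm is taken over the mean-centered feature column, and runs on synthetic data with 40 features and 5 selected features (the study retained 24). The function names and all data below are hypothetical.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline


def normalize_zero_center_unit(X):
    """Subtract each feature's mean, then divide by the L2 norm of the centered column.

    Assumption: the norm is computed on the centered column; the text does not say.
    """
    centered = X - X.mean(axis=0)
    norms = np.linalg.norm(centered, axis=0)
    norms[norms == 0] = 1.0  # constant features stay at zero instead of dividing by zero
    return centered / norms


def select_features_rfe(X, y, n_select=24):
    """Recursively drop the weakest feature of a logistic regression until n_select remain."""
    selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=n_select, step=1)
    return selector.fit(X, y)


# Synthetic stand-in data (NOT the study data): 111 "patients", 40 "features".
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(111, 40))
true_logit = 2.0 * X_raw[:, 0] - 2.0 * X_raw[:, 1] + 1.5 * X_raw[:, 2]
y = (rng.random(111) < 1.0 / (1.0 + np.exp(-true_logit))).astype(int)

X = normalize_zero_center_unit(X_raw)
selector = select_features_rfe(X, y, n_select=5)
selected = np.flatnonzero(selector.support_)  # column indices of the retained features

# Five-fold cross-validation, with feature selection repeated inside every fold
# so that the held-out fold never influences which features are chosen.
pipeline = make_pipeline(
    RFE(LogisticRegression(max_iter=1000), n_features_to_select=5, step=1),
    LogisticRegression(max_iter=1000),
)
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_auc = cross_val_score(pipeline, X, y, cv=folds, scoring="roc_auc")
```

Placing the RFE step inside the pipeline is a design choice of this sketch; whether the original analysis nested selection inside the cross-validation folds is not stated in the text.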
Based on the radiomics model, a radiomics score that indicated the relative risk of PHLF for each patient in both the training cohort and the test cohort was calculated. The actual PHLF was determined by clinical evaluation as mentioned before. Then the performance of the radiomics model in each cohort was evaluated by plotting the receiver operating characteristic (ROC) curve and calculating the area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
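For reference, the evaluation metrics listed above can be computed from the predicted scores and the observed PHLF status as follows. This is a sketch under stated assumptions: the 0.5 cutoff is hypothetical (the decision threshold is not given in the text), the eight example cases are made up, and the AUC is computed with the rank-based definition (the probability that a randomly chosen PHLF case scores higher than a randomly chosen non-PHLF case).

```python
def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, specificity, PPV and NPV at a given score cutoff."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def auc_rank(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Made-up example: 1 = PHLF, 0 = no PHLF.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
metrics = classification_metrics(y_true, scores)
auc = auc_rank(y_true, scores)
```

In this toy example one PHLF case falls below the cutoff and one non-PHLF case falls above it, so all five threshold-based metrics equal 0.75, while the AUC is 15/16 because only one of the 16 positive-negative pairs is ranked in the wrong order.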

Development and Validation of the Clinical Model and the Combined Model
In the training cohort, clinical variables were included in the univariate logistic regression analysis. The odds ratios (ORs) and corresponding 95% confidence intervals (CIs) for all variables were calculated. Each variable with a p-value lower than 0.2 was further included in the multivariate logistic regression analysis, in which the adjusted ORs and the corresponding 95% CIs were calculated. A variable with a p-value lower than 0.05 was considered an independent risk factor for PHLF.
Based on the selected clinical features, a clinical model was then established. By combining these features with the radiomics signature, a combined model called the liver failure (LF) model was then constructed. Five-fold cross-validation was used for model tuning. The performance of both models in each cohort was evaluated in the same way as for the radiomics model.
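The two-step screening described above (univariate p < 0.2, then multivariate p < 0.05) can be illustrated with a small self-contained sketch. It is not the authors' code: logistic regression is fitted here by plain Newton-Raphson with Wald standard errors, the variable names are placeholders, and the data are synthetic.

```python
import math
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson.

    Returns coefficients, Wald standard errors and two-sided p-values;
    index 0 is the intercept, which is added internally.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-design @ beta))
        hessian = design.T @ (design * (prob * (1.0 - prob))[:, None])
        beta = beta + np.linalg.solve(hessian, design.T @ (y - prob))
    prob = 1.0 / (1.0 + np.exp(-design @ beta))
    hessian = design.T @ (design * (prob * (1.0 - prob))[:, None])
    se = np.sqrt(np.diag(np.linalg.inv(hessian)))
    p_values = np.array([math.erfc(abs(z) / math.sqrt(2.0)) for z in beta / se])
    return beta, se, p_values


def screen_then_adjust(X, y, names, p_screen=0.2, p_final=0.05):
    """Univariate screening at p_screen, then one multivariate model on the survivors."""
    screened = [j for j in range(len(names))
                if fit_logistic(X[:, [j]], y)[2][1] < p_screen]
    beta, se, p_values = fit_logistic(X[:, screened], y)
    results = {}
    for k, j in enumerate(screened, start=1):  # position 0 is the intercept
        results[names[j]] = {
            "or": math.exp(beta[k]),
            "ci_low": math.exp(beta[k] - 1.96 * se[k]),
            "ci_high": math.exp(beta[k] + 1.96 * se[k]),
            "p": float(p_values[k]),
            "independent": bool(p_values[k] < p_final),
        }
    return results


# Synthetic stand-in data (NOT the study data); names are placeholders only.
rng = np.random.default_rng(1)
names = ["tumor_size", "plt_count", "height"]
X = rng.normal(size=(111, 3))
true_logit = 1.5 * X[:, 0] - 1.5 * X[:, 1]  # third variable is pure noise
y = (rng.random(111) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)
results = screen_then_adjust(X, y, names)
```

Each entry of `results` holds the adjusted OR, its 95% Wald CI, the p-value, and a flag for p < 0.05, which mirrors the quantities reported for the multivariate analysis.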
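Because the LF model is described as a logistic regression on the selected clinical risk factors plus the radiomics score, its general form can be written as below. The symbols are assumptions introduced here for illustration: $x_k$ denotes the $k$-th selected clinical variable, $R$ the radiomics score, and $\beta_0$, $\beta_k$, $\gamma$ the fitted coefficients, whose values are not reproduced in this section.

```latex
\operatorname{logit} P(\mathrm{PHLF})
  = \ln\frac{P(\mathrm{PHLF})}{1 - P(\mathrm{PHLF})}
  = \beta_0 + \sum_{k} \beta_k x_k + \gamma R,
\qquad
P(\mathrm{PHLF}) = \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \sum_{k} \beta_k x_k + \gamma R\right)\right]}
```

Under this form, the clinical model corresponds to dropping the $\gamma R$ term, and the radiomics model to replacing the clinical terms with the selected radiomics features.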

Statistical Analysis
Categorical variables were compared by using Chi-square tests or Fisher's exact tests. For continuous variables, Kolmogorov-Smirnov tests and Levene tests were first used for testing normality of distribution and homogeneity of variance, respectively. Student's t-test was used if the variable satisfied both normal distribution and homogeneous variance; otherwise, the Mann-Whitney U-test was used. All statistical tests were two-tailed. Python 3.6 software was used for all statistical analyses. Graphs were generated by using the matplotlib package and the seaborn package in Python 3.6 software and the pheatmap package in R 3.4 software.
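The test-selection logic described above can be sketched with SciPy. Several details are assumptions, since the text does not specify them: normality is checked per group against a normal distribution with the sample mean and standard deviation, a 0.05 level is used for the preliminary tests, and Fisher's exact test is chosen for 2x2 tables with any expected count below 5. The function names are hypothetical.

```python
import numpy as np
from scipy import stats


def compare_continuous(a, b, alpha=0.05):
    """Student's t-test if both groups look normal with equal variances, else Mann-Whitney U."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    normal = all(
        stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1))).pvalue > alpha
        for x in (a, b)
    )
    equal_var = stats.levene(a, b).pvalue > alpha
    if normal and equal_var:
        return "t-test", float(stats.ttest_ind(a, b, equal_var=True).pvalue)
    return "mann-whitney", float(stats.mannwhitneyu(a, b, alternative="two-sided").pvalue)


def compare_categorical(table, min_expected=5):
    """Chi-square test, or Fisher's exact test for sparse 2x2 tables (assumed rule)."""
    table = np.asarray(table)
    _, p_value, _, expected = stats.chi2_contingency(table)  # Yates correction applies to 2x2
    if table.shape == (2, 2) and (expected < min_expected).any():
        return "fisher", float(stats.fisher_exact(table)[1])
    return "chi-square", float(p_value)


# Skewed synthetic data: the normality check should fail, so Mann-Whitney U is used.
rng = np.random.default_rng(2)
skewed_test, skewed_p = compare_continuous(rng.exponential(size=500), rng.exponential(size=500))

# PHLF / non-PHLF counts by cohort as reported in this article (56/55 and 15/18).
cohort_test, cohort_p = compare_categorical([[56, 55], [15, 18]])
```

All tests here are two-sided, matching the statement that all statistical tests were two-tailed.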

Patient Characteristics
A total of 144 HCC patients were included in the study, among whom 111 were from FAHSYSU (training cohort) and 33 were from SYSUCC (test cohort). In the training cohort, the numbers of PHLF and non-PHLF patients were 56 and 55, respectively. In the test cohort, the numbers of PHLF and non-PHLF patients were 15 and 18, respectively. The clinical characteristics of patients in the two cohorts are shown and compared in Table 1.

Development, Performance, and Validation of the Radiomics Model
For each ROI of each patient, 1,044 radiomics features were extracted from the MRI image and then normalized. Pearson's correlation coefficients for the same feature across different ROIs from the same patient were calculated, and all were higher than 0.95, indicating high correlation. Thus, for each feature, the average value over the different ROIs was calculated and used in subsequent analyses. Pearson's correlation coefficients between all 1,044 radiomics features were calculated (Figure 2), and highly correlated features were randomly eliminated, leaving a total of 864 features. Subsequently, 24 radiomics features were selected with the RFE method, including three first-order histogram features, four GLRLM features, and 17 GLCM features (Figure 3). These 24 features are listed in Supplementary Table 2. Pearson's correlation coefficients between the 24 radiomics features are shown in Figure 2, and the ranks of the selected radiomics features are shown in Figure 3.
A radiomics model that included these 24 features was then developed (Supplementary Materials), which showed satisfactory performance in the training cohort and the test cohort, with AUCs of 0.900 (95% CI: 0.898-0.909) and 0.804 (95% CI: 0.792-0.845), respectively. The performance of the radiomics model in both cohorts is shown in Table 2 and Figures 4A,B.
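The ROI averaging and the correlation-based elimination described above can be sketched as follows. Two readings are assumed here and may differ from the original analysis: ROI agreement is computed per feature by correlating the values from different ROIs across patients, and the 0.86 threshold is applied to the absolute correlation. The random elimination is replaced by a deterministic rule (keep the first feature of each correlated group) so that the sketch is reproducible. Function names and data are hypothetical.

```python
import numpy as np


def roi_agreement(features, feature_index):
    """ROI-by-ROI Pearson correlations for one feature, paired within patients.

    features has shape (n_patients, n_rois, n_features).
    """
    return np.corrcoef(features[:, :, feature_index], rowvar=False)


def drop_correlated(X, threshold=0.86):
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return kept


# Synthetic stand-in data (NOT the study data): 30 patients, 4 ROIs, 6 features.
rng = np.random.default_rng(3)
patient_level = rng.normal(size=(30, 1, 6))
features = patient_level + 0.05 * rng.normal(size=(30, 4, 6))

# Lowest between-ROI correlation over all features; averaging is justified if it is high.
min_agreement = min(roi_agreement(features, f).min() for f in range(features.shape[2]))
averaged = features.mean(axis=1)  # one value per patient and feature

# Append a rescaled copy of feature 0; the filter should discard it as redundant.
X = np.column_stack([averaged, 2.0 * averaged[:, 0] + 0.001])
kept = drop_correlated(X)
```

In the study, the same kind of filter reduced 1,044 features to 864 before RFE was applied.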

Development, Performance, and Validation of the Clinical Model and the Combined Model
In the univariate logistic regression analysis of clinical features, height, viral hepatitis, splenomegaly, AST, PLT count, INR, ICG-R15, tumor size, and MELD score were identified and further included in the multivariate logistic regression analysis. The multivariate analysis found that PLT count and tumor size were independent risk factors for PHLF, with ORs of 0.990 (95% CI: 0.983-0.997) and 1.347 (95% CI: 1.139-1.594), respectively. The results of the univariate and multivariate logistic regression analyses are shown in Table 3.
A clinical model that included PLT count and tumor size was constructed, and a combined model called the LF model, which incorporated these two features and the radiomics score, was also constructed (Table 4). The performance of the clinical model and the LF model in both cohorts is shown in Table 5 and
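Because an OR of 0.990 per unit of PLT count can look negligible, it may help to translate the reported ORs into larger increments. The sketch below assumes that the reported intervals are Wald intervals (symmetric on the log-odds scale), that the effects are linear on the log-odds scale, and that the ORs are expressed per unit of the variable as used in Table 3 (the units are not stated in this section). The increments of 50 units and 2 units are arbitrary examples.

```python
import math


def wald_from_ci(or_point, ci_low, ci_high, z=1.96):
    """Recover the log-odds coefficient and its standard error from an OR and its 95% CI."""
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


def odds_ratio_for_change(beta, delta):
    """Odds ratio associated with a change of `delta` units in the variable."""
    return math.exp(beta * delta)


# Values reported in the multivariate analysis above.
plt_beta, plt_se = wald_from_ci(0.990, 0.983, 0.997)
size_beta, size_se = wald_from_ci(1.347, 1.139, 1.594)

# Consistency check: for a Wald interval, the point estimate sits at the
# midpoint of the interval on the log scale (up to rounding of the reported values).
plt_midpoint_gap = abs(plt_beta - 0.5 * (math.log(0.983) + math.log(0.997)))
size_midpoint_gap = abs(size_beta - 0.5 * (math.log(1.139) + math.log(1.594)))

or_plt_minus_50 = odds_ratio_for_change(plt_beta, -50)  # PLT count lower by 50 units
or_size_plus_2 = odds_ratio_for_change(size_beta, 2)    # tumor size larger by 2 units
```

Under these assumptions, a PLT count that is 50 units lower corresponds to roughly 1.65-fold higher odds of PHLF, and a tumor that is 2 units larger to roughly 1.81-fold higher odds, which makes the two effects easier to compare.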

DISCUSSION
In this study, we developed and validated the LF model, which incorporated clinical and radiomics features for predicting PHLF in HCC patients and had the best performance compared with the clinical model and the radiomics model. Although the clinical model, which included tumor size and PLT count, could also predict PHLF in HCC patients, its performance was inferior to that of the radiomics model and the LF model. These results suggest that MRI images contain important information for liver function assessment and that radiomics has considerable potential in mining image information for PHLF prediction. Compared with other methods for liver function evaluation or prediction, our radiomics model has potential advantages in terms of convenience, effectiveness, and cost. Thus, the radiomics model could assist clinicians in planning treatment strategies.
Clinically, preoperative medical imaging is routinely performed for assessing tumor status and liver function in HCC patients. In fact, with the rapid development of medical imaging technology, it is increasingly important to obtain information on tumor areas and non-tumor areas from medical images. It has been shown that the FLR can be accurately measured before surgery by simulated resection assessment using 3D computed tomography (CT) imaging (24) and that the ratio of CT-derived liver volume (CTLV) to standard liver volume (SLV) can be used to predict the prognosis of acute liver failure (25). However, the physical volume of the liver does not always reflect liver function, which could be affected by blood supply, liver cirrhosis, and other factors (26). MRI is another valid and commonly used imaging method for assessing liver function preoperatively, especially dynamic hepatocyte-specific contrast-enhanced MRI (DHCE-MRI) with gadolinium-based contrast agents like Gd-EOB-DTPA. DHCE-MRI was found to be an ideal candidate for accurate determination of liver function before liver resection (27). Therefore, in the present study, we used Gd-EOB-DTPA-enhanced MRI as a valid preoperative imaging tool for PHLF risk assessment in HCC patients.
In recent years, due to the rapid development of related technologies, radiomics has become increasingly promising in medical research. Radiomics uses advanced computational methods to explore the features of traditional images in depth for cancer diagnosis, tumor staging, prognosis prediction, disease monitoring, and so on (28,29). In this study, based on radiomics, we mined potential risk factors of PHLF from preoperative MRI images of HCC patients and developed prediction models. The radiomics model performed well in the training cohort and the test cohort. It included a total of 24 selected radiomics features (three first-order histogram features, four GLRLM features, and 17 GLCM features) that potentially reflected the features of the non-tumor area under the influence of the tumor. The first-order histogram features reflect overall differences between MRI images at a lower hierarchical level. Both the GLRLM features and the GLCM features are texture features, which have previously been reported to have potential value in HCC (14,15). The GLRLM features are related to the grayscale distribution of the image, and the grayscale change of the image is indicative of the heterogeneity of the tissue. The GLCM is a matrix describing the grayscale relationship between a pixel and its neighbors or pixels within a certain distance in a region. The GLCM features here are further divided into five subclasses: Cluster Prominence, Cluster Shade, Correlation, Joint Entropy, and Inverse Difference Moment (IDM). Cluster Prominence reflects the abruptness of different tissues in MRI images and indicates abnormal features in liver and tumor tissues. Cluster Shade is related to the symmetry of MRI images and suggests characteristic differences within normal liver tissue and between tumor and normal liver tissue. The IDM and Joint Entropy reflect the degree of regularity of the image texture. A lower IDM and a higher Joint Entropy indicate a more irregular MRI image texture and greater tumor heterogeneity.
In our study, tumor size and PLT count were found to be independent clinical risk factors of PHLF. Although it is generally accepted that patients with large or multiple tumors can still be considered candidates for surgical resection, they are prone to develop PHLF because of potentially insufficient FLR after extensive resection. Ma et al. studied 2,613 patients who underwent hepatectomy and found that the incidence of PHLF was significantly higher in the group with tumor diameter ≥ 50 mm than in the group with tumor diameter < 50 mm (8). Therefore, during the clinical management of HCC, tumor size should be accurately evaluated and considered in order to help prevent PHLF and poor prognosis after resection. PLT count is one of the routine preoperative tests for surgical patients. PLTs play an important role in cooperating with hepatic sinusoidal endothelial cells and Kupffer cells to directly induce hepatocyte regeneration and improve liver function (30-33). Preoperative thrombocytopenia was reported to be an important independent predictor of postoperative morbidity and mortality (34). Thus, the need for additional perioperative care in patients with thrombocytopenia has also been proposed (35). Ohkohchi et al. demonstrated the clinical impact of PLT transfusion by showing that it improved liver function in patients with chronic liver disease (36,37). In summary, an insufficient PLT count is a significant risk factor for HCC patients, and its correction could reduce the incidence of PHLF.
Our study had some limitations that need to be considered. Firstly, most HCC patients included in the current study had hepatitis B, while only a few had hepatitis C. In western countries, however, hepatitis C virus infection and alcoholic steatohepatitis are the main causes of HCC. Secondly, since only HCC patients who underwent hemihepatectomy were included, the remnant liver volume was not used as a risk factor. In fact, hepatectomy with less than half of the liver being resected would have a significantly lower risk of PHLF (9). Thirdly, relevant studies have also reported that intraoperative factors such as intraoperative bleeding, intraoperative blood transfusion, and hepatic portal block were related to PHLF (38,39). As the aim of this study was to develop a preoperative prediction model for PHLF, intraoperative factors were not included. Fourthly, we used Gd-EOB-DTPA-enhanced MRI images in this study, and the uptake of Gd-EOB-DTPA in HCC could be influenced by some factors, which need to be investigated in future studies. Finally, since this was a retrospective study with a relatively small data set, the results need to be further validated in a large-scale prospective study.
In conclusion, our study, to our knowledge for the first time, comprehensively evaluated radiomics features of MRI images in HCC patients and established a radiomics-based clinical model for predicting PHLF, which could potentially be applied to assist treatment strategy development.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.