Multi-Sequence MR-Based Radiomics Signature for Predicting Early Recurrence in Solitary Hepatocellular Carcinoma ≤5 cm

Purpose: To investigate the value of radiomics features derived from preoperative multi-sequence MR images for predicting early recurrence (ER) in patients with solitary hepatocellular carcinoma (HCC) ≤5 cm.
Methods: One hundred and ninety HCC patients were enrolled and allocated to training and validation sets (n = 133:57). The clinical–radiological model was built from significant clinical risk characteristics and qualitative imaging features. The radiomics model was constructed using the least absolute shrinkage and selection operator (LASSO) logistic regression algorithm in the training set. The combined model was formed by integrating the clinical–radiological risk factors and the selected radiomics features. Predictive performance was assessed by the area under the receiver operating characteristic curve (AUC).
Results: Arterial peritumoral hyperenhancement, non-smooth tumor margin, satellite nodules, cirrhosis, serosal invasion, and albumin showed a significant correlation with ER. The AUC of the clinical–radiological model was 0.77 (95% CI: 0.69–0.85) and 0.76 (95% CI: 0.64–0.88) in the training and validation sets, respectively. The radiomics model, constructed using 12 radiomics features selected by LASSO regression, had an AUC of 0.85 (95% CI: 0.79–0.91) and 0.84 (95% CI: 0.73–0.95) in the training and validation sets, respectively. The combined model further improved the prediction performance compared with the clinical–radiological model, increasing the AUC to 0.90 (95% CI: 0.85–0.95) in the training set and 0.88 (95% CI: 0.80–0.97) in the validation set (p < 0.001 and p = 0.012, respectively). The calibration curve agreed well with the standard curve.
Conclusions: The predictive model incorporating the clinical–radiological risk factors and radiomics features could adequately predict the individualized ER risk in patients with solitary HCC ≤5 cm.


INTRODUCTION
Hepatocellular carcinoma (HCC) is the sixth most common malignancy and the third leading cause of cancer-related mortality globally (1). In China, newly diagnosed cases of HCC account for almost half of the global cases annually, which seriously threatens the life and health of the Chinese people (2). For patients with early-stage HCC (solitary HCC ≤5 cm or up to three nodules ≤3 cm, without macrovascular invasion and extrahepatic spread) and adequate liver function (3), hepatectomy is still widely accepted as the first-line treatment option in most centers; in particular, early solitary HCC is an ideal surgical indication in clinical practice. Unfortunately, the long-term survival in patients with HCC remains unsatisfactory, with the 5-year recurrence rate at 50%-70% (4).
According to the current clinical practice guidelines, HCC recurrence is usually divided into early and late recurrence by the 2-year cutoff point (5)(6)(7)(8). Early recurrence (ER) accounts for more than 70% of tumor recurrence and is likely caused by occult metastasis of the primary tumor (6). The time of recurrence is a significant survival factor, and the overall survival time of HCC patients with ER is often shorter than that of patients without ER (8)(9)(10). Previous studies have reported several risk factors of ER, such as large tumor volume, multiple tumors, poor differentiation, satellite lesions, non-smooth tumor margins, vascular invasion, and peritumoral parenchymal enhancement in the arterial phase (AP) (11)(12)(13)(14)(15)(16).
Radiomics is a process of converting digital medical images into large numbers of quantitative features in a high-throughput manner using different algorithms; these features provide valuable diagnostic, prognostic, or predictive information (17). To date, radiomics has been used to predict the postoperative ER of other types of cancer (18)(19)(20). As a non-invasive and effective tool, radiomics plays an important role in predicting ER of HCC after hepatectomy, transcatheter arterial chemoembolization, and radiofrequency ablation (15, 16, 21), with relatively high diagnostic accuracy. However, few studies have focused on radiomics analysis derived from multi-sequence MR images to predict postoperative ER of solitary HCC with a diameter ≤5 cm.
Previous studies showed that tumor diameter greater than 5 cm was closely related to ER and high mortality (22)(23)(24). However, few studies specifically predict ER of solitary HCC with a diameter ≤ 5 cm after hepatectomy. Therefore, it is very important to identify risk factors related to ER for guiding further clinical treatment and improving the long-term survival of HCC patients.
The aim of this study was to develop and validate an effective and visualized model based on multi-sequence MR images to predict ER in patients with solitary HCC ≤5 cm.

Patients
This retrospective study received ethical approval, and the requirement for informed consent was waived. From January 2012 to December 2017, 712 consecutive patients underwent R0 resection in our hospital. The inclusion criteria were the following: a) histologically proven HCC with a negative resection margin, b) solitary tumor ≤5 cm, c) no preoperative history of cancer-related treatments (including surgery and interventional therapy), d) high-quality MR images performed 4 weeks preoperatively (the lesions were clearly displayed without obvious external and respiratory motion artifacts), and e) at least 2 years of follow-up. Finally, a total of 190 HCC patients (80 patients with ER and 110 patients with non-ER) were included in this retrospective study. The enrolled patients were divided into a training set (56 patients with ER and 77 patients with non-ER) and a validation set (24 patients with ER and 33 patients with non-ER) at a ratio of 7:3 (Figure 1).
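The 7:3 allocation described above can be illustrated with a short sketch. The text does not state how the split was performed; the reported class counts (56/77 and 24/33) are consistent with a split stratified by ER status, which is what this example assumes. The helper name `stratified_split`, the seed, and the synthetic labels are hypothetical and serve only as an illustration.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into training/validation sets, class by class.

    Assumption: stratification by outcome; the paper only reports a 7:3 ratio.
    """
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(len(idx) * train_fraction)
        train_idx += idx[:n_train]
        valid_idx += idx[n_train:]
    return sorted(train_idx), sorted(valid_idx)


# Synthetic labels matching the cohort sizes: 80 ER (1) and 110 non-ER (0).
labels = [1] * 80 + [0] * 110
train_idx, valid_idx = stratified_split(labels)
n_train_er = sum(labels[i] for i in train_idx)
n_valid_er = sum(labels[i] for i in valid_idx)
print(len(train_idx), n_train_er, len(valid_idx), n_valid_er)
```

Splitting each class separately keeps the ER proportion (about 42%) the same in both sets, which matters when the validation set contains only 57 patients.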
The clinical and pathological variables were obtained from the electronic medical record system for all patients, including demographic characteristics, preoperative laboratory data, and postoperative pathological data.
Two abdominal radiologists (LW and BF, with 3 and 6 years' experience, respectively) reviewed all MR images. Both radiologists were blinded to all clinical and pathological information. They independently evaluated and recorded the following basic MR image features, and any disagreements were resolved by consensus through discussion: a) maximum tumor diameter (maximum diameter measured on axial MR images in the portal venous phase [PVP]), b) liver background (cirrhosis or non-cirrhosis), c) location (left lobe, right lobe, left and right lobes, or caudate lobe), d) intratumoral fat (presence or absence, defined as a reduced signal on the opposed-phase images compared with the in-phase images), e) DWI intensity (hyperintense or slightly hyperintense), f) capsule (complete or absent/incomplete), g) dynamic enhancement pattern (gradual enhancement, persistent enhancement, wash in and wash out, or minimal/no enhancement), h) tumor margin (smooth or non-smooth), and i) arterial peritumoral hyperenhancement (APHE; defined as relatively high intensity of the liver parenchyma outside the tumor boundary in AP that became isointense in the subsequent phases) (12).
Abbreviations: AFP, alpha-fetoprotein; AP, arterial phase; APHE, arterial peritumoral hyperenhancement; AUC, area under the curve; DCE, dynamic contrast-enhanced; DP, delayed phase; DWI, diffusion-weighted imaging; ER, early recurrence; HCC, hepatocellular carcinoma; ICC, intraclass correlation coefficient; LASSO, least absolute shrinkage and selection operator; MVI, microvascular invasion; NPV, negative predictive value; PPV, positive predictive value; PVP, portal venous phase; ROC, receiver operating characteristic; T2WI/FS, T2-weighted imaging with fat suppression.

Tumor Segmentation and Radiomics Feature Extraction
T2WI/FS images and three-phase DCE-MR images were used for feature extraction. Before tumor segmentation, all preoperative MR images were resampled to a uniform voxel size of 1 × 1 × 1 mm³ using Artificial Intelligence Kit software (version 3.3.0, GE Healthcare, China). Three-dimensional manual segmentation was performed by a radiologist with 3 years' MR experience using ITK-SNAP software (v.3.6.0; www.itksnap.org; open-source software). The volumes of interest (VOIs) were manually drawn along the boundary of the tumor on each consecutive slice for all 190 lesions. To assess the intraclass correlation coefficient (ICC), 40 lesions were randomly chosen and segmented independently by another radiologist with 6 years' experience. In total, 1,316 radiomics features were extracted from each sequence using the Artificial Intelligence Kit software based on the open-source Pyradiomics Python package, which included the following: first-order histogram features (n = 18), texture features (n = 89, including 14 shape features, 16 gray-level size zone matrix (GLSZM) features, 16 gray-level run-length matrix (GLRLM) features, 24 gray-level co-occurrence matrix (GLCM) features, 14 gray-level dependence matrix features, and 5 neighboring gray-tone difference matrix features), wavelet features (n = 744), local binary pattern features (n = 279), and Laplacian of Gaussian (logSigma = 2.0/3.0) features (n = 186).
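The reproducibility check on the re-segmented lesions can be sketched as follows. The text does not say which ICC form was used; this example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1). The function name `icc_2_1`, the feature names, and all numbers are hypothetical. The 0.75 cutoff is the one applied in the feature selection step of this study.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per lesion and one column per reader.

    Assumption: two-way random effects, absolute agreement, single measurement.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)               # between-lesion mean square
    ms_cols = ss_cols / (k - 1)               # between-reader mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical values of two features measured by two readers on five lesions.
feature_values = {
    "stable_feature": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2], [5.0, 4.9]],
    "unstable_feature": [[1.0, 4.0], [2.0, 1.0], [3.0, 5.0], [4.0, 2.0], [5.0, 3.0]],
}
icc_by_feature = {name: icc_2_1(vals) for name, vals in feature_values.items()}
retained = [name for name, icc in icc_by_feature.items() if icc > 0.75]
print(retained)
```

Only features whose values are stable across the two segmentations pass the cutoff and move on to model building.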

Radiomics Feature Selection and Signature Construction
Features with ICC > 0.75 were considered to have satisfactory consistency and were retained for subsequent analysis. The least absolute shrinkage and selection operator (LASSO) logistic regression algorithm was used to identify the most predictive radiomics features, and 10-fold cross validation was used as the inner resampling loop to tune the model parameter (Figure 2). The radiomics score (Rad-score) was calculated as the linear combination of the selected features weighted by their respective LASSO coefficients. Considering the small sample size of our datasets, this radiomics model was further verified with a 100-iteration bootstrap as the outer resampling loop: the whole dataset was randomly divided into a training set and a validation set 100 times, and the existing radiomics model was tested on each of the 100 new testing datasets.
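The selection step and the Rad-score can be sketched with scikit-learn. This is an illustration on synthetic data, not the software or settings used in the study: the feature matrix, the outcome, the grid of 20 candidate penalties, the `liblinear` solver, and the AUC scoring criterion are all assumptions. An L1-penalized logistic regression is tuned by 10-fold cross validation, and the Rad-score is then formed from the features with nonzero coefficients.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_patients, n_features = 133, 50  # synthetic stand-in for the training set

# Synthetic features; only the first three truly drive the outcome.
X = rng.normal(size=(n_patients, n_features))
logit = 1.5 * X[:, 0] - 1.0 * X[:, 1] + 0.8 * X[:, 2]
y = (rng.random(n_patients) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# Standardize so that the L1 penalty treats all features on the same scale.
X_std = StandardScaler().fit_transform(X)

# L1-penalized (LASSO) logistic regression; penalty strength tuned by 10-fold CV.
model = LogisticRegressionCV(
    Cs=20, cv=10, penalty="l1", solver="liblinear",
    scoring="roc_auc", random_state=0,
)
model.fit(X_std, y)

coef = model.coef_.ravel()
selected = np.flatnonzero(coef)  # indices of features with nonzero coefficients

# Rad-score: linear combination of the selected features weighted by their
# coefficients (intercept omitted here, following the definition in the text).
rad_score = X_std[:, selected] @ coef[selected]
print(len(selected), rad_score[:3])
```

Because the L1 penalty shrinks most coefficients exactly to zero, feature selection and model fitting happen in a single step, which is useful when the number of candidate features far exceeds the number of patients.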
Clinical-radiological variables with p < 0.05 in the univariate analysis were included in the multivariate logistic regression analysis to confirm risk factors associated with ER, and the clinical-radiological model was generated. A combined model was developed by incorporating the clinical-radiological risk factors and the Rad-score. Receiver operating characteristic (ROC) curves were generated for those three models (a clinical-radiological model, a radiomics model, and a combined model). Accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the area under the ROC curve (AUC) were calculated.
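The performance measures listed above can be computed as in the following sketch. The AUC is obtained from its rank interpretation (the probability that a randomly chosen ER patient scores higher than a randomly chosen non-ER patient, with ties counted as one half). The function names, the toy scores, and the 0.5 threshold are hypothetical, and the sketch assumes both classes and both predicted categories are present.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def diagnostic_metrics(scores, labels, threshold):
    """Accuracy, sensitivity, specificity, PPV, and NPV at a given score threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Toy predicted probabilities and true ER status (1 = ER, 0 = non-ER).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
model_auc = auc(scores, labels)
metrics = diagnostic_metrics(scores, labels, threshold=0.5)
print(model_auc, metrics)
```

The AUC does not depend on a threshold, whereas the other five measures change with the chosen cutoff, so the cutoff should be reported alongside them.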

Follow-Up
Serum alpha-fetoprotein (AFP) levels were measured and contrast-enhanced CT/MRI was performed every 3-6 months for 2 years after surgery. ER was defined as intrahepatic tumor relapse (with typical HCC imaging features or confirmed by pathology) or metastasis (distant metastasis or lymph node metastases) within 2 years after surgery.

Statistical Analysis
Categorical variables were compared by using the chi-square test or Fisher's exact test, and continuous variables were compared by using Student's t-test or the Mann-Whitney U test, as appropriate. All statistical analyses were performed with SPSS software (version 25.0, IBM). The performance of the models was compared using the DeLong test. A two-sided p < 0.05 was considered statistically significant.
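The DeLong comparison of two correlated AUCs can be sketched in a few lines. This is an illustrative implementation of the standard DeLong variance estimate for two models scored on the same patients, not the procedure of the statistical package used in the study; the function names and the toy data are hypothetical.

```python
import math
from statistics import variance  # sample variance (n - 1 denominator)


def _placements(scores, labels):
    """Per-positive and per-negative structural components of the AUC."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]

    def psi(p, q):
        return 1.0 if p > q else 0.5 if p == q else 0.0

    v10 = [sum(psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(psi(p, q) for p in pos) / len(pos) for q in neg]
    return v10, v01


def delong_test(scores_a, scores_b, labels):
    """Return (auc_a, auc_b, z, two-sided p) for two models on the same cases."""
    v10_a, v01_a = _placements(scores_a, labels)
    v10_b, v01_b = _placements(scores_b, labels)
    auc_a = sum(v10_a) / len(v10_a)
    auc_b = sum(v10_b) / len(v10_b)
    # Variance of the AUC difference from the paired component differences.
    d10 = [a - b for a, b in zip(v10_a, v10_b)]
    d01 = [a - b for a, b in zip(v01_a, v01_b)]
    var_diff = variance(d10) / len(d10) + variance(d01) / len(d01)
    if var_diff == 0:
        raise ValueError("zero variance: the two score sets rank cases identically")
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p_value


# Toy scores from two hypothetical models on the same eight patients.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores_a = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
scores_b = [0.6, 0.4, 0.7, 0.2, 0.5, 0.8, 0.3, 0.1]
auc_a, auc_b, z, p_value = delong_test(scores_a, scores_b, labels)
print(auc_a, auc_b, z, p_value)
```

Because both models are evaluated on the same patients, their AUC estimates are correlated; working with the paired differences of the components accounts for that correlation, which an unpaired comparison would ignore.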

Clinical Characteristics
Overall, 190 HCC patients (male/female, 163/27; mean age, 54.86 ± 9.03 years; age range, 27-80 years) who met the inclusion criteria were included and divided into the training set (n = 133; male/female, 112/21; mean age, 54.39 ± 9.04 years) and validation set (n = 57; male/female, 51/6; mean age, 55.95 ± 8.98 years). Eighty (42.1%) of 190 patients with solitary HCC ≤5 cm experienced postoperative ER. Of the 80 patients with ER, cirrhosis was present in 63, and cirrhosis was strongly associated with ER in both the training set (p = 0.011) and validation set (p = 0.001). Except for prothrombin time (p = 0.047), no statistical difference was observed between the two sets in the clinical and radiological characteristics (all p > 0.05), as shown in Tables 1a, 1b.

Clinical-Radiological Model Construction and Validation
Univariate analysis showed that eleven clinical and radiological characteristics, including age, cirrhosis, enhancement pattern, non-smooth tumor margin, APHE, T stage, microvascular invasion (MVI), satellite nodules, serosal invasion, gamma-glutamyl transpeptidase level, and albumin level, were significantly different between the ER and non-ER groups in the training set (all p < 0.05).

Radiomics Model Construction and Validation
Among 1,316 radiomics features extracted from multi-sequence MR images, the LASSO analysis selected 12 features with nonzero coefficients to calculate the Rad-score (two, two, two, and six features from T2WI, AP, PVP, and DP images, respectively). The following formula was used to obtain the corresponding Rad-score for each patient: Rad-score = -0.3 × T2_original_shape_Sphericity - 0.157 × AP_wavelet_HLL_glcm_ClusterShade + 0.363 × DP_lbp_3D_k_glrlm_ShortRunLowGrayLevelEmphasis - 0.223 × AP_wavelet_LHH_glszm_HighGrayLevelZoneEmphasis … Table 3.
For ER prediction, the combined model outperformed both the clinical-radiological model (p < 0.001) and the radiomics model (p = 0.023) in the training set. However, no significant difference was observed between the combined model and the radiomics model in the validation set (p = 0.174), although the combined model showed better performance than the clinical-radiological model in the validation set (p = 0.012). ROC curves for the prediction of ER were compared among the clinical-radiological, radiomics, and combined models (Figure 3).

Nomogram Construction and Validation
The combined model-based nomogram is presented in Figure 4.
The Hosmer-Lemeshow test yielded no significant difference in either the training or the validation set (all p > 0.05). The calibration curves (Figure 5) revealed that the predictive probability of the nomogram was consistent with the actual ER probability in both sets. The decision curve (Figure 6) showed that the combined model had the highest net benefit.
P Intra indicates whether significant differences exist between the two groups. P Inter represents whether significant differences exist between the two sets. AFP, alpha-fetoprotein; ALT, alanine transaminase; AST, aspartate aminotransferase; LDH, lactate dehydrogenase; GGT, gamma-glutamyl transpeptidase; TBIL, total bilirubin; DBIL, direct bilirubin; IBIL, indirect bilirubin; TP, total protein; ALB, albumin; G, globulin; PLT, platelets; PT, prothrombin time; MVI, microvascular invasion; ER, early recurrence; IQR, interquartile range.

DISCUSSION
As an emerging quantitative analysis method, radiomics plays an important role in predicting ER of HCC after hepatectomy. However, to our knowledge, few studies have investigated the relationship between radiomics characteristics based on multi-sequence MR images and ER of solitary HCC ≤5 cm. Zhao et al. (25) found that the radiomics model based on multi-sequence MR images had the best predictive ability compared with single sequences and other sequence combinations. Additionally, Zhang et al. (16) developed and validated a radiomics nomogram for predicting ER using whole-lesion radiomics features extracted from multi-sequence MR images. Their results indicated that the radiomics nomogram had a fairly good discriminative performance. Our results were consistent with these previous studies. Of the 12 features in the radiomics model, 2 came from T2WI and 10 from DCE images, suggesting that DCE images contribute more to the prediction of ER.
The present study confirmed that the radiomics model based on the preoperative multi-sequence MR images (including T2WI/FS and DCE-MR images) had a higher predictive ability for ER than the clinical-radiological model with AUC values of 0.85 and 0.84 in the training and validation sets, respectively. This result indicated that radiomics features extracted from multi-sequence MR images might contain more biological and heterogeneity information than the clinical-radiological characteristics, which could further improve the predictive performance.
APHE is an auxiliary diagnostic feature of malignant tumors in the liver imaging reporting and data system. Previous studies have shown that APHE was more frequently observed in the ER group than in the non-ER group and was identified as an independent predictor of ER (11,13). The results of our study were consistent with these studies. A possible reason is that APHE is a feature associated with hypervascular progressed HCC and reflects enhancement of the venous drainage area in the peritumoral liver parenchyma during multistep hepatocarcinogenesis (26). The non-smooth tumor margin has been shown to be closely related to tumor invasion and poor prognosis (16,27,28). Ariizumi et al. (29) reported that the incidence of portal vein invasion and intrahepatic metastasis in HCC patients with non-smooth margins was significantly higher than in patients with smooth margins. Additionally, their findings confirmed that the non-smooth margin was an important predictor of ER. In our study, the non-smooth tumor margin was also strongly correlated with ER. As an imaging biomarker with important clinical application value, the non-smooth tumor margin is closely related to tumor heterogeneity and invasive behavior, which leads to a higher probability of ER.
HCC is rare among patients without liver disease, and hepatitis B virus (HBV)-induced cirrhosis is the main risk factor for HCC (30). Yao et al. (31) found that cirrhosis was an independent risk factor associated with postoperative recurrence (p < 0.001). The incidence of cirrhosis in HCC patients with ER was higher than that in patients with late recurrence, but there was no significant statistical difference (p > 0.05). Portolani et al. (32) reported that cirrhosis was significantly associated with ER. Our results also showed that cirrhosis was an independent risk factor for ER. In this study, satellite nodules were defined as nodules that were invisible on images but were reported around the primary tumors by postoperative pathology, and the presence of satellite nodules significantly predicted ER. In addition, liver serosal invasion was an independent risk factor for postoperative ER in our study. Few studies have explored the relationship between serosal invasion and ER. Yamamoto et al. (33) reported that serosal invasion was associated with ER (p = 0.031). More studies are needed to confirm this conclusion in the future. Interestingly, in our study, MVI had no significant correlation with ER in the multivariate analysis, though it was a significant factor in the univariate analysis. Numerous studies reported that MVI was a significant risk factor associated with ER of HCC (25)(34)(35)(36). This discrepancy possibly exists because MVI is related to the aggressive behavior of the primary tumor, and the frequency of MVI in HCC with a diameter less than 5 cm is significantly lower than in large or multifocal HCC, as reported in previous studies (37)(38)(39). Another possible reason is that HCC patients with a tumor diameter ≤5 cm generally undergo radical surgical resection, which may have a certain impact on reducing the risk of postoperative ER.
This study has several limitations. Firstly, selection bias was inevitable due to the retrospective nature of the study. To increase reliability, we applied the model obtained from the training set to the validation set. Secondly, this was a single-center study from an area with a high incidence of HBV or hepatitis C virus infection, so the conclusions may not apply to populations with different liver diseases. Thirdly, we developed a prediction model only for ER and did not include late recurrence or long-term survival analyses because of the short postoperative follow-up time, which needs further investigation. Lastly, only patients with a single lesion ≤5 cm were recruited; therefore, the conclusions may not extend to nodules with a maximum diameter >5 cm or to multiple nodules. Thus, the results of this study need to be verified by larger, prospective studies in the future.

CONCLUSIONS
In conclusion, our findings showed that the combined model integrating clinical-radiological risk factors with the radiomics signature demonstrated good discriminative ability for predicting ER in HCC patients with a single nodule ≤5 cm and may serve as a non-invasive and visualized tool in clinical decision-making. More multicenter, prospective studies will be needed to investigate the role of radiomics analysis in clinical practice in the future.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Cancer Hospital, Chinese