A Novel Multimodal Radiomics Model for Predicting Prognosis of Resected Hepatocellular Carcinoma

Objective To explore a new model for predicting the prognosis of liver cancer based on MRI and CT imaging data. Methods A retrospective study of 103 patients with histologically proven hepatocellular carcinoma (HCC) was conducted. Patients were randomly divided into training (n = 73) and validation (n = 30) groups. A total of 1,217 radiomics features were extracted from regions of interest on CT and MR images of each patient. Univariate Cox regression, Spearman’s correlation analysis, Pearson’s correlation analysis, and least absolute shrinkage and selection operator Cox analysis were used for feature selection in the training set; multivariate Cox proportional hazards models were established to predict disease-free survival (DFS) and overall survival (OS); and the models were validated using validation cohort data. Multimodal radiomics scores, integrating CT and MRI data, were applied, together with clinical risk factors, to construct nomograms for individualized survival assessment, and calibration curves were used to evaluate model consistency. Harrell’s concordance index (C-index) values were calculated to evaluate the prediction performance of the models. Results The radiomics score established using CT and MR data was an independent predictor of prognosis (DFS and OS) in patients with HCC (p < 0.05). Prediction models illustrated by nomograms for predicting prognosis in liver cancer were established. Integrated CT, MRI, and clinical multimodal data had the best predictive performance in the training and validation cohorts for both DFS [C-index (95% CI): 0.858 (0.811–0.905) and 0.704 (0.563–0.845), respectively] and OS [C-index (95% CI): 0.893 (0.846–0.940) and 0.738 (0.575–0.901), respectively]. The calibration curve showed that the multimodal radiomics model provides greater clinical benefits. Conclusion Multimodal (MRI/CT) radiomics models can serve as effective visual tools for predicting prognosis in patients with liver cancer. This approach has great potential to improve treatment decisions when applied for preoperative prediction in patients with HCC.


INTRODUCTION
Hepatocellular carcinoma (HCC) is the most common primary liver tumor, accounting for 75%-85% of liver cancers (1). HCC is the second most common cause of cancer death worldwide and has high morbidity and mortality rates (2). Surgical resection and local ablation remain the most commonly used radical treatment methods for HCC; however, tumors recur in 70% of cases after hepatectomy and 25% of cases after liver transplantation, and the 5-year overall survival (OS) rate is only approximately 25%-55% (3)(4)(5). Hence, patients with HCC have a poor prognosis after surgery, and the high disease recurrence rate represents a great challenge to successful treatment (3,6). Therefore, the identification of reliable predictors of early recurrence is critical for patient risk stratification, support for treatment decisions, and improvement of long-term survival.
At present, relevant tumor factors, such as lesion diameter, cirrhosis, multifocality, poor tumor differentiation, and microvascular invasion (MVI), are recognized as risk factors for early disease recurrence (7)(8)(9)(10); however, most of these features can only be evaluated by postoperative histopathological examination, which is invasive and prone to missed diagnoses. In oncology, the application of radiomics, which involves the transformation of traditional medical images into high-dimensional, quantitative, and exploitable imaging data, enables in-depth characterization of tumor phenotypes and has the potential to provide information on intra-tumor heterogeneity and predict posttreatment survival (11,12). Multimodal machine learning processes and interprets information from multiple modalities, and multimodal fusion combines that information to solve targeted classification or regression problems (13)(14)(15). Medical imaging can include data in different forms, such as CT, MRI, PET, ultrasound, and X-rays. In different guidelines, either CT or MRI is proposed as the best imaging modality for the diagnosis of HCC (16)(17)(18). Recent HCC management guidelines recognize an increasing role for gadoxetic acid-enhanced MRI in early diagnosis and post-resection monitoring (19). Either CT or MRI can confirm the diagnosis if a nodule larger than 1 cm in diameter is found with typical vascular features of HCC (hypervascularity in the arterial phase with washout in the portal venous or delayed phase) (20). Further, both CT and MR functional scans can be useful as supplements to conventional plain scans and dynamic enhancement to improve the accuracy of follow-up evaluation of liver cancer (21). In recent years, several qualitative MRI and CT imaging features have been reported.
Preliminary evidence suggests that radiomics features have the potential to predict OS and tumor recurrence in patients with HCC, for example, by assessing peritumor parenchymal enhancement, satellite nodules, and non-smooth tumor margins, which are noninvasive predictors of early HCC recurrence (22)(23)(24).
Multimodal fusion can be performed at the pixel level, the feature level, or the decision level, fusing the original data, abstract features, or decision results, respectively (13)(14)(15). To date, radiomics has been successfully applied in the study of nasopharyngeal carcinoma, non-small cell lung cancer, and rectal cancer (25)(26)(27), demonstrating the great potential of this approach; however, to our knowledge, comparative analysis of contrast-enhanced CT and contrast-enhanced MR sequence data to assess patient prognosis remains rare. In this study, we combined these two imaging techniques and explored the performance of multimodal radiomics models derived from MR and CT image data for prognostic evaluation following HCC resection.

Abbreviations: HCC, hepatocellular carcinoma; DFS, disease-free survival; OS, overall survival; C-index, Harrell's concordance index; TACE, transarterial chemoembolization; ALT, alanine aminotransferase; AST, aspartate aminotransferase; TBIL, total bilirubin; ALB, albumin; AFP, alpha-fetoprotein; ROI, region of interest; ICCs, intra-class and inter-class correlation coefficients; GLCM, gray-level co-occurrence matrix-based features; GLRLM, gray-level run-length matrix-based features; GLSZM, gray-level size zone matrix-based features; GLDM, gray-level dependence matrix-based features; Log, Laplace wavelet; LASSO, least absolute shrinkage and selection operator; KM, Kaplan-Meier; Radscore, radiomics score; MVI, microvascular invasion; BMI, body mass index; PV_TT, portal vein tumor thrombosis; PLT, platelet count; HBsAg, hepatitis B surface antigen status; PT, prothrombin time; NEUT, neutrophil count.

Patients
This study was approved by the Ethics Committee of the Affiliated Hospital of Qingdao University. Due to its retrospective nature, the need for written informed consent was waived. From February 2014 to December 2020, we collected information from 306 patients with liver cancer, and 135 patients with primary HCC were recruited based on the following inclusion criteria: 1) pathologically confirmed liver cancer recorded in the medical records at our hospital and 2) CT and MRI examinations performed within 2 weeks before hepatectomy. The exclusion criteria were as follows: 1) preoperative treatments other than hepatectomy [transarterial chemoembolization (TACE), targeted drugs, or radiofrequency ablation] (n = 11); 2) incomplete clinicopathological report (n = 10); 3) poor CT or MR image quality, such that the lesion could not be recognized or appeared on fewer than three layers (n = 3); 4) loss to follow-up (n = 4); and 5) an error in the feature extraction process (n = 4).
The final study population included 103 patients. The entire cohort was randomly divided into a training cohort (n = 73) and a validation cohort (n = 30) at a ratio of 7:3. The training cohort was used to build single-modal and multimodal radiomics models, which were evaluated in the validation cohort.
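The random 7:3 partition described above can be sketched as follows. This is only an illustration: the patient identifiers, the function name `split_cohort`, and the random seed are hypothetical, and the paper does not report how the randomization was implemented.

```python
import random

def split_cohort(patient_ids, n_train, seed=0):
    """Randomly partition patient IDs into training and validation lists."""
    rng = random.Random(seed)      # seed is an assumption; the paper reports none
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# 103 hypothetical patient identifiers standing in for the study population
patients = [f"P{i:03d}" for i in range(1, 104)]
train_ids, valid_ids = split_cohort(patients, n_train=73)
print(len(train_ids), len(valid_ids))  # 73 30
```

Fixing the seed makes the split reproducible, which matters because the cutoffs and coefficients learned on the training cohort are later applied unchanged to the validation cohort.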

Clinical Endpoints and Follow-Up
The endpoints of this study were disease-free survival (DFS) and OS. DFS was measured from the date of surgery until disease progression, death from any cause, or the last follow-up visit (censored), and nomograms were also built based on DFS. Disease progression, including local recurrence or distant metastasis, was confirmed by clinical examination and imaging methods such as abdominopelvic CT or MRI or was biopsy-proven. OS was defined as the time to death from any cause. All patients were followed up after surgery. Serum alanine aminotransferase (ALT), aspartate aminotransferase (AST), total bilirubin (TBIL), albumin (ALB), and alpha-fetoprotein (AFP) levels were obtained. Liver ultrasound examination was performed monthly during the first 3 months after surgery and once every 3 months thereafter. CT examination of the lungs and enhanced CT or MRI of the liver were performed every 3 months during the first 2 years and once every 6 months thereafter. The minimum follow-up period was 3 days after surgery, while the maximum follow-up time was 92.8 months.

CT Scanning Methods and Parameters
Three-phase enhanced scans of the upper abdomen were obtained using a SOMATOM Definition Flash CT scanner (Siemens, Munich, Germany) and a Discovery CT scanner (GE Healthcare, Chicago, IL, USA). Scans ranged from the top of the liver to the lower edges of both kidneys. Scanning parameters were as follows: voltage, 120 kV; current, 200-350 mA; scanning layer thickness, 5 mm; layer spacing, 5 mm; and matrix, 512 × 512. For contrast-enhanced scanning, a double-barreled high-pressure syringe was used to inject iohexol, containing 350 mg/ml of iodine, via a peripheral vein (flow rate, 3.0 ml/s; dose, 1.5 ml/kg). The delay times for the arterial, venous, and equilibrium phases were 30, 60, and 120 s, respectively.

MRI Scanning Methods and Parameters
MRI scanning was conducted using a 3.0 T Signa HDXT MR superconducting apparatus and an 8-channel body-phase front coil. Rapid volume acquisition Liver Acquisition with Volume Acceleration (LAVA) imaging of the liver was conducted using the following parameters: repetition time (

Tumor Segmentation
The tumor region of interest (ROI) was manually delineated on each phase of the multi-phase CT and MR images by a radiologist with more than 10 years of experience (Reader 1) using ITK-SNAP (version 3.6.0; http://www.itksnap.org). A two-dimensional ROI of the largest section of the tumor was selected, outlined, and saved as an NII file. Two weeks later, Reader 1 delineated the ROI again for 50 randomly selected HCC patients to evaluate the intra-class correlation coefficient of the ROI. Additionally, another radiologist (Reader 2) independently delineated the ROI for the same 50 randomly selected HCC patients to evaluate the inter-class correlation coefficient.

Image Preprocessing and Feature Extraction
Before feature extraction, images were preprocessed to improve discrimination between texture features. To eliminate the batch effect of different equipment, all the data were normalized through z-score standardization to a standard intensity range with a mean value of 0 and SD of 1, and the image slices were resampled to a voxel size of 1 × 1 × 1 cm³. With the use of IBSI-compliant AK software (Analysis Kit Software, version 3.3.0, GE Healthcare), 1,217 radiomics features were extracted from CT and MR images, including first-order statistical features, morphological features, gray-level co-occurrence matrix-based features (GLCM), gray-level run-length matrix-based features (GLRLM), gray-level size zone matrix-based features (GLSZM), gray-level dependence matrix-based features (GLDM), and (Log) Laplace wavelet transform features. Furthermore, intra-class and inter-class correlation coefficients (ICCs) were used to evaluate the intra-observer and inter-observer reproducibility of feature extraction. The intra-class correlation coefficient was calculated by comparing the two ROIs delineated by Reader 1. The inter-class correlation coefficient was calculated by comparing the ROI of Reader 1 with that of Reader 2. A feature was considered to have good consistency when its ICC exceeded 0.75 both within and between observers. Finally, the ICC range for CT (equilibrium, venous, and arterial phases) was 0.175-1, and 917 features with ICC > 0.75 were retained for each phase. The ICC range for MR (equilibrium, venous, and arterial phases) was 0.256-1, and 946 features with ICC > 0.75 were retained.
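The normalization and reproducibility filter described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the paper does not specify which ICC form was used, so a one-way random-effects ICC(1,1) is assumed here; the feature names and values are invented; and the use of the population SD in the z-score is also an assumption.

```python
from statistics import mean, pstdev

def z_score(values):
    """Rescale values to mean 0 and SD 1 (population SD is an assumption)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def icc_oneway(ratings):
    """One-way random-effects ICC(1,1); one tuple of repeated measurements per patient."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    # between-patient and within-patient mean squares
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((v - m) ** 2 for row, m in zip(ratings, row_means) for v in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

def is_reproducible(intra_pairs, inter_pairs, threshold=0.75):
    """Keep a feature only if both intra- and inter-observer ICC exceed the threshold."""
    return icc_oneway(intra_pairs) > threshold and icc_oneway(inter_pairs) > threshold

# Hypothetical feature values: (first, second) delineation per patient
features = {
    "feature_a": {
        "intra": [(1.0, 1.1), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2)],
        "inter": [(1.0, 0.9), (2.0, 2.2), (3.0, 3.1), (4.0, 3.8)],
    },
    "feature_b": {
        "intra": [(1.0, 1.1), (2.0, 2.1), (3.0, 2.9), (4.0, 4.2)],
        "inter": [(1.0, 3.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)],
    },
}
stable = [name for name, f in features.items() if is_reproducible(f["intra"], f["inter"])]
print(stable)  # ['feature_a']
```

Here `feature_b` is dropped because the two readers disagree, even though Reader 1 is self-consistent, which mirrors the requirement that both ICCs exceed 0.75.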

Feature Selection and Model Construction
Features with ICC values > 0.75 both within and between groups were retained for further analysis. In the training set, features with p < 0.05 in univariate Cox regression analysis were retained, and Spearman's correlation analysis and Pearson's correlation analysis were applied to eliminate highly correlated features (correlation coefficient threshold, |r| = 0.8). The least absolute shrinkage and selection operator (LASSO) Cox regression with 10-fold cross-validation was used for further feature screening. Features with non-zero coefficients selected by LASSO analysis were then linearly weighted by those coefficients to calculate a radiomics score (Radscore) for each patient.
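The redundancy-elimination step can be illustrated with a small sketch. It assumes a greedy rule (a feature is kept only if its absolute Pearson correlation with every already-kept feature is below 0.8); the paper does not state which member of a correlated pair was discarded, and the feature names and values below are invented. Spearman's correlation could be handled the same way by applying the function to ranks.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def drop_correlated(features, threshold=0.8):
    """Greedy filter: keep a feature only if |r| < threshold against every kept feature."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical feature columns (one value per patient)
features = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly 2 * f1, so highly correlated with f1
    "f3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(drop_correlated(features))  # ['f1', 'f3']
```

The surviving features would then be passed to the LASSO Cox step, which is not reproduced here.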
The Radscore was the output of the Cox regression radiomics model: for each patient, it was the linear combination of the selected features weighted by their corresponding LASSO coefficients. Patients were then divided into a high-risk group (riskscore = 1) and a low-risk group (riskscore = 0) according to the optimal cutoff value in each model. Kaplan-Meier (KM) analysis was used to plot DFS and OS curves, and the log-rank test was used to evaluate the differences between high-risk and low-risk groups. The same threshold was then applied to the validation cohort. C-index values were used to evaluate the performance of the model.
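As a rough sketch of these steps, the code below computes a Radscore as a coefficient-weighted sum, dichotomizes patients at a cutoff, and evaluates Harrell's C-index. The coefficients, feature values, cutoff, and follow-up data are all hypothetical, and ties in survival time are simply skipped, which is a simplification of the full definition.

```python
def radscore(feature_values, coefficients):
    """Linear combination of selected features weighted by their LASSO coefficients."""
    return sum(coefficients[name] * feature_values[name] for name in coefficients)

def harrell_c_index(times, events, scores):
    """Fraction of comparable pairs in which the higher-scored patient fails earlier."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # a pair is comparable when i has an observed event strictly before j's time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical non-zero LASSO coefficients and standardized feature values
coefficients = {"f1": 0.5, "f3": -0.2}
patients = [
    {"f1": 2.0, "f3": 1.0},
    {"f1": 1.0, "f3": 0.0},
    {"f1": -1.0, "f3": 1.0},
    {"f1": 0.0, "f3": 2.0},
]
times = [5, 12, 20, 30]    # months of follow-up
events = [1, 1, 1, 0]      # 1 = event observed, 0 = censored

scores = [radscore(p, coefficients) for p in patients]
cutoff = 0.0               # stand-in for the optimal cutoff found in the training cohort
risk = [1 if s > cutoff else 0 for s in scores]
c_index = harrell_c_index(times, events, scores)
print(risk, round(c_index, 3))
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the values reported in this paper should be read.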

Nomogram Construction
First, univariate Cox analysis was used to analyze risk factors and screen for features with p < 0.05. Clinical factors with p < 0.05 and the Radscore for CT and MRI data combined (Combined_radscore) were included in the multivariate Cox stepwise regression model to investigate independent predictors of survival in HCC patients. Clinical factors and Combined_radscore with p < 0.05 in the univariate Cox analysis were used to establish a nomogram predicting patients' 2-year, 4-year, and 5-year survival rates. C-index values were used to quantify the discrimination ability of the model, and calibration curves were generated to compare predicted and actual survival rates.
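A nomogram of this kind is a graphical readout of a fitted Cox model. As a hedged sketch in standard Cox-model notation (the symbols below are not taken from the paper), the predicted survival probability at time t for a patient with covariate values x_k can be written as:

```latex
\[
  \mathrm{LP}(x) = \sum_{k} \beta_k x_k ,
  \qquad
  S(t \mid x) = S_0(t)^{\exp\left(\mathrm{LP}(x)\right)} ,
  \qquad t \in \{2, 4, 5\}\ \text{years}
\]
```

Here x_k would denote Combined_radscore and the retained clinical factors, \beta_k the fitted Cox coefficients, and S_0(t) the baseline survival function. Each axis of the nomogram maps one term \beta_k x_k to points, and the total points map to S(t | x); a calibration curve then compares these predicted probabilities with the observed KM estimates.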

Statistical Analysis
All statistical analyses were performed using R3. 5

Patient Characteristics
Patient demographics and clinicopathological features are presented in Table 1

Radiomics Signature Construction
Features retained after each feature dimension reduction step are listed in Supplementary Table S1. For prediction of DFS, 7, 12, and 17 features were selected from CT, MRI, and their combined features, respectively, and used to build models. For prediction of OS, 8, 16, and 17 features were selected from CT, MRI, and their combined features, respectively. Details of the selected features for DFS and OS are provided in Supplementary Figure S1 and Table S2. CT_radscore, MRI_radscore, and Combined_radscore were calculated from the selected features.
We performed univariate Cox analysis to determine the effect of clinical features on DFS in patients with HCC (Table 2). Three clinical characteristics, namely, tumor diameter, liver capsule invasion, and MVI, were identified by univariate analysis (p < 0.05). Clinical features with p < 0.05 were included in backward stepwise multivariate regression analysis, which showed that MVI was an independent predictor of DFS (p < 0.05).
We also performed univariate Cox analysis to determine the effect of clinical characteristics on OS in patients with HCC (Table 3). Six clinical characteristics, namely, body mass index (BMI), tumor diameter, MVI, portal vein tumor thrombosis (PV_TT), platelet count (PLT), and Bleeding_volume, were identified by univariate analysis (p < 0.05). Clinical characteristics with p < 0.05 were included in backward stepwise multivariate regression analysis, which showed that BMI, MVI, and Bleeding_volume were independent predictors of OS (p < 0.05). The clinical models were built based on clinical risk features, and the Clinical_score of each model was calculated.
Combined_radscore and clinical factors were included in univariate Cox regression for analyzing DFS, and factors with p < 0.05 were included in backward stepwise multivariate Cox regression analysis (Table 4). The results show that Radscore and MVI were independent predictors of DFS in the multivariable analysis (p < 0.05). Combined_radscore and clinical factors were likewise included in univariate Cox regression for analyzing OS, and factors with p < 0.05 were included in backward stepwise multivariate Cox regression analysis (Table 5). The results show that Radscore, MVI, PLT, and Bleeding_volume were independent predictors of OS in the multivariable analysis (p < 0.05). The CT+MRI+Clinical model was established based on significant clinical risk features and Radscore, and the CT+MRI+Clinical_score of each model was calculated.
Patients were divided into a high-risk group and a low-risk group according to the optimal cutoff value of each score (CT_radscore, MRI_radscore, Combined_radscore, Clinical_score, and CT+MRI+Clinical_score), and DFS and OS KM curves were plotted. KM curves and log-rank tests for DFS (Figure 1) in the training cohort showed that patients in the low-risk group had significantly better outcomes than those in the high-risk group for every model (all log-rank p < 0.05). We then performed the same analyses in the validation cohort, where each model gave similar results (p < 0.05).
KM curves and log-rank tests for OS (Figure 2) in the training cohort showed that patients in the low-risk group had significantly better outcomes than those in the high-risk group (p < 0.05). We then performed the same analyses in the validation cohort, and similar results were observed.
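The KM estimate and two-group log-rank comparison used for these curves can be sketched as follows. The follow-up times and event indicators are invented for illustration, and the code is a plain implementation of the textbook estimators rather than the software used in the study.

```python
from math import erfc, sqrt

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time."""
    curve, surv = [], 1.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1 - deaths / at_risk
        curve.append((t, surv))
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square with 1 df, p-value)."""
    times, events = times_a + times_b, events_a + events_b
    observed_a = expected_a = variance = 0.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        n_a = sum(1 for u in times_a if u >= t)   # at risk in group A
        n = sum(1 for u in times if u >= t)       # at risk overall
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e == 1)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    return chi2, erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df

# Hypothetical follow-up (months) and event indicators (1 = event, 0 = censored)
high_t, high_e = [5, 8, 12, 15], [1, 1, 1, 1]
low_t, low_e = [20, 30, 40, 50], [1, 0, 1, 0]

km_high = kaplan_meier(high_t, high_e)
chi2, p_value = logrank_test(high_t, high_e, low_t, low_e)
print(km_high[0], p_value < 0.05)  # (5, 0.75) True
```

In this toy example every high-risk patient fails before any low-risk patient, so the curves separate completely; real cohorts overlap far more, which is why the validation-cohort comparison matters.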

Development and Assessment of a Radiomics Nomogram
To provide the clinician with a quantitative method to predict patients' probability of 2-year, 4-year, and 5-year DFS and OS and to demonstrate the incremental value of the radiomics signature for individualized assessment of DFS and OS, radiomics nomograms for both endpoints were built in the training cohort (Figures 3A, B).
Radscore, tumor diameter, liver capsule invasion, and MVI were finally retained to establish a nomogram for DFS prediction (Figure 3A), and BMI, tumor diameter, PV_TT, PLT, Bleeding_volume, and Radscore were retained to establish the prognostic prediction nomogram for OS (Figure 3B). The performance of each model for predicting DFS and OS was evaluated by calculating C-index values (Table 6). In DFS analysis, the CT+MRI+Clinical model showed the best performance in the training cohort (C-index = 0.858; 95% CI, 0.811-0.

DISCUSSION
Previous studies have developed multimodal imaging models using radiomics features derived from MR and CT to predict tumor prognosis (28). To our knowledge, the present study is the first to evaluate DFS and OS in patients with HCC using a comparative analysis of enhanced CT and MRI sequence data. The main challenges faced by multimodal methods are how to judge the confidence of each modality and the correlation between modalities, how to reduce the dimensionality of multimodal feature information, and how to register multimodal data collected asynchronously (13)(14)(15). We compared the advantages of multimodal radiomics models that integrate CT and MRI.
Radiomics has recently received attention in the field of cancer research because it is a high-throughput method used to extract large numbers of radiomics features from standard medical imaging and can improve medical decisions (29). Radiomics is used to extract quantitative feature data that reflect information related to tumor heterogeneity, which is not visible to the human eye. Hence, radiomics can provide a noninvasive, low-cost, and reproducible means to capture tumor phenotypes that may be associated with intra-tumor heterogeneity (30). To date, radiomics has been used in research to explore liver tumors, including numerous studies applied to the diagnosis, prognosis, pathological grading, and MVI of liver cancer (31)(32)(33)(34). Many previous studies have demonstrated the role of radiomics in survival assessment for patients with different types of cancer, including non-small cell lung, breast, and thyroid cancers (35)(36)(37).
We developed a new multimodal radiomics model to compare the value of enhanced CT and MRI sequence data for prognosis prediction in patients with HCC and to compare this with the predictive performance of clinicopathological factors. In this study, we extracted 1,217 features from CT and MR images and finally identified non-zero coefficient features associated with DFS and prognostic features associated with OS by LASSO regression analysis. Specific feature dimension reduction and features screening processes are also shown in the Supplementary Materials. Radscore values were calculated using these features. KM survival analysis methods and log-rank tests were used to evaluate their prognostic value.
In our study, the results of multivariate analyses showed that MVI, Bleeding_volume, and PLT were independent predictors of the prognosis of HCC patients, which was consistent with the results of previous studies (7-10). The CT+MRI+Clinical model outperformed the models comprising clinical features alone, CT alone, MRI alone, or CT and MRI combined, indicating that the multimodal radiomics approach may have greater value in predicting DFS and OS of resected HCC. The multimodal model can provide more abundant information.
In addition, for all KM curves predicting DFS and OS, the low-risk group had significantly longer survival times than the high-risk group (p < 0.05), indicating that Radscore was an independent predictor of prognosis in HCC, and this finding was confirmed in the multivariate Cox proportional hazards model (p < 0.05) for both DFS and OS. Thus, Radscore improves traditional prognostic ability and represents a potentially effective and promising tool for evaluating the prognosis of patients with HCC. This is consistent with the study by Zhao et al. (38). In a prior study, Zhang et al. (28) established single and multimodal logistic models for predicting LVI, with excellent predictive power in training (area under the curve (AUC), 0.884; 95% CI, 0.803-0.964) and validation (AUC, 0.876; 95% CI, 0.721-1.000). Their results are similar to ours, but our model also included clinical factors. Clinical factors were selected into the model by univariate and multivariate Cox analyses, the prediction performance of the various models was compared, and the final models were presented as nomograms, which makes the prognostic analysis more convincing. Our Radscore-based nomograms yielded a better discriminative ability than these traditional methods for predicting prognosis in HCC patients.
Zhou et al. (24,38) extracted radiomics features from arterial and portal phase CT images of 215 HCC patients undergoing partial hepatectomy, screened the imaging features through a LASSO logistic regression model, and constructed a Radscore model. The results showed that inclusion of CT-based radiomics features with routine clinical variables significantly predicted early recurrence (≤1 year) postoperatively and that the diagnostic performance of the model combining radiomics and clinical factors was superior to that of the model with clinical features alone for estimating early recurrence. It seems to be obvious that assessing tumorous disease with single modal radiomics information will not be comprehensive. However, the development of methods and strategies for the integration of information of different dimensions is still in its early stages, and combining prediction models, as performed in the current study, might increase their precision and could be extended to other diagnostic indicators. Further research following this scheme is warranted.
This study has several limitations. First, our study was conducted in a single institution. Although all CT and MR images were obtained using a uniform scanner and standardized imaging acquisition sequences to reduce bias and variance in our results and improve the robustness of the model, further confirmation using patient data from other institutions is needed. Second, the use of manually drawn two-dimensional ROIs is time-consuming and inconvenient for clinical application; hence, the feasibility of automatic or semi-automatic segmentation in radiomics analysis will be the focus of future research. Third, the number of patients in this study is not large because not all HCC patients need to undergo both CT and MR in clinical practice. In addition, performing both CT and MR is relatively expensive, so there are some obstacles to implementation. Finally, our single-center study primarily included patients who had undergone both CT and MR, with a small sample size. We will work with other hospitals to explore the robustness of similar multimodal models in the future.
In conclusion, our results suggest that Radscore is an independent prognostic factor in patients with HCC. Multimodal imaging profiles have great potential to improve individualized assessment of likely prognosis after surgery and may guide the individualized care of patients with HCC.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Affiliated Hospital of Qingdao University. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.