Ultrasound-Based Radiomics Analysis for Preoperatively Predicting Different Histopathological Subtypes of Primary Liver Cancer

Background: Preoperative identification of hepatocellular carcinoma (HCC), combined hepatocellular-cholangiocarcinoma (cHCC-ICC), and intrahepatic cholangiocarcinoma (ICC) is essential for treatment decision making. We aimed to use ultrasound-based radiomics analysis to non-invasively distinguish histopathological subtypes of primary liver cancer (PLC) before surgery.
Methods: We retrospectively analyzed ultrasound images of 668 PLC patients, comprising 531 HCC patients, 48 cHCC-ICC patients, and 89 ICC patients. The tumor boundary was manually delineated on the largest imaging slice of the ultrasound image using ITK-SNAP software (version 3.8.0), and high-throughput radiomics features were then extracted from the resulting region of interest (ROI) of the tumor. Combinations of different dimension-reduction technologies and machine learning approaches were used to identify important features and develop the radiomics models. The overall performance of the radiomics models was evaluated by the area under the receiver operating characteristic curve (AUC).
Results: After digital processing of the tumor ultrasound images, 5,234 high-throughput radiomics features were obtained. We used Spearman correlation plus least absolute shrinkage and selection operator (LASSO) regression for feature selection and logistic regression for modeling to develop the HCC-vs-non-HCC radiomics model (composed of 16 features). Spearman correlation, a statistical test, and random forest were used for feature selection, and logistic regression was applied for modeling to develop the ICC-vs-cHCC-ICC radiomics model (composed of 19 features). The overall performance of the radiomics models in identifying different histopathological types of PLC was moderate, with AUC values of 0.854 (training cohort) and 0.775 (test cohort) for the HCC-vs-non-HCC radiomics model and 0.920 (training cohort) and 0.728 (test cohort) for the ICC-vs-cHCC-ICC radiomics model.
Conclusion: Ultrasound-based radiomics models can help distinguish histopathological subtypes of PLC and support clinical decision making for the accurate diagnosis and treatment of PLC.

INTRODUCTION
Primary liver cancer (PLC) is one of the most common and lethal tumors and is estimated to rank fifth in cancer mortality among men and seventh among women. In recent years, the incidence of PLC has continued to increase, rising faster than that of other cancers (1,2). PLC can be classified according to its histological origin. A tumor that contains only cancerous hepatocytes is defined as hepatocellular carcinoma (HCC), a tumor that contains only cancerous bile duct cells is defined as intrahepatic cholangiocarcinoma (ICC), and a tumor that contains a mixture of HCC and ICC is defined as combined hepatocellular-cholangiocarcinoma (cHCC-ICC) (3,4).
cHCC-ICC is a relatively rare subtype of PLC with a variably reported incidence between 0.4 and 14.2%, and its overall prognosis is worse than that of either HCC or ICC alone (5,6). Studies have revealed that among patients with PLC undergoing liver resection, the survival outcome of cHCC-ICC patients is worse than that of HCC patients and similar to or worse than that of ICC patients (7). HCC patients who meet the Milan criteria are indicated for liver transplantation, and their transplantation outcomes are excellent (8). However, increasing evidence indicates that the prognosis for cHCC-ICC patients undergoing liver transplantation is worse than that of patients with HCC alone, and cHCC-ICC is regarded as a relative contraindication for liver transplantation (9-11). Considering the scarcity of donor livers available for transplantation and the poor prognosis of cHCC-ICC, the correct identification of different PLC subtypes before surgery is a necessary condition for the reasonable selection of candidates for liver transplantation and liver resection, and it can improve overall survival outcomes (12,13). PLC is often diagnosed at an advanced stage, and many patients do not qualify for curative treatment; systemic treatments that are effective for either HCC or ICC alone appear to be ineffective for cHCC-ICC (5). Therefore, precise preoperative diagnosis that distinguishes cHCC-ICC from HCC and ICC is important for patient management, since different PLC subtypes may lead to different treatment decisions.
Due to the high heterogeneity in the proportion and existing forms of the two tumor components, the imaging manifestations of cHCC-ICC lack specificity. At present, most cases of cHCC-ICC are misdiagnosed as simple HCC or ICC. Theodora et al. showed that the Liver Imaging Reporting and Data System (LI-RADS), a common method for the qualitative diagnosis of liver tumors, may misdiagnose 54.1% of cHCC-ICC lesions as HCC when applied to contrast-enhanced ultrasound (CEUS) of the liver (14). In contrast-enhanced imaging, cHCC-ICC has imaging patterns that overlap with those of HCC and ICC. The predominant tissue in the tumor largely determines the main imaging features, making it difficult to distinguish cHCC-ICC from HCC and ICC (15). Moreover, although most tumors can be diagnosed with core needle biopsy before surgery, the varying proportions of ICC and HCC within cHCC-ICC and sampling error mean that even histological biopsy may lead to preoperative misdiagnosis of cHCC-ICC as HCC or ICC (16). Therefore, although accurate preoperative diagnosis of the three subtypes of PLC is important, it remains difficult.
Radiomics, a concept that has emerged in recent years, uses computers to extract a large amount of quantitative image information that is not visible to the naked eye, builds models from the extracted tumor features, and further mines and analyzes the image data to assist doctors in diagnosis (17). Through the radiomics approach, the features that can be identified by the human eye and those extracted by computers complement each other; in addition, radiomics combined with currently effective clinical evaluation indicators can improve the accuracy of medical diagnosis (18,19). Tumor features vary with tumor morphology and biological behavior. Radiomics, as a method of deeply mining high-dimensional image features, can capture the characteristics of tumors more comprehensively, providing a feasible new method for identifying different tumors. Rafael et al. extracted 2D and 3D texture features from T1-weighted MR images of 67 brain metastases and established a radiomics model using a random forest method. This model was helpful in distinguishing the primary tumors of brain metastases (breast cancer, lung cancer, and melanoma) (20). In the research by Yin et al., a radiomics model based on MR images effectively differentiated sacral tumors, supporting preoperative identification of chordoma, giant cell tumor, and metastatic tumor (21).
Currently, the diagnosis of cHCC-ICC is usually based on postoperative pathology. To our knowledge, no ultrasound-based radiomics study evaluating the three PLC subtypes has been reported. Among imaging examinations for liver disease, ultrasound has the advantages of no radiation, real-time observation, and simplicity. An ultrasound-based radiomics approach may therefore provide additional information for identifying the three types of PLC. In this study, an ultrasound-based machine learning method was used to extract radiomics features and develop radiomics models to identify different pathological types of PLC.

Study Population
This study was approved by the Ethics Committee of the First Affiliated Hospital of Guangxi Medical University. We retrospectively reviewed the medical records of patients diagnosed with PLC after surgery in the First Affiliated Hospital of Guangxi Medical University from January 2017 to September 2019.
The inclusion criteria were as follows: (1) the lesions were primary liver tumors; (2) the target nodule was confirmed by surgical pathology; (3) liver ultrasound examination was performed within 14 days before resection; and (4) the target lesions were displayed clearly on the ultrasound images. The exclusion criteria were as follows: (1) anticancer treatment before surgery; (2) poor image quality; and (3) incomplete clinical data.
Finally, 668 eligible patients (544 male/124 female; mean age, 50.5 ± 11.4 years; age range, 22-79 years) were enrolled (Figure 1). The pathological tissue of the lesions was obtained by surgical hepatic resection for pathological diagnosis to determine the histological classification of PLC, of which there were 531 HCC patients, 89 ICC patients, and 48 cHCC-ICC patients.

Patient Clinical Pathological Parameters
Basic patient information was collected including data on gender, age, tumor size, cirrhosis, hepatitis, and serum tumor markers. Serological data included carbohydrate antigen 19-9 (CA19-9), alpha fetoprotein (AFP), and carcinoembryonic antigen (CEA) levels. These data were measured within 2 weeks before surgery.
We also collected patient pathological information, including tumor differentiation, microvascular invasion (MVI), TNM stage, and immunohistochemical information on Ki67, p53, and vascular endothelial growth factor (VEGF). MVI was defined as the microscopic observation of a nest of cancer cells in a vessel lined by endothelial cells. In this study, the TNM staging of PLC patients was analyzed according to the American Joint Committee on Cancer (AJCC) eighth edition staging system (22,23).

Radiomics Analysis
The radiomics workflow mainly included the following steps: tumor segmentation, data preprocessing and feature selection, modeling, and evaluation (Figure 2). In the training cohort, we combined different dimension-reduction technologies and machine learning approaches to establish radiomics models. The test cohort was then used to evaluate the generalization performance of the models. Aloka EZU-MT28-S1 ultrasound diagnostic instruments (Aloka, Japan, abdominal probe, 2-6 MHz) were used to collect images. We conducted a retrospective review of the image data and selected two-dimensional ultrasound images in digital imaging and communications in medicine (DICOM) format that clearly showed the largest cross section of each lesion. We imported the images into the ITK-SNAP software (version 3.8.0) to manually draw the tumor boundary and determine the tumor region of interest (ROI) (Figure 3). A radiologist with 15 years of experience in ultrasound diagnosis delineated the ROIs for all tumors under the supervision of a radiologist with over 20 years of experience in ultrasound diagnosis.

Feature Extraction and Data Preprocessing
Intelligence Foundry software (GE Healthcare, version 1.3) was used for radiomics analysis. Since the images were collected by different ultrasound equipment and the feature vectors had a wide range, we preprocessed the data before modeling to improve the accuracy of the calculation. Preprocessing included alignment of data across ultrasound system suppliers, replacement of missing values with the median, and data normalization.
We used 256 as the bin size to discretize the gray values of the images and used the ComBat method to standardize the radiomics features. The ComBat method has previously been used in radiomics studies involving different PET or MRI protocols (24,25). The wavelet features were obtained by applying the wavelet transform to the original gray-value image (including HLH, LLL, and HHL, with eight local matrices); energy, skewness, and a series of other parameters were extracted from the resulting wavelet transform matrices. In the same way, the shearlet transform and the Gabor transform were also carried out, with different step lengths used to obtain multiple sets of intermediate transformation matrices. Based on the above transformations, the radiomics parameters were extracted, and we finally obtained 5,234 high-throughput features. The types of features included the following: first-order features (energy, mean, skewness, kurtosis, etc.), shape features (minor axis length, major axis length, elongation, etc.), wavelet features, and textural features [gray level co-occurrence matrix (GLCM) features, gray level run length matrix (GLRLM) features, etc.] (Supplementary Part A). The features extracted by the Intelligence Foundry software (GE Healthcare, version 1.3) were computed with algorithms provided by the pyradiomics package, which calculates radiomics features in accordance with the feature definitions described in the 2016 version of the image biomarker standardization initiative (IBSI) (26,27). The median was used to fill in missing feature values and to replace abnormal values. Z-score normalization was used to convert different data to the same order of magnitude, and the calculation formula was as follows: $z = (x - \mu)/\sigma$, where $x$ is the original feature value, $\mu$ is the mean, and $\sigma$ is the standard deviation. The PLC patients were labeled according to histological type and assigned to different layers (strata). In the HCC-vs-non-HCC model, the non-HCC label was "0," and the HCC label was "1."
In the ICC-vs-cHCC-ICC model, the cHCC-ICC label was "0," and the ICC label was "1." Then, PLC patients with different histological types were divided at a 7:3 ratio (training cohort : test cohort) within each layer using stratified sampling. The training cohort was used to build the model, and the test cohort served as an independent hold-out set to evaluate the model established on the training cohort.
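The normalization and the stratified 7:3 split described above can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the authors' pipeline: the function names `z_score` and `stratified_split`, the use of the population standard deviation, the rounding rule, and the random seed are all assumptions made here. Only the cohort sizes (531 HCC, 137 non-HCC) and the label coding come from the text.

```python
import random
import statistics


def z_score(values):
    """Standardize one feature column: z = (x - mean) / SD."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population SD (assumption)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(v - mu) / sigma for v in values]


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices so each class is divided at roughly the same ratio."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# HCC = 1, non-HCC = 0 (label coding and cohort sizes from the text)
labels = [1] * 531 + [0] * 137
train_idx, test_idx = stratified_split(labels)
```

In practice the mean and standard deviation would be estimated on the training cohort only and then applied to the test cohort, so that no information from the hold-out set leaks into model building.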

Feature Selection
We obtained 5,234 high-throughput radiomics features and normalized the quantitative expression values of the radiomics features using the Z-score method. Considering that highly correlated and redundant features in the data may affect the classification performance of the model, we calculated the Spearman correlation coefficient between features. An absolute correlation coefficient close to 1 between two variables indicated a strong monotonic relationship between them, so that one of the two variables could be used in place of the other. In this study, highly correlated features were removed with a threshold of 0.95 (HCC vs. non-HCC) and 0.75 (ICC vs. cHCC-ICC). Then, we used statistical tests to screen for features that differed significantly between groups.
Finally, we used four dimension-reduction technologies to further process the features retained above. The dimension-reduction technologies included random forest, max-relevance and min-redundancy (mRMR), logistic regression, and support vector machine recursive feature elimination (SVM-RFE) (Supplementary Part B).
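The correlation filter described above can be illustrated with a small standard-library sketch. The greedy rule used here (keep a feature only if its absolute Spearman coefficient with every already-kept feature does not exceed the threshold) is an assumption; the text does not state which member of a correlated pair was retained. The feature names and values are toy data.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman coefficient = Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def drop_correlated(features, threshold):
    """Greedy filter over a dict {feature name: list of values}."""
    kept = []
    for name in features:
        if all(abs(spearman(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


toy = {
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [1.0, 4.0, 9.0, 16.0, 25.0],  # monotone in f1, so rho = 1
    "f3": [2.0, 1.0, 4.0, 3.0, 5.0],    # rho with f1 is 0.8
}
kept_095 = drop_correlated(toy, threshold=0.95)
kept_075 = drop_correlated(toy, threshold=0.75)
```

With the looser 0.95 threshold only the perfectly monotone duplicate `f2` is removed, whereas the stricter 0.75 threshold also removes `f3`. This shows how the lower threshold used for the ICC-vs-cHCC-ICC model prunes more aggressively.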

Modeling and Evaluation
The final selected radiomics features were imported into the classifier to build models for evaluating the three histopathological types of PLC. Ten machine learning approaches were used in this study: decision tree, naïve Bayes, k-nearest neighbor (KNN), logistic regression, support vector machine (SVM), bagging, random forest, extremely randomized trees, AdaBoost, and gradient boosting tree (Supplementary Part B).
We extracted 5,234 features from the ultrasound images. We quantified the discriminative ability of the radiomics models by calculating the area under the receiver operating characteristic curve (AUC). We constructed models by separately combining each of the above four dimension-reduction technologies with each of the above 10 machine learning approaches and chose the combination with the highest AUC to build the optimal radiomics model. In the training cohort, we used 10-fold cross-validation to avoid overfitting the classifier.
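As a minimal sketch of the two evaluation ingredients named above, the following standard-library Python computes the AUC from its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case) and generates 10-fold cross-validation index splits. The function names, the random seed, and the toy scores are assumptions for illustration; the actual study used dedicated software.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def k_fold(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, validation


# Toy example: four cases with predicted probabilities of the positive class
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
splits = list(k_fold(100, k=10))
```

For each selector-classifier combination, the model would be fitted on the training part of every split and scored with `auc` on the corresponding validation part; the combination with the highest mean cross-validated AUC would then be retained.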
We performed a receiver operating characteristic (ROC) curve analysis and calculated the accuracy and precision. We also used the confusion matrix as a measure of the quality of the machine learning approaches to verify whether the prediction results were consistent with the actual results. The confusion matrix is a useful tool for evaluating the classification ability of radiomics models (28).
In the HCC-vs-non-HCC and ICC-vs-cHCC-ICC radiomics models, we performed univariate and multivariate logistic regression analyses to analyze the relevant factors of different pathological types of PLC. Univariate analysis factors with P-values less than 0.1 were further analyzed by multivariate logistic regression analysis. In multivariate analysis, a P-value less than 0.05 was considered significant.

Statistical Analysis
R software (version 3.6.0) and SPSS software (version 22.0) were used for statistical analysis. For quantitative data with a normal distribution, the independent-samples t-test was used to compare two groups, analysis of variance was used to compare several independent groups, and variables were summarized as the mean ± standard deviation (SD). For quantitative data with a skewed distribution, the Mann-Whitney U test was used to compare two independent samples, the Kruskal-Wallis H test was used to compare several independent samples, and variables were summarized as the median (Q1-Q3). Qualitative data were compared using chi-square tests, with variables described as percentages. P-values below 0.05 were considered statistically significant. In the R software (version 3.6.3), the "pheatmap" package was used to draw heat maps of features.

Clinicopathological Data of PLC Patients
A total of 668 PLC patients were included in this research (Figure 1). The clinicopathological parameters of the training and test cohorts are shown in Table 1. There were no significant differences in the distribution of clinicopathological features between the two cohorts, including gender, age, tumor size, hepatitis, cirrhosis, serum tumor markers, pathological subtype, immunohistochemistry, and tumor stage. These results support the rationality of our partition into training and test cohorts.

Identification of the Radiomics Signature
In the HCC-vs-non-HCC group, we used the LASSO regression method for dimension reduction and the logistic regression method for modeling. In the ICC-vs-cHCC-ICC group, we used the random forest method for dimension reduction, with feature selection at a threshold value of 1.25 times the mean value, and the logistic regression method for modeling. Finally, we identified 16 and 19 optimal radiomics features for the HCC-vs-non-HCC model and the ICC-vs-cHCC-ICC model, respectively (Table 2). Figure 4 shows the heat maps of the 16 features (HCC-vs-non-HCC model) and 19 features (ICC-vs-cHCC-ICC model) of the final radiomics models.
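The thresholding step for the random forest selection can be sketched as follows. This assumes that "1.25 times the mean value" refers to the mean of the random forest feature importances, which the text does not state explicitly. The function name and the importance values below are hypothetical toy data, not results from the study.

```python
def select_by_importance(importances, factor=1.25):
    """Keep features whose importance is at least factor x the mean importance."""
    mean_importance = sum(importances.values()) / len(importances)
    threshold = factor * mean_importance
    return [name for name, value in importances.items() if value >= threshold]


# Hypothetical importances for four features (toy values, not study data)
toy_importances = {"feat_a": 0.4, "feat_b": 0.3, "feat_c": 0.2, "feat_d": 0.1}
selected = select_by_importance(toy_importances)
```

Here the mean importance is 0.25, so the threshold is about 0.3125 and only `feat_a` is kept. Raising the factor above 1 keeps only features that are clearly more informative than average.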

Radiomics Model Assessment
The results showed that the radiomics models we built had a moderate overall classification performance for identifying the three subtypes of PLC. The AUC values in the training cohort and test cohort were 0.854 and 0.775 (HCC vs. non-HCC) and 0.920 and 0.728 (ICC vs. cHCC-ICC), respectively (Figures 5A,B). The confusion matrices are shown in Figures 5C,D. In the HCC-vs-non-HCC model, of the 160 actual HCC patients, 155 were correctly predicted to be HCC. In the ICC-vs-cHCC-ICC model, of the 15 patients with actual cHCC-ICC, 6 were predicted to be cHCC-ICC, and of the 27 actual ICC patients, 22 were correctly predicted to be ICC. These results indicated that the radiomics models could moderately distinguish the three histological types of PLC and performed best at HCC identification. Tables 3 and 4 show the results of the univariate and multivariate logistic regression analyses for the HCC-vs-non-HCC and ICC-vs-cHCC-ICC radiomics models. In the HCC-vs-non-HCC radiomics model, gender, hepatitis, AFP, CA19-9, CEA, stage, and radiomics score were independent factors related to HCC (P < 0.05). In the ICC-vs-cHCC-ICC radiomics model, AFP and radiomics score were independent factors related to cHCC-ICC (P < 0.05).
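The confusion-matrix counts reported above can be converted into the usual summary metrics with simple arithmetic. The sketch below treats ICC as the positive class (label "1"), as in the labeling scheme of this study; the function name is introduced here for illustration, and the derived metrics are not values reported by the authors.

```python
def binary_metrics(tp, fn, tn, fp):
    """Summary metrics from the four cells of a 2x2 confusion matrix."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# ICC-vs-cHCC-ICC test cohort: 27 actual ICC (22 correct), 15 actual cHCC-ICC (6 correct)
icc_metrics = binary_metrics(tp=22, fn=27 - 22, tn=6, fp=15 - 6)

# HCC-vs-non-HCC test cohort: 155 of the 160 actual HCC patients were correctly predicted
hcc_sensitivity = 155 / 160
```

From these counts, the ICC-vs-cHCC-ICC model has a sensitivity for ICC of 22/27 (about 0.81) but a specificity of only 6/15 (0.40), meaning most cHCC-ICC cases were labeled as ICC. The HCC sensitivity of 155/160 (about 0.97) is consistent with the statement that the models performed best at HCC identification.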

DISCUSSION
In this research, as far as we know, we are the first to develop ultrasound-based radiomics models that can be used to predict HCC, ICC, and cHCC-ICC. The radiomics models achieved good diagnostic efficiency in both the training cohort and the test cohort and are expected to help doctors improve the accuracy of presurgical diagnosis and guide the further treatment of PLC patients.
[Table note: Values are shown as the number of patients (percentage) unless otherwise explained. Radiomics score data are shown as median (Q1-Q3). AFP, alpha fetoprotein; CA19-9, carbohydrate antigen 19-9; CEA, carcinoembryonic antigen; VEGF, vascular endothelial growth factor.]
Another highlight of this study is that we constructed the optimal models by testing a variety of combinations of dimension-reduction technologies and classifiers. Shiri et al. found that the performance of machine learning models depends on the type of data or application and that there is no single algorithm or model that performs best in general (29). Different combinations of feature selection methods and classifiers can provide different results (30-32). In the current study, we applied different dimension-reduction technologies and machine learning approaches to find the optimal models to predict HCC vs. non-HCC and ICC vs. cHCC-ICC. Therefore, the models that we obtained more fully reflect the potential of radiomics-based differential diagnosis of PLC in the current clinical environment.
In current clinical practice, physicians rely on clinical symptoms, serum tumor markers, and imaging tests to determine the type of PLC before surgery, but these findings can overlap between subtypes and therefore sometimes lead to incorrect diagnoses. In addition, due to the high heterogeneity in the proportion and existing forms of the two tumor components, the imaging findings of cHCC-ICC currently lack specificity, and most cases are misdiagnosed as simple HCC or ICC. Preoperative differentiation of PLC subtypes has important clinical significance, as different types are associated with different treatment options and prognoses. Improving the accuracy of the initial diagnosis can provide more optimized and active treatment for cHCC-ICC patients (16). In addition, clinical medicine is currently moving toward precision and personalized medicine. In this setting, medical imaging, as an important diagnostic tool, is also rapidly evolving and playing an increasingly important role (33). Radiomics, which provides a non-invasive method to assess lesions and performs well in the diagnosis and prediction of tumors, is widely considered to be a step in the evolution of imaging toward personalized cancer management (34,35).
So far, only a few studies have attempted to identify the three tissue types of PLC by imaging methods, and most previous studies have been based on CT and MR images. Wang et al. previously attempted to use preoperative CT and MR imaging to differentiate cHCC-ICC from HCC and ICC. The study found that, compared with ICC and cHCC-ICC, HCC had a significantly higher incidence of pseudocapsule. Rim enhancement, abnormal perfusion, capsular retraction, and biliary dilatation were more common in ICC than in HCC and cHCC-ICC. However, in that study, the number of features obtained from images was small, and the imaging features, such as tumor size, were all visible to the naked eye; the approach failed to identify and analyze microscopic image features with potential value for clinical diagnosis (36). Lewis et al. used MR images of 65 liver cancer patients. The tumor characteristics and LI-RADS classification were evaluated by two independent observers. For the two independent observers, the combined AUC of sex, LI-RADS, and apparent diffusion coefficient [...] not distinguish between ICC and cHCC-ICC (37). Compared with CT/MRI, ultrasound examination has the advantages of simplicity and real-time observation, and it plays a vital role in the diagnosis and treatment of liver tumors. However, no ultrasound-based radiomics study has sought to identify HCC, cHCC-ICC, and ICC. In view of this knowledge gap, we used ultrasound images to establish radiomics models to distinguish the three pathological classifications of PLC, and we obtained promising results.
Our results showed that the radiomics models we built have a good overall AUC and could accurately predict pure HCC, while achieving lower accuracy for cHCC-ICC. Our findings are roughly consistent with the results of some previous studies suggesting that identifying cHCC-ICC among PLC remains challenging, possibly due to the greater histological heterogeneity of cHCC-ICC. Wang et al. studied the CT and MR images of 136 patients with PLC and found that capsular retraction, abnormal perfusion, and rim enhancement performed better in differentiating HCC and ICC, while the ability to distinguish cHCC-ICC from the other two types of PLC was not significant (36). Many imaging features of cHCC-ICC, such as shape, size, edge, position, and enhancement pattern, mostly resemble those of ICC or HCC, creating difficulties in its diagnosis (38).
We finally used the LASSO and random forest methods for feature selection. LASSO regression, also called L1-regularized linear regression, is a popular method used in radiomics research. The basic idea of LASSO is to minimize the residual sum of squares under the constraint that the sum of the absolute values of the regression coefficients is less than a constant, so that some regression coefficients become exactly equal to 0 and an interpretable model is obtained. Essentially, it is a process of seeking a sparse expression of the model (39,40). Random forest is an ensemble learning algorithm based on decision trees and performs well in classification and regression. Random forest can also be used as a feature selection technique, and it has been widely used in machine learning to determine the importance of features during model training (28,41,42).
[Table note: In univariate analysis, variables with P < 0.1 were included in multivariate logistic regression analysis. In multivariate analysis, P < 0.05 was considered significant. AFP, alpha fetoprotein; CA19-9, carbohydrate antigen 19-9; CEA, carcinoembryonic antigen; VEGF, vascular endothelial growth factor. * represents P < 0.0001.]
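The verbal description of LASSO above corresponds to the following standard constrained least-squares problem. The notation is introduced here for illustration and does not appear in the original text: $y_i$ is the outcome of patient $i$, $x_{ij}$ is the value of feature $j$ for patient $i$, $\beta_0$ and $\beta_j$ are the intercept and regression coefficients, and $t$ is the constant bounding the coefficients.

$$
\hat{\beta} = \arg\min_{\beta_0,\,\beta} \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t
$$

An equivalent penalized form replaces the constraint with a penalty weight $\lambda \ge 0$, where larger values of $\lambda$ correspond to smaller values of $t$ and therefore to more coefficients being set exactly to 0:

$$
\hat{\beta} = \arg\min_{\beta_0,\,\beta} \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|
$$

The features whose coefficients remain non-zero at the chosen $\lambda$ are the ones retained for modeling.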
The texture features showed high importance in our prediction model. Image texture is a visual feature that reflects homogeneous phenomena in the image, and it reflects the surface structure organization and arrangement properties of the object with slow or periodic changes. The texture can be layered by the statistical order of the information encoded in the image, which can be divided into first-order texture features, second-order texture features, and high-order texture features (43). Texture features are widely recognized as quantitative biomarkers of tumor heterogeneity (44,45).
The large sample size of our study helped to improve the generality and stability of our results. However, our research also has certain limitations. First, all ultrasound imaging data were from a single center, and the study was retrospective in nature. The grayscale ultrasound images used in our study were collected by different commercial ultrasound systems. Although the data extracted from the images were preprocessed, imaging on different instruments may still have influenced the results of feature extraction, so whether the model will perform well prospectively remains an open question. Therefore, it is necessary to conduct a multicenter prospective study with rigorous control of ultrasound machines to further explore the diagnostic potential of radiomics-based modeling. Second, our study included only PLC and did not include benign and metastatic tumors of the liver. Distinguishing among more types of tumors is more challenging. We will add data for other types of liver tumors in future studies to improve the generalizability and clinical value of the model. Third, considering how ultrasound is generally used in clinical practice and the retrospective nature of this study, we adopted two-dimensional ultrasound images. However, the quantitative features extracted from two-dimensional ultrasound images cannot represent the whole lesion, and a more precise radiomics analysis depends on the acquisition of 3D images. Further research on three-dimensional ultrasound radiomics is necessary in the future. Fourth, our study focused on the relationship between high-throughput imaging features extracted from the tumor ROI and pathological typing. In order to quantify the heterogeneity of tumors more comprehensively, it is necessary to pay more attention to peritumoral information and to combine more clinicopathological information to establish a more accurate individualized disease assessment model. Therefore, in the future, we need to optimize our model based on the above limitations and carry out prospective studies, which may help to improve the discrimination performance of the radiomics models for PLC.
In summary, we developed and validated ultrasound-based radiomics models to distinguish different histopathological types of PLC, thus providing a new approach for doctors to non-invasively identify HCC, cHCC-ICC, and ICC.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusion of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.