A Computed Tomography Nomogram for Assessing the Malignancy Risk of Focal Liver Lesions in Patients With Cirrhosis: A Preliminary Study

Purpose The detection and characterization of focal liver lesions (FLLs) in patients with cirrhosis is challenging. Accurate information about FLLs is key to their management, which can range from conservative methods to surgical excision. We sought to develop a nomogram that incorporates clinical risk factors, blood indicators, and enhanced computed tomography (CT) imaging findings to predict the nature of FLLs in cirrhotic livers. Method A total of 348 surgically confirmed FLLs were included. CT findings and clinical data were assessed. All factors with P < 0.05 in univariate analysis were included in multivariate analysis. Receiver operating characteristic (ROC) analysis was performed, and a nomogram was constructed based on the multivariate logistic regression analysis results. Results The FLLs were either benign (n = 79) or malignant (n = 269). Logistic regression was used to identify independent factors associated with malignancy. α-Fetoprotein (AFP) (OR = 10.547), arterial phase hyperenhancement (APHE) (OR = 740.876), washout (OR = 0.028), satellite lesions (OR = 15.164), ascites (OR = 156.241), and nodule-in-nodule architecture (OR = 27.401) were independent predictors of malignancy. The combined predictors had excellent performance in differentiating benign and malignant lesions, with an area under the curve (AUC) of 0.959, a sensitivity of 95.24%, and a specificity of 87.5% in the training cohort and an AUC of 0.981, a sensitivity of 94.74%, and a specificity of 93.33% in the test cohort. The C-index was 96.80%, and calibration curves showed good agreement between the nomogram predictions and the actual data. Conclusions The nomogram showed excellent discrimination and calibration for malignancy risk prediction, and it may aid in making treatment decisions for FLLs.


INTRODUCTION
It is challenging for abdominal radiologists to detect and characterize focal liver lesions (FLLs) in patients with cirrhosis (1)(2)(3). Cirrhosis is a major risk factor for hepatocellular carcinoma (HCC) (4). The diagnosis of benign and malignant FLLs in cirrhotic individuals is important. However, in cirrhotic livers, these lesions may lack typical imaging features (1). Therefore, the final diagnosis may need to be verified by tissue sampling (1,5). Accurate descriptions of FLLs guide their management, which ranges from conservative treatment to surgical excision (6). Proper identification can prevent unnecessary biopsy and allow the appropriate treatment to be selected (5,7).
Computed tomography (CT) is commonly used to diagnose and manage patients with chronic liver disease (8)(9)(10). Dynamic CT has high sensitivity and specificity for diagnosing FLLs (11). The wash-in and washout of contrast agents can assist in distinguishing HCC from other FLLs (12). Clinical and laboratory risk factors, such as age and sex, are also helpful. However, many cases require further imaging or histopathological examination to confirm the diagnosis (5,10). Therefore, it is helpful to develop a scoring system combining clinical information and imaging results to evaluate the malignancy of FLLs.
A nomogram is a graphical statistical tool that combines variables into a continuous scoring system to calculate precise risk probabilities for specific individual outcomes (13)(14)(15). This instrument is an important tool in modern medical decision-making in oncology (16)(17)(18)(19)(20)(21), with applications such as differentiating focal nodular hyperplasia (15) or hepatocellular adenoma (16) from HCC in noncirrhotic patients, predicting microvascular invasion and liver failure after hepatectomy in patients with HCC (22)(23)(24)(25), and evaluating efficacy in intrahepatic cholangiocarcinoma after hepatectomy (26). CT-based nomograms may be used to predict the nature of FLLs, aiding clinicians in selecting the best management plan. We sought to develop a nomogram that incorporates clinical risk factors, blood indicators, and enhanced CT imaging findings to predict the nature of FLLs in cirrhotic livers.

Patients
Our institutional review board approved this retrospective single-center study with a waiver for the requirement to obtain informed consent. The subjects had to have liver cirrhosis of any etiology, surgically confirmed after CT scans between January 2017 and December 2020.
The inclusion criteria were as follows: (a) serological markers, such as serum total bilirubin, plasma albumin, prothrombin time, blood platelet count and α-fetoprotein (AFP), measured simultaneously before surgery; (b) confirmation by pathology; (c) no history of preoperative anticancer therapy, including transcatheter arterial chemoembolization (TACE), percutaneous radiofrequency ablation (PRFA), or percutaneous ethanol injection (PEI); and (d) multidetector CT imaging. The exclusion criteria were as follows: (a) no pathological confirmation; (b) treatment before imaging or surgery; (c) incomplete serological markers before surgery; and (d) poor image quality. For patients with multiple lesions, we analyzed the largest lesion. A total of 348 surgically confirmed FLLs in 348 patients were included in the study. The 295 patients diagnosed from January 2017 to June 2020 formed the training cohort, and the 53 patients diagnosed from July 2020 to December 2020 formed the test cohort.
Age, sex, and basic patient information were collected. Routine examinations included serum total bilirubin, total plasma protein, prothrombin time, blood platelet counts, tumor marker AFP levels, and hepatitis (B and C) results. The Child-Pugh classification was determined for each patient based on the above variables. AFP > 20.0 ng/ml was the threshold for positivity (27).

CT Technique
The area from the diaphragm to the pubic symphysis was examined on a multidetector CT scanner (Aquilion 16, Toshiba Medical Systems Corporation; Tochigi ken, Japan; Brilliance 64, Philips, Netherlands; Aquilion ONE TSX-301A, Toshiba Medical Systems Corporation; Tochigi ken, Japan) with plain and dynamic contrast-enhanced scans as follows: tube voltage of 120 kVp, tube current of 200 mA, slice thickness of 5 mm, and rotation time of 0.5 seconds. The helical pitch was 0.9, the field of view was 35 to 40 cm, the matrix was 512 × 512, and a standard reconstruction algorithm was used. After plain CT scan, the patients received 80-100 mL of contrast agent (iodipamide, 370 mg I/mL, Bracco) at a rate of 3.5-4.0 mL/s, followed by 20 mL of saline solution through the elbow vein using a power injector. Scans in the arterial phase (AP, 35 seconds), portal venous phase (PVP, 70 seconds), and equilibrium phase (EP, 3 minutes) were obtained.

CT Imaging Analysis
Two radiologists (7 and 13 years of abdominal diagnostic experience) independently evaluated the CT images from the Picture Archiving and Communication System. They knew the purpose and design of the study, but they were unaware of the patient demographics, clinical history, clinical reports, and reference criteria. For each lesion, the readers assessed the presence of imaging features based mainly on the Liver Imaging Reporting and Data System (LI-RADS) version 2018 (28,29), together with some features reported in the literature (30,31) or commonly used in reports. The included features were as follows: tumor size (maximum cross-sectional diameter, Dmax), non-rim arterial phase hyperenhancement (APHE), non-peripheral washout in the PVP (washout), enhancing capsule in either the portal venous or delayed phase (capsule appearance detected as an enhancing rim), blood product in mass (bleeding within or around the lesion without surgery, trauma or intervention), fat (excess fat in the whole or part of the mass relative to the background liver), necrosis (areas within the tumor without obvious enhancement), infiltrative appearance (invasion), mural nodules (peripheral nodules within the lesion attached to the tumor wall), satellite lesions (nodules in the surrounding parenchyma that resemble the main lesion), halo enhancement (solar enhancement in the parenchyma around the lesion), peritumoral enhancement (rim arterial phase hyperenhancement), vein tumor thrombus (VTT, definite enhancement of soft tissue in the portal vein), delayed enhancement (progressive enhancement in the center of the lesion), internal artery (small vessels seen in the arterial phase), nonenhancing "capsule" (capsule appearance not detected as an enhancing rim), mosaic architecture (random distribution of internal nodules or compartments, often with different imaging features), nodule-in-nodule architecture (a smaller inner nodule within a larger outer nodule, with different imaging features), corona enhancement, lymph node enlargement (short diameter > 10 mm) and ascites (perihepatic fluid of water density).

Statistical Analysis
Statistical analysis was performed using IBM SPSS Statistics version 25 (SPSS Inc., Chicago, IL, USA) and R version 3.3.4 (www.R-project.org). The Mann-Whitney U test was used for continuous variables, and the χ² or Fisher exact test was used for categorical variables. To test the consistency between the two readers, the Kendall correlation coefficient was used for measurement indexes, and the kappa consistency test was used for counting indexes. The κ values were considered poor for a κ of 0.01 to 0.20; fair for a κ of 0.21 to 0.40; moderate for a κ of 0.41 to 0.60; good for a κ of 0.61 to 0.80; and excellent for a κ of 0.81 to 1.00. Univariate analysis was used to compare clinical risk factors, blood markers, and CT findings between the benign and malignant cohorts, and multivariate logistic regression analysis was used to establish the model, with the significant variables from the univariate analysis as the input. The odds ratio (OR) was used as a relative risk estimate for each risk factor and is presented with its corresponding 95% confidence interval (CI). After establishing the combined predictor, receiver operating characteristic (ROC) analysis was performed to calculate the area under the curve (AUC), sensitivity and specificity. A nomogram was developed by scaling the regression coefficients of the multivariate logistic regression model to a scale of 0-100 points. Factors that were significant predictors of malignancy in the multivariate analysis were included in the nomogram. The total score is the sum of the points for each independent variable and is converted to a predicted probability. The nomogram's performance was measured by the concordance index (C-index), and bootstrap resampling with 1,000 samples was used to reduce overfitting bias (32). A calibration curve was plotted to compare the actual observations with the nomogram predictions of the benignity or malignancy of lesions.
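The scaling of regression coefficients to nomogram points can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the analysis code of this study: the coefficients and intercept are hypothetical, all predictors are treated as binary, and the simple proportional scaling shown here may differ from the conventions of the R package used by the authors.

```python
import math

def nomogram_points(coefficients):
    """Rescale logistic regression coefficients (log odds ratios) to points.

    The predictor with the largest absolute coefficient receives 100 points
    when present; the other binary predictors are scaled proportionally.
    """
    largest = max(abs(beta) for beta in coefficients.values())
    return {name: 100.0 * beta / largest for name, beta in coefficients.items()}

def total_points(points, findings):
    """Sum the points of the predictors that are present (True)."""
    return sum(points[name] for name, present in findings.items() if present)

def predicted_probability(intercept, coefficients, findings):
    """Probability from the logistic model for the same binary findings."""
    linear = intercept + sum(
        coefficients[name] for name, present in findings.items() if present
    )
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients and intercept, NOT the fitted values of this study.
betas = {"feature_a": 2.0, "feature_b": 4.0, "feature_c": -1.0}
intercept = -1.0

points = nomogram_points(betas)
lesion = {"feature_a": True, "feature_b": False, "feature_c": True}
lesion_points = total_points(points, lesion)
lesion_probability = predicted_probability(intercept, betas, lesion)
print(points, lesion_points, lesion_probability)
```

Because the points are a linear rescaling of the coefficients, the total score and the predicted probability carry the same ranking information; the nomogram's probability axis simply applies the logistic transformation to the total.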
Decision curve analysis (DCA) was used to evaluate the net clinical benefit of a diagnostic procedure that includes the established nomogram (33). All statistical tests were two-sided, and a P-value < 0.05 was considered statistically significant.
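For readers unfamiliar with DCA, the net benefit at a given threshold probability is commonly computed as the true-positive rate minus the false-positive rate weighted by the odds of the threshold. The Python sketch below shows this standard formula on hypothetical data; the probabilities and labels are invented for illustration and are not taken from this study.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of acting on lesions with predicted probability >= threshold.

    labels: 1 = malignant, 0 = benign.
    """
    n = len(labels)
    true_pos = sum(
        1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1
    )
    false_pos = sum(
        1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0
    )
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(labels, threshold):
    """Reference strategy: every lesion is managed as malignant."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical predicted probabilities and outcomes (illustration only).
probs = [0.9, 0.8, 0.3, 0.6]
truth = [1, 1, 0, 0]

nb_model = net_benefit(probs, truth, threshold=0.5)
nb_all = net_benefit_treat_all(truth, threshold=0.5)
print(nb_model, nb_all)
```

A decision curve plots these quantities over a range of thresholds; the model is useful at a threshold where its net benefit exceeds both the treat-all strategy and the treat-none strategy (net benefit of zero).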

Interobserver Agreement
Measurements of lesion size (Dmax) showed good consistency between observers (P > 0.05). The κ values for the counting indexes were greater than 0.75, indicating good consistency between observers.
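As an illustration of how agreement on a binary imaging feature can be quantified, the following Python sketch computes Cohen's kappa for two readers and maps it to the agreement bands stated in the Statistical Analysis section. The reader ratings are hypothetical, and two conventions are assumptions of this sketch: the upper bound of each band is treated as inclusive, and values at or below 0.20 (including non-positive values) are labeled poor.

```python
def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers rating the same lesions."""
    n = len(reader_a)
    categories = set(reader_a) | set(reader_b)
    observed = sum(1 for a, b in zip(reader_a, reader_b) if a == b) / n
    expected = sum(
        (reader_a.count(c) / n) * (reader_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        # Both readers used a single identical category for every lesion.
        return 1.0
    return (observed - expected) / (1.0 - expected)

def agreement_label(kappa):
    """Map kappa to the bands used in this paper (bounds assumed inclusive)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "excellent"

# Hypothetical ratings of one feature (1 = present, 0 = absent).
reader_1 = [1, 1, 1, 1, 0, 0, 0, 0]
reader_2 = [1, 1, 1, 0, 0, 0, 0, 0]

kappa = cohen_kappa(reader_1, reader_2)
label = agreement_label(kappa)
print(kappa, label)
```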

Malignancy Risk and the Prediction Nomogram
Based on the independent factors, we established a nomogram with a corresponding scoring system using the rms package in R. With a 50% probability cutoff point, a total score of more than 80 points indicated a malignant FLL, and the C-index was 96.80%. Moreover, calibration curves showed good agreement between the nomogram predictions and the actual data (Figures 4A, B). The DCA results are shown in Figures 5A, B.
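For a binary outcome, the C-index is the proportion of malignant-benign lesion pairs in which the malignant lesion receives the higher score, with ties counted as one half. The Python sketch below computes this apparent C-index on hypothetical scores; it does not include the bootstrap correction described in the Statistical Analysis section, and the scores are invented for illustration.

```python
def c_index(scores, labels):
    """Concordance index for a binary outcome (1 = malignant, 0 = benign)."""
    malignant = [s for s, y in zip(scores, labels) if y == 1]
    benign = [s for s, y in zip(scores, labels) if y == 0]
    concordant = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                concordant += 1.0   # malignant lesion ranked higher
            elif m == b:
                concordant += 0.5   # tie counts as half
    return concordant / (len(malignant) * len(benign))

# Hypothetical nomogram scores and pathology results (illustration only).
scores = [132, 95, 60, 85, 40]
pathology = [1, 1, 1, 0, 0]

apparent_c = c_index(scores, pathology)
print(apparent_c)
```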

DISCUSSION
In this study, we established a nomogram based on CT imaging findings for predicting the malignancy of FLLs. The results indicate good discrimination between benign and malignant lesions.
Triphasic CT is an effective tool for differentiating benign and malignant FLLs, with reported sensitivity, specificity, positive predictive value, negative predictive value, and diagnostic accuracy of 100%, 80%, 94.5%, 100% and 95.5%, respectively (34). Diffusion-weighted imaging (DWI) may provide additional information for differentiating HCC from nodules with abnormal hyperplasia by detecting the movement of freely diffusible water molecules. A parameter of a DWI model, the distributed diffusion coefficient, has reference value for characterizing FLLs and shows good diagnostic performance (35).
In the univariate analysis, most factors showed significant differences between the benign and malignant cohorts. Male patients with higher serum total bilirubin, higher AFP levels and lower blood platelet counts may be prone to malignant lesions. On CT, the presence of the following features indicates a high possibility of malignancy: APHE, washout, enhancing capsule, blood product in mass, necrosis, infiltrative appearance, mural nodules, satellite lesions, VTT, internal artery, ascites, non-enhancing capsule, mosaic architecture, and nodule-in-nodule architecture. In the multivariate analysis, AFP, APHE, washout, satellite lesions, ascites, and nodule-in-nodule architecture were independent predictors of malignancy. A previous study reported that capsular enhancement is an important imaging biomarker for predicting high-grade HCC, whereas a non-enhancing capsule is not significantly associated with high-grade HCC (36). However, in our cohort, capsular enhancement was not an independent predictor of malignancy. The reason may be that some benign lesions, such as abscesses, also showed capsular enhancement. APHE and washout are major features of HCC, while nodule-in-nodule architecture is an ancillary feature of HCC (37,38). Nodule-in-nodule architecture, defined as a small nodule within the lesion, is an independent predictor of microvascular invasion (MVI) of HCC (36) and proved to be an independent predictor of malignancy in our study. HCC accounted for the majority of malignant lesions, and these three features were independent predictors for differentiating benign from malignant FLLs in this study. A diagnostic model including AFP, sex, age and prothrombin time (ASAP model) has been shown to accurately predict the development of HCC in patients at high risk of hepatitis B virus. The ASAP model performed well in both the test and validation groups (39).
In our study, AFP was an independent predictor of malignancy, whereas sex, serum total bilirubin, blood platelet counts, and pathogenesis were not, although there were statistically significant differences in these factors between the benign and malignant cohorts. The order of AUC values suggested that the combined predictors were the strongest in the diagnosis of malignant lesions (sensitivity = 95.24%, specificity = 87.50%). This model also had high diagnostic sensitivity (94.74%) and specificity (93.33%) in the test group. We established a nomogram with a corresponding scoring system. With a cutoff point of 50%, it can accurately determine whether an FLL is malignant. An example of image scoring is illustrated (Figures 6A-C). This lesion demonstrates no arterial phase hyperenhancement (APHE, 0 points, Figure 6A), presence of washout (minus 41 points, Figure 6B), no satellite lesions (0 points), ascites (68 points, Figure 6C) and nodule-in-nodule architecture (54 points, Figure 6C), with AFP > 20.0 ng/ml (51 points). The nomogram score is therefore 51 (AFP) + 0 (APHE) - 41 (washout) + 0 (satellite lesions) + 68 (ascites) + 54 (nodule-in-nodule architecture) = 132, indicating malignancy because the score exceeds 80 points. Washout is one of the main features in the diagnosis of HCC, but in our study, it was negatively associated with malignancy. One possible reason is that some benign lesions, such as adenomas, can show washout, while some HCC lesions lack washout. In addition, washout is a purely visual criterion, which may result in observer-dependent bias. According to the literature, quantitative washout assessment in LI-RADS has the opposite effect in HCC diagnosis and needs to be redefined (40).
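The scoring of the example lesion can be checked by summing the points of the features that are present. The Python sketch below does this with the point values quoted in the worked example; the values used for APHE (100) and satellite lesions (30) when present are assumed from the published point assignments and do not affect this example because both features are absent. The dictionary keys and function names are illustrative only.

```python
# Points per feature when present, as quoted in the worked example.
# APHE (100) and satellite lesions (30) are assumed values for presence.
POINTS = {
    "afp_positive": 51,       # AFP > 20.0 ng/ml
    "aphe": 100,
    "washout": -41,
    "satellite_lesions": 30,
    "ascites": 68,
    "nodule_in_nodule": 54,
}
MALIGNANCY_CUTOFF = 80  # a score above 80 points indicates malignancy

def nomogram_score(findings):
    """Sum the points of all features recorded as present (True)."""
    return sum(POINTS[name] for name, present in findings.items() if present)

def is_malignant(score):
    """Apply the 80-point cutoff corresponding to a 50% probability."""
    return score > MALIGNANCY_CUTOFF

# Findings of the example lesion (Figures 6A-C).
example_lesion = {
    "afp_positive": True,
    "aphe": False,
    "washout": True,
    "satellite_lesions": False,
    "ascites": True,
    "nodule_in_nodule": True,
}

example_score = nomogram_score(example_lesion)
print(example_score, is_malignant(example_score))
```

Only features that are present contribute to the total, so the absent APHE and satellite lesions add nothing and the score is 51 - 41 + 68 + 54 = 132.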
In this study, the C-index (96.80%) and calibration curve demonstrated that our nomogram was accurate in predicting the malignancy of lesions. However, there are some limitations to our model. First, this was a small, single-center, retrospective study that lacked an external validation group, which may affect the scoring system's performance. Therefore, multicenter and large-scale studies are necessary to improve the scoring system, and a prospective study is needed to confirm its reliability. Second, cirrhosis is likely to increase the possibility of HCC, which may cause selection bias. Third, the number of rare liver lesions in our study was small. Assessing more data can make the model more generalizable.

CONCLUSION
Based on AFP and CT findings including APHE, washout, satellite lesions, ascites, and nodule-in-nodule architecture, we developed an objective scoring system to predict the risk of malignancy. This model may aid in making informed treatment decisions for FLLs. A large-scale, prospective validation study is needed to assess the broad applicability of the nomogram.