The Roles of Diffusion Kurtosis Imaging and Intravoxel Incoherent Motion Diffusion-Weighted Imaging Parameters in Preoperative Evaluation of Pathological Grades and Microvascular Invasion in Hepatocellular Carcinoma

Background Currently, there are disputes about the parameters of diffusion kurtosis imaging (DKI), intravoxel incoherent motion (IVIM), and diffusion-weighted imaging (DWI) in predicting pathological grades and microvascular invasion (MVI) in hepatocellular carcinoma (HCC). The aim of our study was to investigate and compare the predictive power of DKI and IVIM-DWI parameters for preoperative evaluation of pathological grades and MVI in HCC.

Methods PubMed, Web of Science, and Embase databases were searched for relevant studies published from inception to October 2021. Review Manager 5.3 was used to summarize standardized mean differences (SMDs) of mean kurtosis (MK), mean diffusivity (MD), tissue diffusivity (D), pseudo diffusivity (D*), perfusion fraction (f), mean apparent diffusion coefficient (ADCmean), and minimum apparent diffusion coefficient (ADCmin). Stata 12.0 was used to pool the sensitivity, specificity, and area under the curve (AUC). Overall, 42 up-to-standard studies with 3,807 cases of HCC were included in the meta-analysis.

Results The SMDs of ADCmean, ADCmin, and D values, but not those of D* and f values, significantly differed between well, moderately, and poorly differentiated HCC (P < 0.01). The sensitivity, specificity, and AUC of the MK, D, ADCmean, and ADCmin for preoperative prediction of poorly differentiated HCC were 69%/94%/0.89, 87%/80%/0.89, 82%/75%/0.86, and 83%/64%/0.81, respectively. In addition, the sensitivity, specificity, and AUC of the D and ADCmean for preoperative prediction of well-differentiated HCC were 87%/83%/0.92 and 82%/88%/0.90, respectively. The SMDs of ADCmean, ADCmin, D, MD, and MK values, but not f values, showed significant differences (P < 0.01) between MVI-positive (MVI+) and MVI-negative (MVI-) HCC. The sensitivity and specificity of D and ADCmean for preoperative prediction of MVI+ were 80%/80% and 74%/71%, respectively; the AUC of the D (0.87) was significantly higher than that of ADCmean (0.78) (Z = −2.208, P = 0.027). Sensitivity analysis showed that the results of the above parameters were stable and reliable, and subgroup analysis confirmed a good prediction effect.

Conclusion DKI parameters (MD and MK) and IVIM-DWI parameters (D value, ADCmean, and ADCmin) can be used as a noninvasive and simple preoperative examination method to predict the grade and MVI in HCC. Compared with ADCmean and ADCmin, MD and D values have higher diagnostic efficacy in predicting the grades of HCC, and D value has superior diagnostic efficacy to ADCmean in predicting MVI+ in HCC. However, f value cannot predict the grade or MVI in HCC.


INTRODUCTION
Hepatocellular carcinoma (HCC) is the most common malignant tumor in the world and also one of the main causes of cancer-related death (1). Considering the specific pathogenic mechanism and epidemiological and pathological basis of the occurrence and development of HCC, early diagnosis of HCC is difficult (2). Previous studies (3,4) have indicated that the pathological grade of HCC is closely related to patients' prognosis; specifically, the postoperative survival rate of patients with well- and moderately differentiated HCC is significantly higher than that of patients with poorly differentiated HCC, and the 5-year postoperative recurrence rate of poorly differentiated HCC is as high as 70%. Similarly, several studies (5-7) have suggested that microvascular invasion (MVI) is an independent risk factor for recurrence and metastasis of HCC after treatment and is the most characteristic malignant biological behavior of HCC. Moreover, the postoperative recurrence rate of MVI-positive (MVI+) patients is 4.4 times higher than that of MVI-negative (MVI-) patients (8). For patients with MVI, a larger surgical resection range or ablation zone has to be employed in combination with systemic adjuvant therapy (9).
However, determination of the pathological grade and MVI of HCC mainly depends on postoperative pathological diagnosis, so there is a certain time lag. Therefore, it is extremely important to explore a noninvasive preoperative examination method to predict the pathological grade and MVI in patients with HCC. In recent years, a number of studies have suggested that diffusion kurtosis imaging (DKI) parameters of mean kurtosis (MK) and mean diffusivity (MD) and intravoxel incoherent motion diffusion-weighted imaging (IVIM-DWI) parameters of tissue diffusivity (D), pseudo diffusivity (D*), perfusion fraction (f), mean apparent diffusion coefficient (ADCmean), and minimum apparent diffusion coefficient (ADCmin) could be used for preoperative prediction of the pathological grade or MVI in individuals with HCC. However, it remains controversial whether these parameters can distinguish the HCC pathological grade or MVI before surgery; moreover, the preoperative prediction efficacy reported in previous studies varied, with large differences in each effect measure and small sample sizes.
Therefore, the aim of our meta-analysis was to comprehensively investigate whether DKI or IVIM-DWI parameters could predict the pathological grade or MVI in patients with HCC and to compare the predictive power of these parameters for the diagnosis of pathological grades and MVI+ in individuals with HCC.

Inclusion and Exclusion Criteria
The inclusion criteria were as follows: (a) evaluation of the diagnostic performance of DKI or IVIM or DWI for determining the presence of MVI or tumor grading in individuals with HCC using the MD and/or MK and/or D and/or D* and/or f and/or ADCmean and/or ADCmin parameters; (b) total sample not less than 20 cases; (c) available information regarding the mean/standard deviation or sensitivity/specificity of parameters for diagnosis of HCC grade or MVI; (d) the Edmondson-Steiner (ES) grade of one indicated well differentiated HCC (wdHCC), the ES grade of two indicated moderately differentiated HCC (mdHCC), and the ES grade greater than or equal to three indicated poorly differentiated HCC (pdHCC) (52). Duplicate articles, review articles, experimental animal studies, and case reports, as well as non-English publications, were excluded.

Data Extraction
The study complied with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). The retrieved literature was imported into EndNote X9 (Thomson Reuters, New York, NY, USA). After removing the duplicates, FW, CYY, and CHW extracted the basic characteristics and diagnostic parameters of the included articles in strict accordance with the inclusion and exclusion criteria, and the obtained data were reviewed three times.

Quality Assessment
The Review Manager 5.3 software (The Cochrane Collaboration, 2014) was used to evaluate the quality of the studies, referring to the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) (53). CYY and CHW independently evaluated the risk of bias and the clinical applicability of the studies in terms of patient selection, index tests, reference standards, and flow and timing. When there was a difference in opinions, the two investigators discussed the issue and reached a consensus.

Statistical Processing
The meta-analysis was conducted using Review Manager 5.3 and Stata version 12.0 (StataCorp, College Station, TX, USA). First, heterogeneity was determined by means of the inconsistency index I² (54,55). A random-effects model was used when the I² was above 50% or P was <0.05, which indicated high heterogeneity between the studies; otherwise, a fixed-effects model was applied. Second, Egger's test or Begg's test was used to visually and quantitatively assess the publication bias for the continuous variables, whereas Deeks' test was used to assess the publication bias of the diagnostic studies. Finally, Review Manager 5.3 was used to summarize the standardized mean difference (SMD) and 95% confidence intervals (CIs) of the parameters, and Stata 12.0 was used to pool the sensitivity, specificity, and area under the curve (AUC). The sensitivity analysis and subgroup analysis were used to explore the source of heterogeneity.
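The model-selection rule described above can be illustrated with a minimal Python sketch. It is not the Review Manager or Stata implementation: it assumes Hedges' small-sample correction for the SMD, inverse-variance weights, and a DerSimonian-Laird estimate of between-study variance, and it applies only the I² > 50% criterion (the P-value criterion for Cochran's Q is omitted). The function names `hedges_g` and `pool` are hypothetical.

```python
import math

def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (SMD, variance) for two groups, with Hedges' small-sample correction."""
    pooled_sd = math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2))
    d = (mean_a - mean_b) / pooled_sd
    j = 1 - 3 / (4 * (n_a + n_b) - 9)          # small-sample correction factor
    g = j * d
    # Common large-sample approximation of the variance (an assumption of this sketch)
    var_g = j**2 * ((n_a + n_b) / (n_a * n_b) + d**2 / (2 * (n_a + n_b)))
    return g, var_g

def pool(effects, variances, i2_threshold=50.0):
    """Pool study effects; switch to random effects when I² exceeds the threshold."""
    w = [1 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))   # Cochran's Q
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    if i2 <= i2_threshold:
        return {"model": "fixed", "smd": fixed, "se": math.sqrt(1 / sum(w)), "i2": i2}
    # DerSimonian-Laird between-study variance
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    w_re = [1 / (v + tau2) for v in variances]
    random_smd = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    return {"model": "random", "smd": random_smd, "se": math.sqrt(1 / sum(w_re)), "i2": i2}
```

A 95% CI follows as `smd ± 1.96 * se`; homogeneous inputs keep the fixed-effects estimate, while widely scattered effects trigger the random-effects branch.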

Basic Characteristics of the Study
Finally, 42 up-to-standard studies (10-51) with 3,807 cases of HCC were included. There were 27 studies on grading (2,172 HCCs), 11 studies on MVI (1,220 HCCs), and four studies on grading and MVI (415 HCCs). The literature screening process is shown in Figure 1. The basic characteristics of the included studies are shown in Table 1, and some parameters of diagnostic studies are shown in Supplementary Table S1. Figure 2 shows the quality assessment based on the QUADAS-2 scale. The overall quality of the studies was acceptable. In the patient selection domain, there was an unclear risk of bias in 18 studies because the inclusion and exclusion criteria had not been clearly reported. Eleven studies had an unclear concern, and one study had a high concern due to different inspection methods. In the index test domain, there was an unclear risk of bias in 18 studies because the information about blinding test had not been provided. Similarly, 23 studies had no information about blinding to the index test in the reference standard domain. Meanwhile, three studies had a high risk of bias in the flow and timing domain.

Role of the Mean Apparent Diffusion Coefficient in the Evaluation of Grade/Microvascular Invasion in Hepatocellular Carcinoma
In 26 studies (n = 2,504), ADCmean was used to distinguish between HCC grades. There was high heterogeneity (I² > 75%), so we used the random-effects model. As shown in the forest plot in Figures 3A-C, ADCmean positively correlated with the differentiation degree of HCC (P < 0.05). Egger's test suggested no publication bias (P = 0.238, P = 0.777, P = 0.699). Similarly, 15 studies (n = 1,752) reported that ADCmean was used for detecting MVI. There was no significant heterogeneity (I² = 45%), so the fixed-effects model was used. Figure 3D shows that ADCmean of MVI- HCC was significantly higher than that of MVI+ HCC (P < 0.01). Egger's test suggested no publication bias (P = 0.958).

Role of the Minimum Apparent Diffusion Coefficient in the Evaluation of Grade/Microvascular Invasion in Hepatocellular Carcinoma
In five studies (n = 586), ADCmin was used for distinguishing grades. The studies (wdHCC vs. mdHCC, wdHCC vs. pdHCC) showed no significant heterogeneity (I² = 0%), and the fixed-effects model was used. In contrast, the studies of mdHCC vs. pdHCC showed high heterogeneity (I² = 53%), so the random-effects model was applied. As shown in Figures 4A-C, the ADCmin positively correlated with the differentiation degree of HCC (P < 0.01). Egger's test suggested no publication bias (P = 0.981, P = 0.644, P = 0.614). Similarly, four studies (n = 672) reported that ADCmin was used for distinguishing MVI. These four studies had high heterogeneity (I² = 79%), and the random-effects model was used. Figure 4D indicates that the ADCmin of MVI- HCC was significantly higher than that of MVI+ HCC (P < 0.01). Egger's test suggested no publication bias (P = 0.699).

Role of the Tissue Diffusivity Values in the Evaluation of Grade/Microvascular Invasion in Hepatocellular Carcinoma
In seven studies (n = 711), D was used for distinguishing grades. The studies had high heterogeneity (I² > 75%), and the random-effects model was used. Figures 5A-C show that D positively correlated with the differentiation degree of HCC (P < 0.05). Egger's test (wdHCC vs. mdHCC, wdHCC vs. pdHCC) suggested no publication bias (P = 0.389, P = 0.232), and Begg's test of mdHCC vs. pdHCC suggested no publication bias (P = 0.283). Four studies (n = 672) reported that D was used for distinguishing MVI; they did not show significant heterogeneity (I² = 22%), so the fixed-effects model was used. As shown in Figure 5D, the D value of MVI- HCC was significantly higher than that of MVI+ HCC (P < 0.01). Egger's test suggested no publication bias (P = 0.652).

Role of the Pseudo Diffusivity Values in the Evaluation of Grade/Microvascular Invasion in Hepatocellular Carcinoma
In six studies (n = 593), D* was used for distinguishing grades. The studies (wdHCC vs. mdHCC, wdHCC vs. pdHCC) had no significant heterogeneity (I² < 50%), so the fixed-effects model was used. The studies of mdHCC vs. pdHCC showed high heterogeneity (I² = 65%), so the random-effects model was applied. As shown in Figures 6A-C, there was no significant difference for pathology grading in HCC (P > 0.05). Egger's test suggested no publication bias (P = 0.510, P = 0.325, P = 0.062). Three studies (n = 227) reported that D* was used for distinguishing MVI; there was no significant heterogeneity (I² = 0%), so we used the fixed-effects model. Figure 6D shows that D* of MVI- HCC was higher than that of MVI+ HCC (P < 0.05). Egger's test suggested no publication bias (P = 0.560).

Role of the Perfusion Fraction Values in the Evaluation of Grade/Microvascular Invasion in Hepatocellular Carcinoma
In six studies (n = 593), f was used for distinguishing grades. The studies had high heterogeneity (I² > 75%), so we used the random-effects model. As shown in Figures 7A-C, there was no significant difference for pathology grading in HCC (P > 0.05). Egger's test suggested no publication bias (P = 0.713, P = 0.100, P = 0.967). Three studies (n = 227) reported that f was used for distinguishing MVI. They had no significant heterogeneity (I² = 0%), so the fixed-effects model was used. As shown in Figure 7D, f did not distinguish MVI+ HCC from MVI- HCC.

Role of the Mean Diffusivity Values in the Evaluation of Grade/Microvascular Invasion in Hepatocellular Carcinoma
In three studies (n = 388), MD was used for distinguishing grades. There was no significant heterogeneity (I² = 0%), so we used the fixed-effects model. Figure 8A shows that the MD value of pdHCC was significantly lower than that of non-pdHCC (P < 0.01). Egger's test suggested no publication bias (P = 0.582). Two studies (n = 258) reported that MD was used for distinguishing MVI; they did not show significant heterogeneity (I² = 0%), and the fixed-effects model was used. Figure 8B shows that the MD of MVI- HCC was significantly higher than that of MVI+ HCC (P < 0.01). Egger's test suggested no publication bias (P = 0.870).

Role of the Mean Kurtosis Values in the Evaluation of Grade/Microvascular Invasion in Hepatocellular Carcinoma
In three studies (n = 388), the MK was used for distinguishing grades. There was highly significant heterogeneity (I² > 75%), so we used the random-effects model. Figure 9A shows that the MK value of non-pdHCC was significantly lower than that of pdHCC (P < 0.01). Begg's test suggested no publication bias (P = 0.308).
Two studies (n = 258) reported that the MK was used to distinguish MVI. These studies did not show significant heterogeneity (I² = 0%), so the fixed-effects model was used. Figure 9B shows that the MK of MVI- HCC was significantly lower than that of MVI+ HCC (P < 0.01). Egger's test suggested no publication bias (P = 0.179).

Sensitivity Analysis of the Parameters for Distinguishing Microvascular Invasion in Hepatocellular Carcinoma
First, the SMDs of each parameter for distinguishing MVI changed little when the analysis was switched between the random-effects model and the fixed-effects model. Moreover, after excluding each study one by one, the results of the sensitivity analysis (Supplementary Figures S1A-G) suggested that the studies of ADCmean, D value, D* value, f value, MD value, and MK value, but not ADCmin value, were stable and reliable to identify MVI- HCC vs. MVI+ HCC. After removing the study by Kim et al. (37), the result of ADCmin in discriminating MVI- vs. MVI+ HCC was stable and reliable (SMD = 0.87, P < 0.00001, Supplementary Figure S2). The I² decreased from 79% to 1%, which suggested that the excluded study was likely the source of heterogeneity.
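The leave-one-out procedure described above can be sketched as follows. This is an illustrative Python example under simplifying assumptions (inverse-variance fixed-effects pooling, I² derived from Cochran's Q); the effect sizes are invented to show how a single outlying study can drive heterogeneity and are not the data of the included studies. The function names are hypothetical.

```python
def pooled_and_i2(effects, variances):
    """Inverse-variance pooled effect and I² (in percent) for a set of studies."""
    w = [1 / v for v in variances]
    pooled = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - pooled) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, i2

def leave_one_out(effects, variances):
    """Re-pool after dropping each study in turn; returns (dropped index, pooled, I²)."""
    results = []
    for k in range(len(effects)):
        rest_e = effects[:k] + effects[k + 1:]
        rest_v = variances[:k] + variances[k + 1:]
        pooled, i2 = pooled_and_i2(rest_e, rest_v)
        results.append((k, pooled, i2))
    return results

# Hypothetical SMDs: three consistent studies and one outlier (index 3)
smds = [0.85, 0.90, 0.88, -0.10]
variances = [0.03, 0.03, 0.03, 0.03]
for dropped, pooled, i2 in leave_one_out(smds, variances):
    print(f"without study {dropped}: SMD = {pooled:.2f}, I² = {i2:.0f}%")
```

A study whose removal sharply lowers I² while the remaining pooled estimate stays stable is a plausible source of heterogeneity, which is the reasoning applied to the ADCmin analysis above.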

Sensitivity Analysis of the Parameters for Distinguishing Grades in Hepatocellular Carcinoma
After excluding each study one by one, the results of the sensitivity analysis (Supplementary Figures S3A-C to S7A-C) suggested that [...] (Figures S9, S10). The heterogeneity was lower than before, which suggested that these studies were likely the source of heterogeneity.

Diagnostic Performance
The pooled sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and [...] Table 3. Interestingly, after grouping by subgroup (study design, sample size, machine type, number of b values, and maximum b value), the heterogeneity of the sensitivity and specificity decreased to varying degrees, suggesting that these subgroup factors might have been the source of heterogeneity. In addition, after grouping by maximum b value (≤800) and sample size (≤90), the AUC of the ADCmean for the diagnosis of pdHCC increased from 0.86 to 0.93, and the AUC for the diagnosis of MVI+ HCC increased from 0.78 to 0.81. Overall, each subgroup analysis had a good prediction effect.
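For reference, the per-study quantities named above follow directly from a 2 × 2 table of true/false positives and negatives. The Python sketch below shows only these standard definitions; it does not reproduce the pooling across studies performed in Stata, and the counts in the usage example are hypothetical, chosen to give 80% sensitivity and 80% specificity. The function name `diagnostic_metrics` is an assumption of this sketch.

```python
import math

def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a single 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Likelihood ratios; infinite when the denominator is zero
    plr = sensitivity / (1 - specificity) if specificity < 1 else math.inf
    nlr = (1 - sensitivity) / specificity if specificity > 0 else math.inf
    # Diagnostic odds ratio = PLR / NLR = (TP * TN) / (FP * FN)
    dor = (tp * tn) / (fp * fn) if fp * fn > 0 else math.inf
    return {"sensitivity": sensitivity, "specificity": specificity,
            "plr": plr, "nlr": nlr, "dor": dor}

# Hypothetical counts: 100 diseased and 100 non-diseased lesions
metrics = diagnostic_metrics(tp=80, fp=20, fn=20, tn=80)
print(metrics)
```

A PLR well above 1 and an NLR well below 1 both indicate a useful test; the DOR combines the two into a single number.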

DISCUSSION
Hepatectomy and liver transplantation are currently the preferred treatment methods for HCC. Due to the invasive nature of surgery and the limited availability of organ transplantation, it is extremely important to determine the possibility of postoperative recovery and recurrence rate in patients before surgery. The HCC pathological grade and MVI are independent risk factors for recurrence and metastasis after hepatectomy or liver transplantation (56,57). Therefore, preoperative prediction of pathological grade or MVI in HCC is crucial. The DKI is based on the non-Gaussian distribution model, which can better and more accurately reflect the subtle changes of tissue microstructure (58). IVIM adopts a multi-b-value scan and double exponential model fitting, which can more accurately reflect the diffusion of water molecules in tissues and microvascular blood perfusion, thereby better reflecting the heterogeneity of tumors (59). However, there are controversies as to whether the parameters of DKI and IVIM-DWI can be employed in the preoperative distinguishing of pathological grades and MVI in individuals with HCC. Therefore, 42 original studies were strictly included in this analysis to expand the sample size, and they were objectively and comprehensively evaluated to determine the diagnostic value of the DKI and IVIM-DWI parameters.
Based on SMDs, we showed that there were significant differences in the MK, MD, D, ADCmean, and ADCmin for preoperative prediction of the pathological grade or MVI in individuals with HCC. The D, ADCmean, and ADCmin positively correlated with the degree of differentiation of HCC. However, these findings are inconsistent with the conclusion of the meta-analysis by Surov et al. (60) that the ADCmean could not predict pathological grade and MVI in HCC. The reason may be that we included new studies (33-35, 38, 49, 50) and expanded the sample size. Moreover, various combination methods contributed to the differences. Surov et al. (60) combined the means of grades 1, 2, and 3 and MVI+/- and then compared whether there was an overlap between the combined means. In contrast, the SMDs were used as the effective index to distinguish well-, moderately, and poorly differentiated HCC and MVI+/- in our study.
Similarly, the MK and MD could be used for preoperative distinguishing between pdHCC and non-pdHCC and between MVI+ and MVI-, with significant differences. The SMDs and 95% CIs were significantly away from the 0 reference line, which suggested that the MK and MD values were of great value in the identification of grades/MVI in HCC. The MK and MD values were the most representative parameters in DKI, which were able to reflect the complexity of tumor tissue microstructure and had potential correlation with tumor invasive biological behavior (38). Compared with non-pdHCC, pdHCC had greater heteromorphism, and the proliferation capacity of cancer tissues was more vigorous, which led to complex tissue structure and non-Gaussian distribution of the water molecule movement, thereby resulting in a higher MK value and a lower MD value. Interestingly, some studies (22,25,34) have suggested that the D* or f values could predict HCC pathological grades, while other studies (18,35,39) did not confirm such conclusions. Our study suggested that there was no significant benefit of D* or f values in predicting HCC pathological grades. The reason may be that the D* value is mainly related to microcirculation blood flow velocity; thus, this can lead to inaccurate measurements under subjective dynamics. In addition, the D* value could not truthfully reflect the real value of cancer focus because the D* value is easily affected by the changes of machine signal and noise. Similarly, the f value indicates the microcirculation perfusion fraction, and the repeatability of measurement is poor because the microcirculation blood flow is dynamic at all times.
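The signal models behind the parameters discussed above can be written out in a short Python sketch. It uses the commonly cited forms of the IVIM biexponential model, S(b) = S0·[f·exp(−b·D*) + (1 − f)·exp(−b·D)], and the DKI model, S(b) = S0·exp(−b·MD + b²·MD²·MK/6); whether every included study fitted exactly these forms is an assumption. The parameter values and function names are hypothetical and serve only to illustrate that a monoexponential ADC computed from b = 0 absorbs the perfusion component and therefore exceeds D.

```python
import math

def ivim_signal(b, s0, f, d_star, d):
    """Biexponential IVIM signal: perfusion (f, D*) plus tissue diffusion (D)."""
    return s0 * (f * math.exp(-b * d_star) + (1 - f) * math.exp(-b * d))

def dki_signal(b, s0, md, mk):
    """DKI signal: monoexponential decay with a kurtosis (non-Gaussian) correction."""
    return s0 * math.exp(-b * md + (b ** 2) * (md ** 2) * mk / 6)

def adc_two_point(b_low, s_low, b_high, s_high):
    """Monoexponential ADC estimated from two b values."""
    return math.log(s_low / s_high) / (b_high - b_low)

# Hypothetical tissue: f = 0.2, D* = 50e-3 mm²/s, D = 1.0e-3 mm²/s
f, d_star, d = 0.2, 50e-3, 1.0e-3
s_0 = ivim_signal(0, 1.0, f, d_star, d)
s_800 = ivim_signal(800, 1.0, f, d_star, d)
adc = adc_two_point(0, s_0, 800, s_800)
print(f"D = {d:.2e} mm²/s, ADC(0, 800) = {adc:.2e} mm²/s")
```

With MK = 0 the DKI expression reduces to the ordinary monoexponential decay, so MK quantifies the departure from Gaussian diffusion.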
Importantly, our study suggested that the MK, D, ADCmean, and ADCmin had a higher diagnostic efficacy to predict pdHCC. Compared with the ADCmean and ADCmin, the D value had higher sensitivity, specificity, and AUC. Similarly, the AUC of the pdHCC predicted by the MK value was 0.89, which was higher than that predicted by the ADCmean and ADCmin, and the specificity was as high as 94%. The reason might be that the MK value is based on a non-Gaussian model; thus, it could reflect the diffusion characteristics of water molecules in vivo as a whole and could more truly reflect the movement state of water molecules in the lesion. Compared with the meta-analysis of Yang et al. (52), our study newly suggested that the D value had excellent diagnostic efficacy in predicting wdHCC, with a sensitivity of 87%, specificity of 83%, and AUC of 0.92; moreover, our study subdivided the ADC value into the mean and minimum ADC value on the basis of expanding the sample size, thereby making the combined results more reliable.
Furthermore, compared with the ADCmean, our study suggested that the D value had higher sensitivity (80%) and specificity (80%) in predicting MVI+ HCC, and the summary AUC of the D value was significantly higher than that of the ADCmean (Z = −2.208, P = 0.027), indicating that the D value was better and more sensitive in predicting MVI+ HCC. The reason might be that the ADC value ignores the influence of microcirculation perfusion in the cancer focus; thus, the D value is more realistic than the ADC value, given that the D value distinguishes the diffusion of pure water molecules and microcirculation perfusion in the tissue by changes in the b value (61).
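As a check on the reported comparison, a Z statistic can be converted to a two-sided P value with the standard normal distribution. The Python sketch below does this and also shows one common way to form such a Z statistic from two AUCs, assuming independent estimates with known standard errors; the standard errors in the usage example are hypothetical, because they are not reported in the text, and the function names are assumptions of this sketch.

```python
import math

def two_sided_p(z):
    """Two-sided P value of a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def compare_auc(auc_a, se_a, auc_b, se_b):
    """Z test for the difference of two AUCs, assuming independent estimates."""
    z = (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return z, two_sided_p(z)

# Reported statistic for ADCmean vs. D in predicting MVI+ HCC
print(round(two_sided_p(-2.208), 3))

# Hypothetical standard errors, for illustration only
z, p = compare_auc(0.78, 0.03, 0.87, 0.04)
print(f"Z = {z:.2f}, P = {p:.3f}")
```

The first call reproduces the reported P = 0.027 from Z = −2.208.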
Our study comprehensively and systematically evaluated the power of the DKI, IVIM, and DWI parameters for preoperative prediction of the pathological grade and MVI in HCC. The quality of the included studies was acceptable, and there was no publication bias in the studies according to Egger's or Begg's test. Moreover, we performed the subgroup analysis of the ADCmean value for the diagnosis of MVI+ HCC and pdHCC. Interestingly, after grouping by maximum b value (≤800) and sample size (≤90), the AUC of the ADCmean for the diagnosis of pdHCC increased from 0.86 to 0.93, and the AUC for the diagnosis of MVI+ HCC increased from 0.78 to 0.81. Overall, each subgroup analysis had a good prediction effect. However, our study had some limitations. First, most studies were retrospective, which increased the risk of confounding bias to a certain extent. Second, the sample size of the MK, MD, D*, and f values was not large enough. Therefore, further prospective studies with a larger sample size are needed to confirm our results. Finally, most studies were conducted in Asia, which introduced a certain regional bias.

CONCLUSION
Our meta-analysis showed that the DKI parameters (MD and MK) and the IVIM-DWI parameters (D value, ADCmean, and ADCmin) can be used as a noninvasive and simple preoperative examination method to predict the pathological grade and MVI in HCC. Compared with the ADCmean and ADCmin, the MD and D values showed a higher diagnostic efficacy in predicting the grades of HCC, and the D value had superior diagnostic efficacy to the ADCmean in predicting MVI+ in HCC. However, f values cannot be used as an effective parameter to predict the grades and MVI in HCC. These findings may be helpful for clinical treatment planning, preoperative prognosis evaluation, and follow-up research.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Material.

ACKNOWLEDGMENTS
We thank LetPub (www.letpub.com) for its linguistic assistance during the preparation of this article.