Intravoxel Incoherent Motion Diffusion-Weighted Imaging for Quantitative Differentiation of Breast Tumors: A Meta-Analysis

Objectives: The diagnostic performance of intravoxel incoherent motion diffusion-weighted imaging (IVIM-DWI) in the differential diagnosis of breast tumors remains debatable among published studies. Therefore, this meta-analysis aimed to pool relevant evidence regarding the diagnostic performance of IVIM-DWI in the differential diagnosis of breast tumors. Methods: Studies published in the past 10 years on the differential diagnosis of breast lesions using IVIM-DWI were systematically searched in the PubMed, Embase, and Web of Science databases. The standardized mean difference (SMD) and 95% confidence intervals of the apparent diffusion coefficient (ADC), tissue diffusivity (D), pseudodiffusivity (D*), and perfusion fraction (f) were calculated using Review Manager 5.3, and Stata 12.0 was used to pool the sensitivity, specificity, and area under the curve (AUC), as well as to assess publication bias and heterogeneity. Fagan's nomogram was used to predict the post-test probabilities. Results: Sixteen studies comprising 1,355 malignant and 362 benign breast lesions were included. Most of these studies showed a low to unclear risk of bias and low concerns regarding applicability. Breast cancer had significantly lower ADC (SMD = −1.38, P < 0.001) and D values (SMD = −1.50, P < 0.001) and a higher f value (SMD = 0.89, P = 0.001) than benign lesions, whereas the D* value did not differ significantly (SMD = −0.30, P = 0.20). Invasive ductal carcinoma showed lower ADC (SMD = 1.34, P = 0.01) and D values (SMD = 1.04, P = 0.001) than ductal carcinoma in situ. The D value demonstrated the best diagnostic performance (sensitivity = 86%, specificity = 86%, AUC = 0.91) and the highest post-test probability (61, 48, 46, and 34% for the D, ADC, f, and D* values, respectively) in the differential diagnosis of breast tumors, followed by the ADC (sensitivity = 76%, specificity = 79%, AUC = 0.85), f (sensitivity = 80%, specificity = 76%, AUC = 0.85), and D* values (sensitivity = 84%, specificity = 59%, AUC = 0.71).
Conclusion: IVIM-DWI parameters are adequate and superior to the ADC in the differentiation of breast tumors. ADC and D values can further differentiate invasive ductal carcinoma from ductal carcinoma in situ. IVIM-DWI is also superior in identifying lymph node metastasis, histologic grade, and hormone receptor, HER2, and Ki-67 status.


INTRODUCTION
Breast cancer is one of the most common malignant tumors and the second leading cause of cancer death in females (1). Early detection and accurate diagnosis of breast cancer with various histological/molecular subtypes, defined by markers such as estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), and the Ki-67 proliferation index, are helpful for developing individualized therapies and achieving a better prognosis. Screening breast lesions with conventional mammography is challenging for clinicians because of its low sensitivity in dense breast parenchyma (2). Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is a common MRI sequence in clinical practice that can reflect the morphological and haemodynamic features of breast lesions. A previous meta-analysis, which included studies using DCE-MRI as an adjunct to conventional mammography or ultrasound to clarify uncertain findings without microcalcification, demonstrated that breast MRI had excellent diagnostic performance, with a pooled sensitivity of 99% and specificity of 89% (3). However, the specificity is still variable because of background parenchymal enhancement and overlapping kinetic enhancement patterns between breast cancer and benign lesions. False-positive findings may lead to additional examinations or unnecessary surgery (4).
Diffusion-weighted imaging (DWI) has become a promising technique in the differential diagnosis of breast lesions; it allows the movement of water molecules to be measured using apparent diffusion coefficient (ADC) values. The international European Society of Breast Imaging (EUSOBI) working group has confirmed the importance of breast DWI in the multiparametric breast MRI protocol to differentiate between breast cancer and benign lesions, distinguish in situ from invasive lesions, and predict the responses to neoadjuvant therapy over time (5). Breast cancer usually has high cellularity (low diffusivity) and high vascularity (high perfusion), which may affect ADC values in opposite directions.
Intravoxel incoherent motion (IVIM) is an advanced imaging technique that was first proposed by Le Bihan et al. (6). This technique can distinguish the incoherent motion of water molecules within the capillaries from molecular diffusion in the extravascular space (7). The true diffusion coefficient (D value), pseudodiffusion coefficient (D* value), and perfusion fraction (f value) are generated using a biexponential model with multiple b-values (8). Several studies have applied IVIM-DWI to discriminate breast cancer from benign breast lesions and to characterize the histological/molecular subtypes of breast cancer with better diagnostic performance than traditional ADC values (7,9,10). However, the diagnostic performance of IVIM-DWI-derived parameters in the differentiation of breast tumors is not consistent, and the application of this sequence remains debatable. For example, several studies (7,11,12) indicated that breast cancer had a higher D* value than did benign lesions, while other studies reported the opposite (10,13-15) or non-significant results (9,16,17). The studies of Cho et al. (9) and Lin et al. (13) suggested that the IVIM model can further distinguish invasive ductal carcinoma (IDC) from ductal carcinoma in situ (DCIS), while another study reported no significant difference in the D, D*, and f values between them (7). Finally, the small sample sizes in most studies were insufficient to draw a robust conclusion about the performance of IVIM-DWI; therefore, clinical guidelines for its application in the breast have not been established.
Abbreviations: AUC, area under the curve; ADC, apparent diffusion coefficient; D, tissue diffusivity; D*, pseudodiffusivity; ER, estrogen receptor; f, perfusion fraction; HER2, human epidermal growth factor receptor 2; IVIM-DWI, intravoxel incoherent motion diffusion-weighted imaging; SMD, standardized mean difference; I², inconsistency index; PR, progesterone receptor.
To address this problem, we performed a meta-analysis of all the published results regarding the diagnostic performance of IVIM-DWI in differentiating malignant and benign breast lesions, with the aim of addressing the controversial issues among the different studies with more reliable evidence.

Data Sources
Studies on the differential diagnosis of breast tumors using IVIM-DWI parameters published in the past 10 years were systematically retrieved from PubMed, Embase, and Web of Science by two senior librarians. A search formula was created using different combinations of medical subject headings or key words related to the following terms: IVIM, intravoxel incoherent motion, multiple b-values DWI, biexponential, true diffusion coefficient, pseudodiffusion coefficient, perfusion fraction, and breast or breast lesion/cancer/carcinoma. We also manually searched the reference lists of the included studies.

Study Selection
Studies that met the following criteria were included: (a) the research purpose was to differentiate malignant and benign breast lesions using IVIM-DWI parameters; (b) the mean and standard deviation (SD) of each parameter were provided; (c) the diagnostic performance regarding

Data Extraction
A spreadsheet was used to extract the mean values and SDs as well as the diagnostic performance of the ADC, D, D*, and f values, including the threshold value, area under the curve (AUC), sensitivity, and specificity, or the true-positive (TP), false-negative (FN), false-positive (FP), and true-negative (TN) counts, from each study; extraction was performed by one author and reviewed by another. Other extracted information included the first author, publication year, country, field strength and vendor, b-values, patient ages, tumor size, and journal of publication.

Quality Assessment
The quality of the studies and likelihood of bias were evaluated using Review Manager 5.3 software (Cochrane Collaboration), referring to the Quality Assessment of Diagnostic Accuracy Studies-2 (18). We assessed the risk of bias and applicability in four domains: patient selection, index tests, reference standards, and flow and timing (19).

Publication Bias and Heterogeneity Evaluation
Because two types of data were pooled in our study (quantitative values and the diagnostic performance of each parameter), funnel plots and Begg's test were used to visually and quantitatively assess publication bias for the continuous variables, whereas Deeks' funnel plot was used to assess publication bias for the sensitivity and specificity with Stata version 12.0 (StataCorp). An asymmetric or skewed funnel plot, or P < 0.05 in Begg's test or Deeks' test, indicated potential publication bias (20). The inconsistency index (I²) and Cochran's Q test were used to explore the heterogeneity of the included studies, with I² > 50% or P < 0.05 for Cochran's Q test suggesting statistically significant heterogeneity; in these instances, a random-effects model was applied for subsequent pooling, whereas a fixed-effect model was applied when I² < 50% (21).
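As an illustration of the heterogeneity rule above, the following minimal sketch computes I² from Cochran's Q using the standard definition I² = max(0, (Q − df)/Q) × 100%, with df equal to the number of studies minus one. The function name `i_squared` is hypothetical, and the Q values and study counts are those reported in the Results of this article. The sketch applies only the I² > 50% criterion and ignores the P value of the Q test, so it is a simplification of the rule described in the text.

```python
def i_squared(q: float, n_studies: int) -> float:
    """Return I^2 in percent from Cochran's Q and the number of pooled studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# (Q statistic, number of studies) as reported in the Results section.
reported = {
    "ADC": (31.73, 8),
    "D": (37.49, 10),
    "D*": (123.02, 12),
    "f": (20.07, 12),
}

for name, (q, k) in reported.items():
    i2 = i_squared(q, k)
    # Simplified model choice: only the I^2 > 50% criterion is applied here.
    model = "random-effects" if i2 > 50 else "fixed-effect"
    print(f"{name}: I2 = {i2:.0f}% -> {model} model")
```

Under these assumptions the sketch reproduces the I² values of 78, 76, 91, and 45% reported for the ADC, D, D*, and f values.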

Data Synthesis
We constructed forest plots for continuous variables and calculated the standardized mean difference (SMD) between malignant and benign breast lesions using Review Manager 5.3 software. We used the bivariate mixed-effects binary regression model in Stata version 12.0 to pool the diagnostic performance in terms of sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), and AUC. The summary receiver operating characteristic (SROC) curves and Fagan's nomograms were also plotted to determine the diagnostic values and predict the post-test probabilities of the ADC, D, D*, and f values in obtaining a differential diagnosis of breast tumors. Meta-disc 1.4 was used to evaluate the threshold effects by calculating the Spearman correlation coefficient (r) between the logit (TP rate) and logit (FP rate).
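To make the SMD calculation concrete, the sketch below computes a per-study standardized mean difference with a small-sample correction (Hedges' adjusted g) and an approximate 95% confidence interval. It assumes the Hedges' adjusted g form and standard error commonly documented for Cochrane software; the exact formula used by Review Manager 5.3 should be checked against its documentation. The function name `hedges_g` and all input values are hypothetical and are not taken from any included study.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Return (g, (ci_low, ci_high)) for group 1 minus group 2.

    Assumed form: Hedges' adjusted g with the small-sample correction
    1 - 3 / (4N - 9) and a normal-approximation 95% confidence interval.
    """
    n = n1 + n2
    # Pooled standard deviation across both groups.
    s_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n - 2))
    g = (m1 - m2) / s_pooled * (1 - 3 / (4 * n - 9))
    se = math.sqrt(n / (n1 * n2) + g**2 / (2 * (n - 3.94)))
    return g, (g - 1.96 * se, g + 1.96 * se)


# Hypothetical D values (x 10^-3 mm^2/s): malignant vs. benign lesions.
g, (low, high) = hedges_g(0.95, 0.20, 40, 1.35, 0.30, 25)
print(f"SMD = {g:.2f} (95% CI {low:.2f}, {high:.2f})")
```

A negative SMD here means the first group (malignant) has the lower mean, which matches the sign convention of the pooled ADC and D results in this article.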

Literature Search and Selection
A flowchart detailing the study selection process is provided in Figure 1. Although the study by Iima et al. (22) included b-values of 2,000 and 2,500 s/mm², which may induce non-Gaussian diffusion, they used a hybrid model to sufficiently separate the non-Gaussian diffusion from IVIM effects. We also performed a sensitivity analysis comparing the pooled results before and after excluding this study; the results did not change significantly. Therefore, we

Quality Assessment
The distribution of the Quality Assessment of Diagnostic Accuracy Studies−2 scores for risk of bias and applicability Frontiers in Oncology | www.frontiersin.org

ADC Used for Diagnosis of Breast Tumor
Eight studies on the use of the ADC in differentiating breast tumors were included in the analysis. The heterogeneity test (χ² = 31.73, P < 0.001, I² = 78%) suggested high heterogeneity among the included studies. The forest plot in Figure 3 shows the distribution of the ADC between malignant and benign breast lesions. A random-effects model generated an SMD of −1.38 (−1.76, −1.00) (P < 0.001) between malignant and benign breast lesions for the ADC. Begg's test suggested no publication bias relating to the ADC (P = 0.428).

D Value Used for Diagnosis of Breast Tumor
Ten studies on the use of the D value in differentiating breast tumors were included in the analysis. The heterogeneity test (χ² = 37.49, P < 0.001, I² = 76%) suggested high heterogeneity among the included studies. The forest plot in Figure 4 shows the distribution of D between malignant and benign breast lesions.
A random-effects model generated an SMD of −1.50 (−1.85, −1.14) (P < 0.001) between malignant and benign breast lesions for D. Begg's test suggested no publication bias relating to D (P = 0.112).

D* Value Used for Diagnosis of Breast Tumor
Twelve studies on the use of the D* value in differentiating breast tumors were included in the analysis. The heterogeneity test (χ² = 123.02, P < 0.001, I² = 91%) suggested high heterogeneity among the included studies. The forest plot in Figure 5 shows

f-Value Used for Diagnosis of Breast Tumor
Twelve studies on the use of the f value in differentiating breast tumors were included in the analysis. The heterogeneity test (χ² = 20.07, P = 0.04, I² = 45%) suggested mild heterogeneity among the included studies. The forest plot in Figure 6 shows the distribution of f between malignant and benign breast lesions. A fixed-effects model generated an SMD of 0.89 (0.75, 1.02) (P < 0.001) between malignant and benign breast lesions for the f value. Begg's test suggested no publication bias in f (P = 0.880).

Subgroup Analysis for Histological/Molecular Subtypes
Because the treatment strategy and prognosis differ between DCIS and IDC, and because several studies have provided information for differentiating DCIS from IDC as well as on other pathologic prognostic factors such as tumor size, lymph node metastasis, histologic grade, and the molecular expression of ER, PR, HER2, and Ki-67 in breast cancer, we further pooled these results. The pooled results are listed in Table 3.

Diagnostic Performance
The diagnostic performance of the ADC, D, D*, and f values, assessed by pooling the sensitivity, specificity, PLR, NLR, DOR, and AUC, is listed in Table 4.
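For orientation, the relationship between these summary measures can be sketched with the textbook definitions PLR = sensitivity / (1 − specificity), NLR = (1 − sensitivity) / specificity, and DOR = PLR / NLR. The function name `likelihood_ratios` is hypothetical. The input is the pooled sensitivity and specificity of the D value from the abstract; because the article pooled these measures with a bivariate mixed-effects model, values derived this way from rounded point estimates are only approximations of the reported ones.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (PLR, NLR, DOR) from a single sensitivity/specificity pair."""
    plr = sensitivity / (1.0 - specificity)
    nlr = (1.0 - sensitivity) / specificity
    dor = plr / nlr
    return plr, nlr, dor


# Pooled sensitivity and specificity of the D value (86% and 86%).
plr, nlr, dor = likelihood_ratios(0.86, 0.86)
print(f"PLR = {plr:.2f}, NLR = {nlr:.2f}, DOR = {dor:.1f}")
```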

Posttest Probabilities
The likelihood ratio and post-test probability are also important for diagnosing a disease (31), as they estimate how likely a patient is to have a given disease on the basis of the MRI parameters. Figure 8 shows Fagan's nomograms of the ADC, D, D*, and f values for predicting post-test probabilities. All the pre-test probabilities were set at 20% by default. We regarded the diagnosis of breast cancer as a positive event, corresponding to a higher f value and lower ADC and D values. Similarly, diagnosing benign lesions with a lower f value and higher ADC and D values represented a negative event. From a pre-test probability of 20%, the post-test probability increased to 48% with a PLR of 3.7 and decreased to 7% with an NLR of 0.30 based on the ADC. This indicated that the probability of a breast cancer diagnosis is markedly higher in cases with a lower ADC than in cases without an ADC measurement. By contrast, the probability of a breast cancer diagnosis drops from 20 to 7% when a negative event occurs (e.g., a higher ADC). Similarly, when D is used for predicting a diagnosis, the post-test probability of diagnosing breast cancer reaches 61% with a PLR of 6.1 and drops to 4% with an NLR of 0.17. The inclusion of f increases the post-test probability of diagnosing breast cancer to 46% with a PLR of 3.4 and decreases it to 6% with an NLR of 0.27. These data indicated that IVIM parameters, especially the D value, increased the accuracy of diagnosing breast cancer.
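The arithmetic behind a Fagan nomogram can be sketched as follows: the pre-test probability is converted to odds, multiplied by the likelihood ratio, and converted back to a probability. The function name `post_test_probability` is hypothetical; the likelihood ratios and the 20% pre-test probability are those given in the text above.

```python
def post_test_probability(pre_test: float, likelihood_ratio: float) -> float:
    """Apply Bayes' rule in odds form: post-test odds = pre-test odds x LR."""
    pre_odds = pre_test / (1.0 - pre_test)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


PRE_TEST = 0.20  # default pre-test probability used in the article

# (PLR, NLR) per parameter, as quoted in the text.
ratios = {"ADC": (3.7, 0.30), "D": (6.1, 0.17), "f": (3.4, 0.27)}

for name, (plr, nlr) in ratios.items():
    positive = post_test_probability(PRE_TEST, plr) * 100
    negative = post_test_probability(PRE_TEST, nlr) * 100
    print(f"{name}: positive {positive:.0f}%, negative {negative:.0f}%")
```

With these rounded inputs the sketch gives 48% and 7% for the ADC and 46% and 6% for f, matching the text. For D it gives about 60% rather than the reported 61%, presumably because the published figure was derived from an unrounded PLR.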

DISCUSSION
IVIM-DWI is a non-invasive technique that can reflect tumor cellularity and perfusion without the need for a contrast agent. It has already been applied in the differentiation of lung nodules (32), thyroid nodules (33), prostate tumors (34), and brain tumors (35) with good diagnostic performance. Although IVIM has become a research hotspot in whole-body tumors, especially in breast tumors, to the best of our knowledge, there is still no study on breast tissues with a sufficient sample size to establish the value of IVIM for quantitatively distinguishing breast cancer from benign lesions and their molecular subtypes. Our study provides a timely summary of this issue by pooling all published evidence with strict inclusion criteria and quality assessments. The results showed a promising prospect for incorporating IVIM-DWI into MRI protocols for the breast. In our study, the SMDs suggested that malignant breast tumors demonstrated lower ADC and D values and higher f values than did benign lesions. Breast cancer usually has dense cellularity with a high capacity for proliferation, which may reduce the extracellular space and limit the diffusion of water molecules, thus causing a reduction in the diffusion coefficient. The pooled results also suggested that the D value improved the diagnostic performance, with a slightly higher sensitivity, specificity, AUC, DOR, and post-test probability than the conventional ADC. Theoretically, the monoexponential model cannot separate water molecule movement caused by microcirculation perfusion from true diffusion and may therefore overestimate the ADC value (14). The D value can estimate true diffusion without the influence of perfusion-related diffusion (15), but the larger number of b-values and the higher b-values applied in the IVIM model significantly prolong the scanning time and introduce motion and susceptibility artifacts.
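The overestimation of the ADC by a monoexponential fit can be illustrated with a small simulation. The sketch assumes one common convention for the biexponential IVIM model, S(b)/S0 = f·exp(−b·D*) + (1 − f)·exp(−b·D); the article does not state which form the included studies used. The function names and the tissue values are hypothetical and are not taken from any included study.

```python
import math


def ivim_signal(b, f, d, d_star):
    """Normalized signal S(b)/S0 under the assumed biexponential model."""
    return f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d)


def monoexp_adc(b_low, b_high, f, d, d_star):
    """ADC from a two-point monoexponential fit to the biexponential signal."""
    s_low = ivim_signal(b_low, f, d, d_star)
    s_high = ivim_signal(b_high, f, d, d_star)
    return math.log(s_low / s_high) / (b_high - b_low)


# Hypothetical tissue: f = 10%, D = 1.0e-3 mm^2/s, D* = 10e-3 mm^2/s.
F, D, D_STAR = 0.10, 1.0e-3, 10.0e-3

# b-values of 0 and 800 s/mm^2 are illustrative choices.
adc = monoexp_adc(0, 800, F, D, D_STAR)
print(f"D = {D:.2e} mm^2/s, ADC(0, 800) = {adc:.2e} mm^2/s")
```

In this sketch the ADC exceeds D whenever f is greater than zero, because the fast-decaying perfusion compartment adds signal loss at low b-values that the monoexponential fit attributes to diffusion.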
Interestingly, malignant breast tumors demonstrated a significantly higher f value but a non-significantly higher D* value than did the benign lesions. This mainly arose from increased angiogenesis in breast cancer (14). The f value also demonstrated a higher specificity of 0.76 and an AUC of 0.85 compared with the specificity of 0.59 and AUC of 0.71 for the D* value. In addition, the mean D* values of breast cancer ranged from 3.85 to 109.78 × 10⁻³ mm²/s with a large SD among the included studies, which indicated that the D* value was not robust and could not further increase the diagnostic sensitivity and specificity; however, the f value was able to more accurately reflect tissue perfusion. Liu et al. (14) also stated that the D* value may be unreliable in the IVIM model because of the low signal-to-noise ratio and poor measurement reproducibility.
The pooled results indicated lower ADC and D values in IDC than in DCIS, suggesting denser cellularity and a more limited extracellular volume fraction in IDC with more aggressive features (7). Therapeutic strategies and treatment efficacy are closely related to the intrinsic biological subtypes of breast cancer (17). Our pooled results suggested that lesions with metastatic lymph nodes had higher D* and f values than did lesions without lymph node metastasis. The IVIM model provides a surrogate marker for predicting lymph node status, and rich tumor perfusion owing to neovascularization may facilitate lymphatic metastasis (36). The results also suggested greater tumor perfusion (D* and f values) in HER2-positive cancer. HER2 is an important prognostic factor in breast cancer and is closely correlated with tumor proliferation, invasion, and metastasis. It can promote tumor angiogenesis and lymphangiogenesis via regulation of vascular endothelial growth factor (VEGF) in breast cancer and therefore increase tumor perfusion (10). Our study also suggested that breast cancer with high Ki-67 expression has a significantly lower D value (P = 0.002) but not a significantly lower ADC value (P = 0.27), which was mainly due to active proliferation and a higher cell density. The results suggested that the D value better reflected Ki-67 status than did ADC values when assessing the cell density and proliferation status. Our results also suggested a significant difference in the perfusion-related parameter (D*) according to ER or PR status. The detection of ER and PR is of great significance for estimating the prognosis of breast cancer and guiding endocrine therapy, as patients with positive ER and PR expression showed high responsiveness to hormone therapies. Previous studies have reported that positive ER and PR expression inhibited tumor angiogenesis by decreasing the level of VEGF (7,10,37,38), which leads to a lower D* value in ER- and PR-positive tumors.
The correlation results suggested no significant threshold effects in the ADC (r = −0.100, P = 0.873), D (r = 0.342, P = 0.452), D* (r = −0.029, P = 0.957), and f values (r = 0.829, P = 0.524); thus, threshold effects were not the main contributors to the heterogeneity. The ADC, D, D*, and f values all demonstrated obvious heterogeneity, which should be further explored. First, most of the included studies did not control for age or menstrual cycle in the analysis, which may have introduced heterogeneity. Second, 1.5T and 3.0T MR scanners with various combinations of b-values were used to perform IVIM-DWI in these studies, which may influence the accuracy of the calculated diffusion and perfusion coefficients. Third, the post-processing methods differed, as some studies (9,26) performed histogram analyses of the whole lesions, while the others used the largest section of the lesion as the region of interest. Last, the tumor subtypes were inconsistent in the malignant and benign groups; this may result in different biological characteristics and consequent variations in the IVIM values.
There were several limitations to this meta-analysis. First, the small number of studies regarding the histological/molecular subtypes of breast cancer was still insufficient to draw a robust conclusion. Second, we did not perform a horizontal comparison with other diffusion imaging techniques, such as diffusion tensor imaging and diffusion kurtosis imaging, both of which provide information that reflects directional characteristics and tissue complexity. A combination of these sequences may further improve the specificity in characterizing breast lesions.

CONCLUSIONS
IVIM-DWI parameters were adequate and superior to the ADC in differentiating breast tumors. They can further differentiate IDC from DCIS. In addition, IVIM-derived parameters showed superiority in identifying lymph node metastasis, histologic grade, and hormone receptor, HER2, and Ki-67 status. These parameters may therefore be useful for treatment planning and prognosis assessment.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/supplementary material.