XGboost Prediction Model Based on 3.0T Diffusion Kurtosis Imaging Improves the Diagnostic Accuracy of MRI BiRADS 4 Masses

Background: The malignant probability of MRI BiRADS 4 breast lesions ranges from 2% to 95%, leading to unnecessary biopsies. The purpose of this study was to construct an optimal XGboost prediction model based on DKI, alone or combined with other MR imaging features and clinical characteristics, in order to reduce the false-positive rate of MRI BiRADS 4 masses and improve the diagnostic efficiency for breast cancer.
Methods: A total of 120 patients with 158 breast lesions were enrolled. Diffusion kurtosis imaging (DKI), diffusion-weighted imaging (DWI), proton magnetic resonance spectroscopy (1H-MRS) and dynamic contrast-enhanced MRI (DCE-MRI) were performed on a 3.0-T scanner. The Wilcoxon signed-rank test and χ2 test were used to compare patients' clinical characteristics, mean kurtosis (MK), mean diffusivity (MD), apparent diffusion coefficient (ADC), total choline (tCho) peak, extravascular extracellular volume fraction (Ve), flux rate constant (Kep) and volume transfer constant (Ktrans). ROC curve analysis was used to assess the diagnostic performance of the imaging parameters. Spearman correlation analysis was performed to evaluate the associations of imaging parameters with prognostic factors and breast cancer molecular subtypes. The least absolute shrinkage and selection operator (LASSO) and the area under the curve (AUC) of the imaging parameters were used to select discriminative features for differentiating benign from malignant breast lesions. Finally, an XGboost prediction model was constructed based on the discriminative features, and its diagnostic efficiency was verified in BiRADS 4 masses.
Results: MK derived from DKI performed better for differentiating between malignant and benign lesions than ADC, MD, tCho, Kep and Ktrans (p < 0.05). MK was also more strongly correlated with histological grade, Ki-67 expression and lymph node status. MD, MK, age, shape and menstrual status were selected as the optimized feature subset to construct an XGboost model, which exhibited superior diagnostic ability for breast cancer characterization and an improved evaluation of suspicious breast tumors in MRI BiRADS 4.
Conclusions: DKI is promising for breast cancer diagnosis and prognostic factor assessment. An optimized XGboost model that included DKI, age, shape and menstrual status is effective in improving the diagnostic accuracy of BiRADS 4 masses.


INTRODUCTION
Breast cancer (BC) is the most common cancer and a leading cause of female mortality worldwide (1). It presents substantial heterogeneity in histology, clinical presentation and therapy response. Four major BC subtypes can be defined by gene expression profiling: luminal A, luminal B, HER2-enriched, and basal-like (triple-negative BC, TNBC) (2,3). Among patients with BC, recurrence-free and overall survival are thought to be related to histological grade, Ki-67 expression, lymph node (LN) status, and the expression of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) (4).
Magnetic resonance imaging (MRI), a non-invasive modality that provides excellent soft-tissue contrast with high sensitivity, is well established for BC characterization, treatment planning, and post-operative prognostication (5). Dynamic contrast-enhanced MRI (DCE-MRI), which enables detailed morphologic and hemodynamic evaluation through pharmacokinetic modeling techniques, has been widely used for BC diagnosis and for monitoring tumor response to chemotherapy (6-10). However, the diagnostic specificity of DCE-MRI for BC varies greatly due to background parenchymal enhancement and overlap of the time-intensity curves between benign and malignant lesions, which leads to unnecessary biopsies (8). Furthermore, DCE-MRI may not be appropriate for patients who are allergic to contrast agents or have liver or kidney dysfunction. In vivo proton MR spectroscopy (1H-MRS) provides molecular and biochemical information for tumor classification based on the observation of total choline (tCho) levels (11). However, tCho provides limited sensitivity for differentiating between breast lesion types (12). Diffusion-weighted imaging (DWI) is another non-contrast MRI modality for assessing complex tissue microstructural features (15,16). The apparent diffusion coefficient (ADC) derived from DWI assumes an ideal Gaussian distribution of water displacement without any restriction (13,14). However, water diffusion in living tissues is generally restricted by the complex microstructural environment, including cell membranes and organelles, and thus tends to deviate from a Gaussian distribution (17). Diffusion kurtosis imaging (DKI) models this non-Gaussian diffusion and is considered useful for characterizing heterogeneous tumors. This modality was introduced by Jensen et al. and yields the parameters mean kurtosis (MK) and mean diffusivity (MD) (18).
In this context, greater MK or lower MD values suggest more restriction of normal water diffusion and greater tissue complexity (19). Previous DKI applications revealed its greater sensitivity over ADC for characterizing hepatocellular carcinoma, glioma, BC, and prostate cancer (20-23). Using 1.5-T imaging in BC, Sun et al. observed that greater MK was significantly associated with higher histological grade and elevated Ki-67 expression (24). Similarly, our preliminary work on 3.0-T MRI revealed the usefulness of MK for the characterization of breast lesions (BLs) (25).
Breast MRI findings described with the Breast Imaging Reporting and Data System (BiRADS) lexicon provide a standardized language to define the final assessment categories for predicting the likelihood of malignancy, allowing radiologists to communicate important findings in a consistent and repeatable manner (26,27). However, the malignant probability of MRI BiRADS 4 ranges from 2% to 95% (28), leading to unnecessary biopsies and considerable anxiety for patients.
XGboost is an enhanced implementation of the Gradient Boosting Decision Tree (GBDT) algorithm (29), in which a second-order Taylor expansion of the loss function is used to achieve better accuracy. At the t-th iteration, the objective function can be written as:
$$\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n}\left[ g_i f_t(x_i) + \frac{1}{2} h_i f_t^{2}(x_i) \right] + \Omega(f_t), \qquad \Omega(f_t) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2}$$
Here $g_i$ and $h_i$ are the first- and second-order gradient statistics of the loss function, $n$ is the number of samples, $f_t(x_i)$ is the regression tree function at the t-th iteration, $T$ is the number of leaves in the tree, $\sum_j w_j^{2}$ is the squared L2 norm of the leaf scores, and $\gamma$ and $\lambda$ are the regularization coefficients. $\Omega(f_t)$ is the regularization term based on the complexity of the model (i.e., the number and weights of the leaf nodes), which effectively prevents overfitting. In addition, XGboost adopts shrinkage and column subsampling techniques to improve the generalization and learning speed of the algorithm. These advantages make XGboost a widely accepted model in many machine learning and data mining applications (30,31).
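As a brief sketch of why the second-order form is convenient, the objective can be regrouped by leaf. The symbols $I_j$, $G_j$, $H_j$, $\gamma$ and $\lambda$ below follow the usual XGBoost notation and are assumptions introduced for illustration; they are not defined in the original text. For a fixed tree structure, each leaf weight then has a closed-form optimum:

```latex
% I_j: set of samples assigned to leaf j
% G_j, H_j: summed first- and second-order gradients in leaf j
% gamma, lambda: leaf-count and L2 regularization coefficients (assumed notation)
\begin{aligned}
G_j &= \sum_{i \in I_j} g_i, \qquad H_j = \sum_{i \in I_j} h_i, \\
\mathcal{L}^{(t)} &\simeq \sum_{j=1}^{T} \left[ G_j w_j + \tfrac{1}{2}\,(H_j + \lambda)\, w_j^{2} \right] + \gamma T, \\
w_j^{*} &= -\frac{G_j}{H_j + \lambda}, \qquad
\mathcal{L}^{(t)}_{\min} = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^{2}}{H_j + \lambda} + \gamma T .
\end{aligned}
```

The minimized value serves as a score for a candidate tree structure, so a larger $\lambda$ shrinks leaf weights and a larger $\gamma$ penalizes additional leaves.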
MK derived from DKI is a promising imaging marker for predicting tumor aggressiveness according to previous preliminary studies (24,25), and XGboost is a scalable machine learning system for tree boosting. The purpose of this study, which involved a larger sample size, was therefore to construct an optimal XGboost prediction model based on DKI, alone or combined with other MR imaging features and clinical characteristics, in order to reduce the false-positive rate of MRI BiRADS 4 and improve the diagnostic efficiency for BC. To our knowledge, no study has reported that an XGboost model based on DKI improves the diagnostic specificity of MRI BiRADS 4. Such a model could help radiologists provide referring clinicians with a more accurate assessment of MRI BiRADS 4 lesions, thereby preventing unnecessary biopsies and optimizing personalized diagnosis and treatment.

Patients
This study protocol was approved by our Institutional Review Board, and written informed consent was obtained from all patients. A total of 120 patients (median age: 44 years, range: 17-71 years) with 158 BLs were recruited from the Department of General Surgery of our hospital between October 2018 and June 2021. Patient exclusion followed the flowchart shown in Figure 1. Twenty-two patients had two or more neoplasms, and each lesion was examined separately. The 158 BLs were divided into a training group (n=108; 53 malignant, 55 benign) and a validation group (n=50; 25 malignant, 25 benign). The training group, consisting of 6 BiRADS 2 masses, 56 BiRADS 3 masses, 12 BiRADS 4 masses and 34 BiRADS 5 masses, was used to construct an XGboost diagnostic model. The validation group, which comprised 50 BiRADS 4 masses, was used to verify the diagnostic performance of the XGboost model in BiRADS 4 masses. The MRI BiRADS category was assigned based on the morphological findings, dynamic enhancement pattern and ADC measurement of each lesion, according to the 5th edition of the American College of Radiology BiRADS for breast MRI (26,27).

MRI Protocol
All MRI examinations were performed using a 3.0-T MR scanner (GE Medical System, Milwaukee, WI, USA) with a dedicated four-channel bilateral breast coil. Premenopausal women were examined in the prone position after the first week of their menstrual cycle. Following a T1-weighted FSE-XL sequence and a T2-weighted FRFSE-XL sequence, routine DWI and DKI (both using an echo-planar imaging sequence), 1H-MRS and DCE imaging were performed. The protocol parameters are shown in Table 1. A cubic region of interest (ROI, 1-6 cm³) was positioned inside the lesion for 1H-MRS acquisition, with 4 saturation bands. An automatic shimming adjustment of the unsuppressed water signal was performed to reach water linewidths of 10-20 Hz.

MRI Analysis
All raw diffusion imaging data were post-processed using Functool 9 software, which is integrated into the MR imager (GE Medical System, Rue de la Minière, France). This process automatically generates the imaging metrics of DWI and DKI. ADC maps were generated from DWI using b values of 0 and 800 s/mm², considering all 3 diffusion gradient directions. MK and MD maps were derived from DKI using b values of 0, 500, 1,000, 1,500, 2,000, and 2,500 s/mm², considering all 15 diffusion gradient directions. ROIs (mean size: 94 ± 33 mm², range: 60-380 mm²) for each lesion were manually drawn on three different solid neoplastic regions while avoiding necrotic tissue, hemorrhagic components, and dominant ducts. Average ADC, MK, and MD values were subsequently calculated. LCModel software (Canada) was used to identify the tCho peak at 3.23 ppm in the breast spectrum inside the lesion. DCE-MRI data were post-processed using GenIQ software integrated into the MR imager. Using the standard map and the modified Tofts model as the mathematical model, functional maps of the volume transfer constant (Ktrans), flux rate constant (Kep) and extravascular extracellular volume fraction (Ve) were obtained in at least two-thirds of the breast lesions.
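To make the relation between the acquired b values and the reported metrics concrete, the following minimal sketch fits the commonly used DKI signal model, S(b) = S0·exp(−b·MD + b²·MD²·MK/6), and the mono-exponential ADC model. It is an illustration under stated assumptions, not the vendor's Functool implementation, whose internals are not described in the text. The function names, the log-quadratic least-squares fit and the synthetic voxel values are all hypothetical, and direction averaging and noise handling are omitted.

```python
import numpy as np

# b-values (s/mm^2) as stated in the text for the DKI acquisition
B_DKI = np.array([0, 500, 1000, 1500, 2000, 2500], dtype=float)


def adc_two_point(s0, s_b, b=800.0):
    """Mono-exponential ADC (mm^2/s) from the signals at b=0 and at b."""
    return np.log(s0 / s_b) / b


def fit_dki(signal, b=B_DKI):
    """Fit ln S = ln S0 - b*MD + (b^2 * MD^2 * MK) / 6 by least squares.

    Returns (MD in mm^2/s, MK dimensionless).
    """
    # np.polyfit returns coefficients from the highest degree down
    c2, c1, _ = np.polyfit(b, np.log(signal), 2)
    md = -c1
    mk = 6.0 * c2 / md**2
    return md, mk


# Synthetic, noise-free voxel with hypothetical values (illustration only)
md_true, mk_true, s0 = 1.0e-3, 0.9, 1000.0
signal = s0 * np.exp(-B_DKI * md_true + (B_DKI**2) * md_true**2 * mk_true / 6.0)

md_fit, mk_fit = fit_dki(signal)

# ADC check on a purely Gaussian decay with the same diffusivity
adc = adc_two_point(s0, s0 * np.exp(-800.0 * md_true))
```

On real data the log-transformed fit is sensitive to noise at high b values, so constrained or weighted fitting is often preferred; the unweighted fit here is chosen only for brevity.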
The characteristics of the lesions, including size, shape, margin, BiRADS category and imaging parameters, were analyzed by two senior radiologists specialized in breast imaging (5 and 10 years of diagnostic experience) who were blinded to the histopathological diagnosis. Lesion size was measured on the MR imager using the last phase of DCE-MRI. Intra-class correlation coefficients (ICCs) were used to assess the consistency of the parameters calculated twice by the radiologists, and "good" correlation was defined as an ICC >0.75.

Pathology and Immunohistochemistry
The histopathological findings were analyzed by two experienced pathologists (5 and 10 years of experience in the pathological diagnosis of breast tumors) who were blinded to the patients' clinicopathological characteristics. BC histological grades were based on the modified Bloom-Richardson guidelines (grades 1-2: low-grade disease, grade 3: high-grade disease) (32). LN status was determined by postoperative histopathological examination. Positive results for ER and PR expression were defined as positive staining of ≥1% of nuclei in 10 high-power fields (33). HER2 expression was considered positive if the immunohistochemical result was 3+ or if gene amplification was observed via fluorescence in situ hybridization (34). The Ki-67 nuclear protein reflects cell proliferation, and its expression was scored as the percentage of tumor cell nuclei with positive immunostaining, using a threshold value of 14% (high Ki-67 expression: >14%) (35,36).

Statistical Analysis
The Kolmogorov-Smirnov test was first used to assess the normality of the variables, followed by the Levene test to examine the homogeneity of variance. The Wilcoxon signed-rank test and χ2 test were used to evaluate continuous and categorical variables, respectively. Receiver operating characteristic (ROC) curve analysis was used to evaluate the performance of the imaging parameters for diagnosing BC and predicting histopathological findings, with excellent diagnostic ability defined as an area under the ROC curve (AUC) >0.8. Spearman correlation analysis was used to evaluate the associations between the imaging parameters and the BC prognostic factors and molecular subtypes. Correlations were classified based on the correlation coefficient as excellent (≥0.75), moderate to good (0.50-0.74), fair (0.25-0.49), and small (≤0.24).
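The two grading rules above can be illustrated with a short sketch. It computes the AUC in its rank (Mann-Whitney) form and applies the stated cut-offs. The function names and the MK values are hypothetical and serve only as an illustration; treating the two-decimal correlation bands as continuous cut points is also an assumption.

```python
def auc_rank(pos, neg):
    """AUC as the probability that a positive case scores higher than a
    negative case, with ties counted as 0.5 (Mann-Whitney form)."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_label(auc, cutoff=0.8):
    """Apply the text's rule: excellent diagnostic ability if AUC > 0.8."""
    return "excellent" if auc > cutoff else "not excellent"


def correlation_strength(rho):
    """Classify a correlation coefficient using the bands stated in the text.

    Assumption: the two-decimal bands are treated as continuous cut points.
    """
    r = abs(rho)
    if r >= 0.75:
        return "excellent"
    if r >= 0.50:
        return "moderate to good"
    if r >= 0.25:
        return "fair"
    return "small"


# Hypothetical MK values for four malignant and four benign lesions
malignant_mk = [0.95, 0.88, 0.80, 0.72]
benign_mk = [0.60, 0.75, 0.55, 0.66]

auc = auc_rank(malignant_mk, benign_mk)  # 15 of 16 pairs are ranked correctly
label = auc_label(auc)
```

The pairwise form is quadratic in the number of lesions, which is acceptable for a cohort of this size and keeps the definition of the AUC explicit.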

XGBoost Model Construction and Verification
The XGBoost model was constructed and verified in four stages: I. Significant features were selected based on the Wilcoxon signed-rank test and χ2 test. II. The best imaging parameters were selected based on the ROC curve, and the optimal combination of imaging parameters and clinical features was selected using the least absolute shrinkage and selection operator (LASSO). III. Representative features were used to construct the XGBoost model and to derive feature importance scores, where the optimal feature subset was determined by 3-fold cross-validation performed 50 times. IV. BiRADS 4 lesions in the validation group were used to verify the effectiveness of the model.
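The resampling scheme in stage III can be sketched as follows. This is a minimal illustration of 3-fold cross-validation repeated 50 times. The function name and seed are hypothetical, and stratifying the folds by class is an assumption, since the text does not state whether the folds were stratified. Model fitting and scoring inside each split are deliberately left out.

```python
import random


def repeated_stratified_kfold(labels, n_splits=3, n_repeats=50, seed=0):
    """Yield (train_idx, test_idx) index lists for repeated k-fold CV.

    Assumption: folds are stratified so that each fold keeps roughly the
    same malignant/benign ratio as the full training group.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    for _ in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for members in by_class.values():
            shuffled = members[:]
            rng.shuffle(shuffled)
            # Deal the shuffled indices of each class across the folds
            for pos, idx in enumerate(shuffled):
                folds[pos % n_splits].append(idx)
        for k in range(n_splits):
            test_idx = sorted(folds[k])
            train_idx = sorted(
                i for j, fold in enumerate(folds) if j != k for i in fold
            )
            yield train_idx, test_idx


# Training-group class counts from the text: 53 malignant (1), 55 benign (0)
labels = [1] * 53 + [0] * 55
splits = list(repeated_stratified_kfold(labels))
```

Each candidate feature subset would be fitted on `train_idx` and scored on `test_idx` in all 150 splits, and the subset with the best average score retained.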
There were no significant differences regarding MK, MD, or ADC values according to LN involvement, and PR, ER, or TNBC status (Table 4). Pharmacokinetic parameters, such as Ve, Kep, and Ktrans, also did not show statistical differences in predicting prognostic factors and molecular subtypes.

Construction and Validation of an Optimal XGboost Model for Predicting BC in BiRADS 4 Masses
Univariate analysis revealed the following significant risk factors for BC: patient age, menstrual status, lesion size, lesion shape, lesion margin status, ADC, MK, MD, tCho, Kep and Ktrans values, as shown in Table 2. According to LASSO regression and comparison of single-factor AUC values, 5 non-enhanced features (MD, MK, age, shape and menstrual status) were selected as the optimized feature subset to construct an XGboost model, in which MD and MK were the most significant features, with importance scores of 24 and 20, respectively (Figure 5A). This XGboost model exhibited superior diagnostic ability for BC characterization in the training group, with an AUC of 0.940 (Figure 5B). To further verify the predictive reliability of this model for BC diagnosis, the 50 BiRADS 4 masses in the validation group were entered into the model; 21 were correctly diagnosed as malignant and 22 were correctly diagnosed as benign, giving a diagnostic sensitivity and specificity of 84% and 88%, respectively (Table 5).
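The reported sensitivity and specificity follow directly from the validation counts, as the short check below shows. The function name is hypothetical. The overall accuracy is derived here by arithmetic and is not a figure quoted in the text.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Validation group: 25 malignant and 25 benign BiRADS 4 masses,
# of which 21 and 22 were classified correctly
sens, spec, acc = diagnostic_metrics(tp=21, fn=25 - 21, tn=22, fp=25 - 22)
```

With these counts, 3 of the 25 benign masses are false positives, which corresponds to a false-positive rate of 12%.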

DISCUSSION
This study demonstrated that MK derived from DKI performed better than MD, ADC, Ve, Kep and Ktrans for differentiating between benign and malignant BLs. MK was also more strongly correlated with histological grade, Ki-67 expression and LN status, and proved to be a promising imaging marker for predicting the clinical and pathological characteristics of BC. Finally, an optimized XGboost model was constructed by combining MD, MK, age, shape and menstrual status, which exhibited superior diagnostic performance for BC characterization and an improved assessment of suspicious breast tumors in BiRADS 4. Overall, we provide a novel and minimally invasive means of diagnosing BC and characterizing its microstructure, using DKI parameters as predictors. The rapid proliferation of different cell types makes BC a highly heterogeneous cancer, which may be reflected in the elevated MK values and the concurrently decreased MD and ADC values. MK derived from DKI quantifies the degree to which water diffusion deviates from Gaussian diffusion, and is considered proportional to the neoplasm's cellular microstructural heterogeneity and tissue complexity. MD is a corrected diffusion coefficient that removes non-Gaussian bias.
In malignant tissues, water molecule diffusion is usually restricted by intracellular, extracellular, and intravascular spaces, as well as by tightened cellular membrane microstructures, leading to lower ADC and MD values. Here, MK was superior to MD and ADC for distinguishing between malignant and benign BLs. This finding might be explained by the fact that ADC relies on an assumption of an ideal Gaussian distribution of unrestricted water diffusion, whereas the DKI technique assumes that water diffusion follows a non-Gaussian distribution, which better accounts for tissue complexity and physical barriers to diffusion within tissue (cell membranes, organelles, stromal desmoplasia, and so forth) (8). Single-voxel 1H-MRS-based tCho peak detection was also evaluated, although it was less effective than MK for differentiating between malignant and benign BLs. The low sensitivity of 1H-MRS might be explained by various factors, including the need for high-quality shimming and fat suppression. Poor shimming results in B0 field inhomogeneities that broaden spectral linewidths, reducing the SNR and the ability to separate different chemical resonances. It also compromises the performance of chemically selective fat and water suppression in localized MRS. Without appropriate fat suppression, lipid sidebands can obscure choline peaks in the spectra. In addition, breast MRI has low sensitivity for detecting choline levels in smaller lesions (<10 mm), which limits the applicability of MRS in this model. This study also revealed higher MK, but lower ADC and MD values, for BC cases that involved high-grade disease and high Ki-67 expression. Similar results have been observed in previous studies (24,25). Ki-67 expression is a biomarker for cell proliferation, with high expression suggesting increased cellularity, vascular hyperplasia, and necrosis. High-grade tumors are also characterized by active mitosis and the absence of normal glandular architecture, which is correlated positively with high Ki-67 expression (36,37). These changes reflect BC-related hypercellularity and increased microstructural complexity, leading to higher kurtosis and lower diffusivity. Notably, MK exhibited a strong correlation with Ki-67 expression and histological grade, suggesting that DKI is a valuable tool for characterizing BC.
MK was significantly higher in tumors with axillary LN involvement than in those without axillary LN involvement in this study, which agrees with the findings by Huang et al. (25) but conflicts with those by Sun et al. (24). This discrepancy might be due to differences in tumor size, as our study involved a greater number of larger lesions, which may tend to be more heterogeneous. This study failed to detect significant differences in any of the imaging parameters according to TNBC or non-TNBC type. This finding might be related to the fact that ER expression inhibits angiogenesis, which might restrict water diffusion in ER-positive BC, while HER2 and PR expression can increase angiogenesis (38). Therefore, ER, PR, and HER2 expression might influence angiogenesis by regulating vascular endothelial growth factor production at different levels in BC. Nevertheless, tumor heterogeneity also likely contributes to the lack of a clear relationship between the BC subtype and imaging parameters.
DCE-MRI, using the Tofts two-compartment model, quantifies the contrast agent exchange between the intravascular and the interstitial space, providing measurements of tumor blood flow, the microvasculature, and capillary permeability. The pharmacokinetic parameters can potentially improve the differentiation of benign and malignant breast tumors and distinguish different breast cancer subtypes (39). This study revealed significant differences in Ktrans and Kep between benign and malignant tumors, which agrees with the previous study by Li et al. (6). However, no significant associations were observed between pharmacokinetic parameters and prognostic factors of BCs, and the diagnostic efficiency of pharmacokinetic parameters was also lower than that of MD and MK. A possible reason is that only two BCs exhibited a wash-out type of dynamic enhancement pattern.
The XGboost algorithm has been widely used in medical fields, such as chronic kidney disease diagnosis (30) and COVID-19 (40). Hou et al. (31) compared the diagnostic efficiency of a logistic regression model, the SAPS-II score prediction model and an XGboost model in predicting 30-day mortality for MIMIC-III patients, and found that the XGboost model performed the best, indicating its great potential in medical applications. As can be seen in Figure 5 and Table 5, the optimized XGboost model in our study exhibited superior diagnostic ability for BC characterization in both the training group and the validation group. In particular, this model improved the diagnostic specificity for BiRADS 4 tumors, suggesting its potential usefulness in reducing the number of unnecessary biopsies, as well as reducing patient anxiety and the waste of medical resources in the long term.
Our study had some limitations. First, the sample size was relatively small, and only a few patients had PR-positive tumors or the TNBC type. Thus, using a larger sample with multiple histological BC types may yield a more accurate estimate. Second, the low spatial resolution of DWI might lead to inaccurate measurements of small benign lesions (< 1 cm). Hence, an improved high-resolution sequence for DWI might be required to detect small lesions. Third, all parameters were calculated on the same MR scanner, and our findings might be specific to the sequences we used.
In conclusion, our study demonstrated that DKI is promising for breast cancer diagnosis and prognostic factor assessment. An optimized XGboost model that included DKI, age, shape and menstrual status is effective in improving the diagnostic specificity of BiRADS 4 masses, thereby preventing unnecessary biopsies and optimizing personalized diagnosis and treatment. However, a multicenter prospective study with a larger cohort should be performed in the near future to validate these results.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
Written informed consent was obtained from the individual(s), and minor(s)' legal guardian/next of kin, for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
WT and HZ designed the experiments. TQ, WT, and HZ analyzed the data. WT, HZ, XC, and HNZ contributed to MRI data acquisition. WT and HZ wrote the paper. YL and RW supervised the experiment. All authors contributed to the article and approved the submitted version.