Non-Mass Enhancements on DCE-MRI: Development and Validation of a Radiomics-Based Signature for Breast Cancer Diagnoses

Purpose We aimed to assess the additional value of a radiomics-based signature for distinguishing between benign and malignant non-mass enhancement lesions (NMEs) on dynamic contrast-enhanced breast magnetic resonance imaging (breast DCE-MRI). Methods In this retrospective study, 232 patients with 247 histopathologically confirmed NMEs (malignant: 191; benign: 56) were enrolled from December 2017 to October 2020 as a primary cohort to develop the discriminative models. Radiomic features were extracted from one post-contrast phase (around 90 s after contrast injection) of the breast DCE-MRI images. The least absolute shrinkage and selection operator (LASSO) regression model was used to select features and construct the radiomics-based signature. Based on clinical and routine MR features, radiomics features, and combined information, three discriminative models were built using multivariable logistic regression analyses. In addition, an independent cohort of 72 patients with 72 NMEs (malignant: 50; benign: 22) was collected from November 2020 to April 2021 to validate the three discriminative models. Finally, the combined model was assessed using nomogram and decision curve analyses. Results The routine MR model, with two selected features (time-intensity curve [TIC] type and MR-reported axillary lymph node [ALN] status), showed a high sensitivity of 0.942 (95% CI, 0.906 to 0.974) and a low specificity of 0.589 (95% CI, 0.464 to 0.714). The radiomics model with six selected features was significantly correlated with malignancy (P<0.001 for both primary and validation cohorts). Finally, the individual combined model, which contained TIC types and the radiomics signature, showed good discrimination, with an acceptable sensitivity of 0.869 (95% CI, 0.816 to 0.916) and an improved specificity of 0.839 (95% CI, 0.750 to 0.929). The nomogram was applied to the validation cohort and reached good discrimination, with a sensitivity of 0.820 (95% CI, 0.700 to 0.920) and a specificity of 0.864 (95% CI, 0.682 to 1.000). The combined model was clinically helpful, as demonstrated by decision curve analysis. Conclusions Our study added radiomics signatures to a conventional clinical model and developed a radiomics nomogram including radiomics signatures and TIC types. This radiomics model could be used to differentiate benign from malignant NMEs in patients with suspicious lesions on breast MRI.


INTRODUCTION
According to the American College of Radiology (ACR) BI-RADS® Atlas, 5th edition (1), breast lesions with abnormal enhancement on dynamic contrast-enhanced breast magnetic resonance imaging (breast DCE-MRI) include foci, masses, and non-mass enhancement lesions (NMEs). In 2020, breast cancer became the most common cancer in women worldwide (2), and differentiating benign from malignant breast lesions with MRI-based diagnostics is critical for breast cancer treatment. However, distinguishing benign and malignant breast lesions on DCE-MRI is challenging, especially when NMEs are present (3).
NMEs are associated with a wide spectrum of pathologic findings (4)(5)(6), and the imaging findings of malignant and benign lesions overlap. NMEs remain a diagnostic challenge for radiologists despite frequent attempts to distinguish benign from malignant NMEs using different methodologies, including conventional morphologic comparisons (6)(7)(8) and the measurement of different parameters, such as ADC values and the initial slope of kinetic curves (9)(10)(11). Baltzer et al. reported that NMEs may be the primary cause of false positive breast MRI results, leading to unnecessary biopsies (12). The value of morphologic assessments for differentiating benign from malignant NMEs is disputed. Some studies have demonstrated that morphologic assessments are more useful than kinetic assessments in distinguishing NMEs (13)(14)(15), while other studies have reported that morphologic assessments have a relatively low specificity and sensitivity for distinguishing NMEs (16)(17)(18). In addition, morphologic assessments depend on the human eye and are subjective; thus, substantial inter- and intra-observer variability is seen with these assessments (19). A meta-analysis (20) showed heterogeneity among studies, with sensitivities from 0% to 100% and specificities from 48% to 100%. These factors underscore the complexity of the diagnostic phase and simultaneously present a therapeutic challenge. For example, idiopathic granulomatous mastitis, a benign inflammatory disease, can mimic breast cancer both clinically and radiologically (21,22).
In recent years, radiomics, a technology that transforms digital medical images into quantifiable data to improve medical decisions (23), has shown potential to expand the knowledge base of diagnostic oncology and to improve the predictive accuracy of medical imaging. Radiomics is partially based on the hypothesis that medical images contain much more information than can be visually deciphered by radiologists (24). To the best of our knowledge, few studies have reported the additional value of radiomics for differentiating benign from malignant NMEs on DCE-MRI. Additionally, to date, a model that combines a radiomics signature with conventional analysis to produce superior diagnostic performance in diagnosing malignant NMEs has yet to be reported.
In this study, we developed and validated a nomogram that combined radiomics and conventional analytic clinical factors to evaluate the additional value of radiomics in differentiating benign from malignant NMEs. We also compared the diagnostic performance of the nomogram with the radiomics score and analytic clinical factors alone.

Patients
We retrospectively reviewed 3352 consecutive patients who underwent breast MRI in our hospital between December 2017 and October 2020. In total, 232 female patients with 247 lesions were selected and comprised the primary training cohort (mean age, 44.8 ± 10.6 years). Among these patients, 14 had additional lesions in the contralateral breast and 1 patient had two lesions in different quadrants of her left breast. The inclusion criteria were as follows: (a) histologically confirmed benign or malignant breast lesions on DCE-MRI examinations; (b) no previous treatments or breast implants; (c) no pregnancy or lactation; and (d) NMEs found on DCE images. Patients were excluded if image quality was poor, hemorrhage was present after biopsy, lesions did not involve parenchyma on the DCE images, or the lesion size was <5 mm. Using these inclusion and exclusion criteria, a validation cohort of 72 consecutive female patients (mean age, 47.9 ± 11.2 years) was selected from 908 consecutive patients examined between November 2020 and April 2021 in our hospital. A flowchart of this study is presented in Figure 1. For each patient, conventional clinical data, including age and menopause status, were obtained from electronic medical records.

Magnetic Resonance Image Acquisition
MR examinations for both the validation cohort and training cohort were obtained on a 3T scanner (MAGNETOM Skyra, Siemens Healthcare, Erlangen, Germany) in our hospital. All scans were performed with a dedicated 16-channel phased-array breast coil in the prone position using the same protocol.
The contrast medium (Omniscan, GE Healthcare, Milwaukee, WI) was intravenously injected with a power injector at the end of the third acquisition phase. The dose was 0.1 mmol/kg body weight, with an injection rate of 2.5 mL/s, which was followed by a 20 mL saline flush.

Image Interpretation
Two radiologists (Y.L. and T.A., with 8 and 10 years of experience in breast MRI, respectively), who were blinded to the pathologic results, reviewed all breast MR images from the 304 patients in the training and validation cohorts and assessed breast density, the degree of background parenchymal enhancement, and MR-reported lymph node status by consensus. The maximal diameter, internal enhancement, and distribution were recorded in the very early phase (about 90 seconds) after contrast media injection according to the BI-RADS 5th edition (1). Of these, the maximal diameter was assessed on multiplanar reformatted images using a Siemens clinical workstation. The time-intensity curve (TIC) for each case was drawn from the DCE-MRI data with a region of interest (ROI) of approximately 0.2-0.4 cm² placed on each slice at the brightest part of the lesion on images obtained in the early phase after contrast injection. When different TIC types were present within a lesion, we recorded the highest-level type. On all slices of the apparent diffusion coefficient (ADC) maps, multiple ROIs were carefully placed on the darkest areas, which were confirmed by agreement between the two radiologists. The lowest ROI ADC value was regarded as the minimum ADC value for each lesion. If a lesion could not be evaluated on DWI or the ADC maps, we copied the ROIs from the DCE-MRI images and pasted them onto the ADC maps. We defined an axillary lymph node (ALN) with a maximal short diameter of ≥10 mm, an absent fatty hilum, or a long-axis/short-axis ratio of <2 as MR-reported ALN positive. Vasodilation of the surrounding feeding artery was defined as positive on maximum intensity projection images (MIPs) and was included based on our experience. The above-mentioned factors were all initial clinical candidate predictors for NME differentiation.
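The reading rules above can be summarized in a short sketch. It is illustrative only: the function and argument names are hypothetical, and the ordering of TIC types (persistent < plateau < washout) is an assumption based on the usual BI-RADS kinetic categories, not a description of the authors' software.

```python
# Illustrative sketch of the reading rules; all names are hypothetical.

# Assumed ordering of TIC types, from lowest to highest level of suspicion.
TIC_LEVEL = {"persistent": 1, "plateau": 2, "washout": 3}


def aln_positive(short_axis_mm, long_axis_mm, has_fatty_hilum):
    """Return True if any of the three MR criteria for a positive ALN is met."""
    return (
        short_axis_mm >= 10                      # maximal short diameter >= 10 mm
        or not has_fatty_hilum                   # absent fatty hilum
        or (long_axis_mm / short_axis_mm) < 2    # long axis / short axis < 2
    )


def highest_tic_type(tic_types):
    """Pick the highest-level TIC type among the ROIs of one lesion."""
    return max(tic_types, key=TIC_LEVEL.__getitem__)


def minimum_adc(roi_adc_values):
    """The lowest ROI value is taken as the lesion's minimum ADC."""
    return min(roi_adc_values)
```

For example, a node measuring 6 mm by 14 mm with a preserved fatty hilum meets none of the three criteria and would be recorded as ALN negative under this sketch.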

Features Extraction and Radiomics Signature
FIGURE 1 | Flowchart of the study population enrollment. NME, non-mass enhancement lesion.
The radiomics signature was added to the clinical analyses, and a diagnostic model for differentiation was developed using the training cohort. The radiomics analysis was performed on the very early phase (90 seconds) images after contrast media injection, as was the morphologic evaluation. Prior to the radiomics analysis, the images of each case were transferred into the open-source software ITK-SNAP (Version 3.8.0) to perform semi-automatic ROI segmentation. ROIs were drawn with care to include the whole lesion, avoiding normal glandular tissue, fat, vessels, and necrosis. The open-source Pyradiomics software (https://pyradiomics.readthedocs.io/en/latest/index.html) was used to automatically extract tissue intensities and textural, morphologic, and wavelet features. We used the least absolute shrinkage and selection operator (LASSO) method, an appropriate tool for high-dimensional data regression (26), to select the most effective features from the training cohort data set. For each lesion, a radiomics score (Rad-score) was calculated as a linear combination of the selected features weighted by their respective coefficients.
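The Rad-score calculation described above is a weighted sum, which the following minimal sketch makes concrete. The feature names, coefficient values, and intercept are hypothetical placeholders chosen for illustration; they are not the six features or weights selected in this study.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Weighted sum of the LASSO-selected features plus an intercept.

    `features` and `coefficients` are dicts keyed by feature name.
    Only features that kept a non-zero LASSO coefficient appear in
    `coefficients`; all other extracted features are ignored.
    """
    return intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )


# Hypothetical values, for illustration only.
example_coefficients = {"shape_feature": -0.5, "texture_feature": 0.25}
example_features = {
    "shape_feature": 2.0,
    "texture_feature": 4.0,
    "unselected_feature": 9.9,  # dropped by LASSO, so it has no weight
}
example_score = rad_score(example_features, example_coefficients, intercept=1.0)
```

In this toy case the score is 1.0 + (-0.5 × 2.0) + (0.25 × 4.0) = 1.0; the unselected feature does not contribute.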

Nomogram in the Training Cohort and Validation
The initial clinical multivariate logistic regression analysis included age, menopause status, maximal diameter, fibroglandular tissue, background parenchymal enhancement, morphologic assessment, ALN status, and TIC assessment on DCE-MRI, as well as the minimum ADC value on DWI. We then added the radiomics features to the clinical multivariable logistic regression analysis and built the radiomics nomogram to provide radiologists and clinicians with an effective tool for differentiating benign and malignant NMEs. The calibration curve and the Hosmer-Lemeshow test (27) were used to evaluate the calibration of the radiomics nomogram. Nomogram performance was evaluated using area under the curve (AUC) analysis.
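As a rough illustration of the calibration check, the sketch below computes the Hosmer-Lemeshow statistic from predicted probabilities and observed outcomes. The grouping into equal-sized bins (ten by default) is the conventional choice and is assumed here; the text does not state the grouping used. The p-value, which would come from a chi-square distribution with (groups - 2) degrees of freedom, is omitted to keep the sketch free of external libraries.

```python
def hosmer_lemeshow_statistic(probs, labels, groups=10):
    """Hosmer-Lemeshow chi-square statistic (labels: 1 = malignant, 0 = benign)."""
    # Sort cases by predicted probability (stable sort keeps input order on ties).
    pairs = sorted(zip(probs, labels), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed events in the bin
        expected = sum(p for p, _ in chunk)   # expected events in the bin
        if 0 < expected < size:
            # Event term plus non-event term for this bin.
            statistic += (observed - expected) ** 2 / expected
            statistic += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return statistic
```

A small statistic (and a correspondingly large p-value) suggests that predicted and observed event counts agree across the risk groups.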

Consistency Validation
In the training cohort data set, consistency was validated by comparing the first measurement of reader 1 (Y.L.) with a second measurement made one month later to assess intra-observer agreement. The second measurement of reader 1 and the feature extraction of reader 2 (Z.L.Y) in 60 patients were compared to assess inter-observer agreement. The intraclass correlation coefficient (ICC) was used to assess feature extraction agreement; an ICC greater than 0.80 was considered excellent.
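The agreement analysis can be sketched as follows. The text does not state which ICC form was used, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); the function name and the input layout (one row per lesion, one column per reading) are illustrative.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per lesion and one column per reading."""
    n = len(ratings)        # number of lesions
    k = len(ratings[0])     # number of readings per lesion
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

With identical readings the coefficient equals 1; disagreement between readings, whether random or systematic, pulls it below 1.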

Data Validation
We applied the same method as that of the training cohort to calculate the Rad-score in the validation cohort. We applied the logistic regression equation produced in the training cohort to all lesions of the validation cohort. We tested the performance of the nomogram using calibration and AUC analyses.
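The key point of this validation step is that the coefficients are frozen after training and are not refitted on the validation cohort. A minimal sketch is shown below; the coefficient values are hypothetical, and coding the TIC type as a single ordinal level is an assumption made for brevity (the actual model may code it differently).

```python
import math


def predict_malignancy(tic_level, rad_score, coef):
    """Apply a fixed logistic regression equation to one lesion.

    `coef` holds the intercept and weights estimated on the training
    cohort; nothing is re-estimated when validation lesions are scored.
    """
    linear = coef["intercept"] + coef["tic"] * tic_level + coef["rad"] * rad_score
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical training-cohort coefficients, for illustration only.
frozen_coef = {"intercept": -1.0, "tic": 1.0, "rad": 2.0}

# Every validation lesion is scored with the same frozen coefficients.
validation_lesions = [(1, 0.0), (3, 0.5), (1, -1.0)]
validation_probs = [predict_malignancy(t, r, frozen_coef) for t, r in validation_lesions]
```

The resulting probabilities can then be passed to the calibration and AUC analyses without any further model fitting.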

Statistical Analysis
R (RStudio, Version 3.6.3) software was used for algorithms and statistical analyses. For continuous variables, Student's t-tests were performed. For categorical variables, the chi-square test or Wilcoxon rank-sum test was applied. We used univariate logistic regression analysis to determine potential factors affecting differentiation. Then, logistic regression models containing the above-mentioned potential factors were used for multivariate analysis. A nomogram was built on the logistic regression model as a graphical presentation. The area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity were used to indicate the discriminative ability of each factor and of the nomogram. P-values <0.05 (two-tailed) were considered statistically significant.
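For readers less familiar with these metrics, the sketch below shows how they can be computed from model outputs. It is written in Python rather than R purely for illustration, and the rank-based (Mann-Whitney) formulation of the AUC is one standard way to compute it, not necessarily the routine used in this study.

```python
def auc_rank(scores, labels):
    """AUC as the probability that a malignant case outscores a benign one.

    labels: 1 = malignant, 0 = benign. Ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of benign and malignant lesions.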

Training Cohort
In the training cohort, of the 247 lesions, 191 malignant and 56 benign lesions were confirmed pathologically by biopsy, lumpectomy, or mastectomy. For the patient who had two lesions in the left breast, the lesion in the upper outer quadrant was confirmed as adenosis, while the lesion in the medial area was ductal carcinoma in situ. Specific pathologic results are shown in Table 1. Internal enhancement patterns, background parenchymal enhancement (BPE), and MRI-reported fibroglandular tissue (FGT) did not differ between malignant and benign lesions (P=0.397, 0.760, and 0.139, respectively). Patients with malignant lesions were older on average than those with benign lesions (P=0.035). The maximal diameter of the malignant lesions was significantly larger than that of the benign lesions (P<0.001). A higher proportion of postmenopausal women was found in the malignant group than in the benign group (P=0.034). The distribution patterns differed significantly between malignant and benign cases (P<0.001). Of these, the proportion of linear distributions was higher in the benign group than in the malignant group (P=0.046). The minimum ADC value of the malignant lesions was significantly lower than that of the benign lesions (P<0.001). The malignant group had a significantly higher percentage of higher-level TIC types and of MR-reported ALN-positive and MIP-positive cases (all P<0.001). Specific results are shown in Table 2. Age, menopause status, maximal diameter, distribution, TIC pattern, minimum ADC value, MRI-reported ALN status, and MIP status were potential factors influencing differentiation according to the univariate logistic regression analysis.

Training Cohort
Of all features extracted from the lesions in the primary cohort, six features were selected as potentially effective factors for differentiation and were applied in the Rad-score calculation (Figure 2). The final computation of the model coefficients led to the following differentiation model for NMEs: Of the six features, the largest weight was given to the shape feature (Surface Area to Volume Ratio). A significant difference in the Rad-score between benign and malignant NMEs was found in the training cohort (P<0.001). The AUC, sensitivity, and specificity of the radiomics multivariable logistic regression alone for NME differentiation were 0.864 (95%CI: 0.805-0.923), 0.827 (95%CI: 0.770-0.880), and 0.804 (95%CI: 0.696-0.893), respectively (Figure 3, Table 3). After the radiomics analysis was added to the clinical multivariate regression model, MR-reported ALN status was no longer an independent factor of malignancy. We built a nomogram for the training cohort based on the TIC types and the radiomics signature (Figure 4); the specificity improved from 0.589 (95%CI: 0.464-0.714) in the clinical model to 0.839 (95%CI: 0.750-0.862) in the combined model (Table 3). The final regression equation and correlation coefficients were calculated, and the parameters are reported in detail in Table 3. Using ROC curve analysis, the optimal cutoff value of the final regression equation was 0.772. Lesions with values below the cutoff value are judged as benign, while those with values exceeding the cutoff value are judged as malignant.
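The decision rule in the last two sentences can be written as a short sketch. The cutoff (0.772) and the two example values (0.669 and 0.989, taken from the case figures in this article) come from the text; treating a value exactly equal to the cutoff as benign is an assumption, since the text does not cover that case. The confusion-matrix counts are back-calculated so that they reproduce the reported specificities for the 56 benign training lesions, and they should be read as an inference rather than reported data.

```python
CUTOFF = 0.772  # optimal cutoff of the final regression equation (from the text)


def classify(value, cutoff=CUTOFF):
    """Judge a lesion from the value of the combined-model equation.

    Assumption: a value exactly equal to the cutoff is treated as benign.
    """
    return "malignant" if value > cutoff else "benign"


def specificity(tn, fp):
    """Fraction of benign lesions correctly judged as benign."""
    return tn / (tn + fp)


# Back-calculated counts (an inference, not reported data): 56 benign lesions.
clinical_specificity = round(specificity(tn=33, fp=23), 3)   # clinical model
combined_specificity = round(specificity(tn=47, fp=9), 3)    # combined model
```

Under these assumed counts, moving from the clinical to the combined model turns 14 of the 23 false positives in the training cohort into true negatives.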

Validation Cohort
In the validation cohort, there was also a significant difference in the Rad-score between benign and malignant NMEs (P<0.001). After adding the Rad-score analysis into the clinical model, the specificity increased from 0.545 (95%CI: 0.364-0.727) to 0.864 (95%CI: 0.682-1.000) ( Table 3).
For the differentiation between benign and malignant NMEs, the calibration curve of the combined model demonstrated excellent agreement between the predictions and the actual pathologic results in both the training and validation cohorts (Figure 4). The decision curve analysis for the combined model was performed according to a previous study (28) and is shown in Figure 5. The decision curve demonstrated that if the threshold probability was >19%, the nomogram added more benefit to the discrimination of benign and malignant NMEs than the clinical model.
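To make the decision curve concrete, the sketch below computes the net benefit at a given threshold probability using the standard definition, net benefit = TP/n - (FP/n) × pt/(1 - pt). The function names and toy inputs are illustrative and are not the study data.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of acting on lesions whose predicted risk >= threshold.

    labels: 1 = malignant, 0 = benign.
    """
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: act on every lesion regardless of the model."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

A decision curve plots these values over a range of thresholds for each model; one model is preferred over another at thresholds where its net benefit is higher.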

Consistency Validation
Based on the comparison of radiomics feature measurements made one month apart by reader 1, the intra-observer agreement was excellent (ICC=0.936, 95%CI: 0.929 to 0.942). Based on the second measurements of the 60 patients by reader 1 and the feature extraction from the same data set by reader 2, the inter-observer agreement was also excellent (ICC=0.887, 95%CI: 0.876 to 0.898). Figures 6 and 7 show two cases in detail.

Specificity Changes
Considering the low specificity of the conventional clinical analysis, we analyzed the false positive (FP) lesions (n=33) and the true negative (TN) lesions (n=45) of the conventional clinical analysis in the whole cohort (78 benign NMEs). Compared with the TN lesions, the FP lesions had a significantly larger proportion of moderate or marked BPE (P=0.004), plateau or washout type of TIC (P<0.001), and positive MIP sign (P<0.001). Of the 33 FP NMEs, 30 (90.9%) lesions were confirmed as adenosis, and the other 3 lesions were chronic inflammation. In addition, 21 of 33 (63.6%) FP lesions were categorized as malignant by the final combined model.

DISCUSSION
In this study, we developed a clinical model that consisted of clinical characteristics, morphologic lesion assessments, the ALN status, TIC assessments on DCE-MRI, and minimum ADC values on DWI to differentiate benign and malignant NMEs. This model showed high sensitivity and low specificity in both the training (0.942, 0.589) and validation (0.940, 0.545) cohorts.
To investigate the added value of the radiomics signature for NME differentiation, we added radiomics features derived from early phase DCE-MRI to the clinical model and built the combined model. The combined model achieved a higher specificity in the training (0.839) and validation (0.864) cohorts.
For the morphologic analysis, we used early phase images after contrast agent injection for NME evaluations because NMEs can be affected and obscured by more pronounced BPE on the delayed phase images (29). Remarkably, although morphologic assessments, including distribution and internal enhancement patterns, were reported to be effective in previous studies (13)(14)(15), our study demonstrated that these morphologic features were not independently associated with NME differentiation, which is consistent with the results of a study by Naoko Mori et al. (10). This lack of an independent association with morphologic features could be explained by decision-making pitfalls caused by the subjective judgment of visual examinations and by the variance of morphologic proportions contained in different study cohorts. In China, this can happen because the national breast cancer screening program is largely lacking compared with other countries; therefore, the lesions in our cohort had larger sizes and a higher proportion of regional distributions and heterogeneous enhancement patterns. Thus, considering the potential role and subjective nature of morphologic assessments, we drew ROIs covering the whole lesion in each image plane and investigated the performance of the radiomics signature alone, which achieved a high sensitivity (82.7%) and specificity (80.4%). Of the six selected radiomics features, the surface area to volume ratio was given a maximum negative correlation (-0.594); lower ratios indicated a greater likelihood of NME malignancy, which is hard to identify with the human eye. Overall, these results indicated an important role for morphologic assessments in differentiating benign and malignant NMEs. However, they also indicated that the histological patterns enrolled in the study may affect the sensitivity and specificity of the model. The number of lesions in this study is relatively small, and further research should be undertaken in a large cohort to investigate the impact of different histological patterns on the differentiation performance of the model.
FIGURE 5 | Decision curve analysis of the combined model. The Y-axis demonstrates the net benefit to patients. As indicated in the curve, the net benefit of using the combined model to differentiate benign and malignant NME lesions is greater than when the clinical model is used at a threshold probability of >0.19.
A previous study observed that minimum ADC values potentially suggested the presence of an invasive component in ductal carcinoma in situ (DCIS) (30). In our study, we applied the same approach for malignant component detection, assuming that the area with the minimum ADC value corresponded to the region with the highest tumor cell density, reflecting malignancy. Malignant lesions did have significantly lower minimum ADC values than benign lesions. However, the multivariate analysis indicated that the minimum ADC value was not an independent factor for the discrimination of benign and malignant lesions, suggesting a limited role for DWI. These results are consistent with those of some recent studies (9,31). Naoko Mori et al. reported that kinetic assessments might be more important than morphologic assessments in differentiating benign from malignant NMEs on ultrafast DCE-MRI (10). In this study, we employed a similar ultrafast DCE-MRI approach and achieved similar results. Malignant lesions tend to have more neovascularization (32). Thus, it is reasonable to place the ROI on the brightest areas of the images during the very early phase after contrast injection to obtain TIC curves. Selecting the higher-level TIC type could improve the detection of malignant components within the lesion enhancement. The TIC type alone gave a higher sensitivity (94.2%) and lower specificity (58.9%) for NME differentiation.
Our results showed that MR-reported ALN status alone offered a higher specificity (92.9%) and lower sensitivity (36.6%) than conventional DCE-MRI assessments, which could be explained by the fact that little axillary lymphadenopathy was detected on the MR images of most patients with malignant or benign lesions in this study. However, this situation was not consistent with what is seen in clinical practice.
(B) On the ADC map, multiple ROIs are placed to cover the whole area of the lesion. The ADC map shows the minimum ADC value of the ROIs is 1056 × 10⁻⁶ mm²/s. (C) After drawing the TIC curves for all ROIs at the brightest part on each slice, the high-level TIC curve type of this lesion is persistent type. (D) Using the ITK-SNAP software, the whole lesion was segmented. Finally, the logistic regression equation of the combined model for this lesion was calculated as 0.669, which was lower than the cut-off value of 0.772, and the lesion was adjudicated as benign, consistent with the pathological results.
The analysis of low specificity showed that moderate or marked BPE, plateau or washout TIC, and MIP positive status may be prone to yield false positive results for NMEs in the conventional clinical analysis. It further indicated the difficulty and complexity of differentiation in clinical practice. Finally, the combined model of clinical features with added radiomics signature features improved the specificity in both the training (0.839) and validation cohorts (0.864). Given the comparable proportion of benign and malignant lesions and the good agreement between observers, the improved performance indicated that the radiomics signature was robust for the differentiation of benign and malignant NME lesions. The nomogram was primarily used to improve personalized diagnostics. The results of our study might suggest that additional radiomics signatures could help improve the specificity of differentiating benign and malignant NME lesions and avoid unnecessary biopsies. However, further studies with larger sample sizes are needed.
There were several limitations in our study. A primary limitation was the retrospective nature of the analysis, making potential selection bias difficult to avoid. Second, most of the patients in our hospital underwent breast MRI scans for two possible indications: preoperative staging for known breast cancer and further scanning for suspicious lesions in high-risk patients. Thus, the proportion of malignant lesions in our cohort was high, and there was a difference in the malignant/benign ratio between the training and validation cohorts. Third, the morphologic assessments and parameter measurements were made by two radiologists in consensus, and further research is needed to validate their inter- and intra-observer repeatability. Fourth, the maximal diameters and morphologic assessments were recorded in the early phase to avoid being affected by BPE; thus, some lesions with progressive enhancement might not have been evaluated accurately. Optimal timing needs to be determined in future studies.
The ADC map shows the minimum ADC value of the ROIs is 745 × 10⁻⁶ mm²/s. (C) After drawing the TIC curves for all ROIs at the brightest part on each slice, the high-level TIC curve type of this lesion is washout type. (D) Using the ITK-SNAP software, the whole lesion was segmented. Finally, the logistic regression equation of the combined model for this lesion was calculated as 0.989, which was higher than the cut-off value of 0.772, and the lesion was adjudicated as malignant, consistent with the pathological results.
In conclusion, the clinical multivariate regression analysis indicated that TIC patterns and ALN status were independent factors for the differentiation of benign and malignant NME lesions. Our results demonstrated that a radiomics nomogram combining clinical factors with radiomics signatures derived from early phase DCE-MRI could achieve high sensitivity and specificity for NME differentiation. Additional radiomics signatures could be used to improve specificity and avoid unnecessary biopsies. We believe that our model cannot replace, but could improve, the conventional diagnostic workflow. However, a more extensive analysis with larger samples is needed.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Tongji Hospital. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
YL, ZY, TA, and LX participated in the conception and design of the study. YL, YQ, and CT collected the clinical and imaging data. YL and WL performed the statistical analyses. YL, ZY, XY, YG, TA, and LX coordinated, drafted, revised and finalized the manuscript. All authors contributed to the article and approved the submitted version.