The Impact of Preoperative Radiomics Signature on the Survival of Breast Cancer Patients With Residual Tumors After NAC

Background: Residual cancer cells remaining after chemotherapy may have more aggressive behavior that promotes recurrence or metastasis, and which patients would benefit from subsequent additional treatment is controversial. The purpose of our study was to evaluate the prognostic value of the preoperative radiomics features of computed tomography (CT) imaging in breast cancer (BC) patients with residual tumors after neoadjuvant chemotherapy (NAC).
Methods: Post-NAC CT images were reviewed from 114 patients who had received breast surgery and had residual breast tumors. The association of the 110 radiomics features derived from CT images with 5-year disease-free survival (DFS) was assessed by log-rank test in the training cohort, resulting in 13 prognostic radiomics features.
Results: We constructed a radiomics signature consisting of four selected features by using least absolute shrinkage and selection operator (LASSO) Cox regression analysis, which performed well in discrimination, with an area under the curve (AUC) of 0.78 (95% CI, 0.67–0.89) and 0.73 (95% CI, 0.59–0.87) in the training and validation cohorts, respectively. The radiomics nomogram, incorporating the radiomics signature with the conventional clinical variables, also performed well in the two cohorts (training cohort: AUC, 0.84; validation cohort: AUC, 0.82). Moreover, we found that the high-risk patients determined by our radiomics nomogram could benefit from postoperative adjuvant chemotherapy, while the low-risk and total patient groups could not.
Conclusions: Our novel radiomics nomogram is a promising and favorable prognostic biomarker for preoperatively predicting survival outcomes and may aid in clinical decision-making in BC patients with residual tumors after NAC.


INTRODUCTION
Neoadjuvant chemotherapy (NAC) has become a popular treatment strategy in the management of breast cancer (BC) in recent years (1,2). Only a subset of patients achieve a pathological complete response (pCR) following NAC, defined as the absence of invasive cancer in the breast and the axillary lymph nodes (ALNs), with rates varying according to the subtype of BC (3,4). The presence of residual tumor following NAC indicates an increased recurrence risk; however, although non-pCR is clearly associated with high rates of recurrence and metastasis, the role of additional postoperative adjuvant chemotherapy for non-pCR patients remains unclear (4-6). Therefore, a more accurate understanding of the molecular and genomic characteristics of tumors will undoubtedly facilitate the development of clinical trials for the treatment of residual disease (3,7,8). Moreover, noninvasive approaches for identifying high-risk, non-pCR patients who need additional treatment are urgently needed.
There is evidence that radiogenomics can define the association between imaging features and genomic phenotypes, which has recently attracted great interest (9,10). To facilitate the use of image features to directly estimate patients' outcomes, "radiomics" has made rapid progress (11). It is now possible to extract quantitative risk variables from traditional computed tomography (CT) images to achieve non-invasive profiling of tumor heterogeneity (11,12). To date, radiomics has made great contributions in the field of cancer and has been widely applied to tumor detection, subgroup identification, and treatment response evaluation, among other tasks. A signature based on multiple radiomics features is often more valuable than a single biomarker, and recent studies have shown that radiomics features from magnetic resonance imaging (MRI) performed well in the prognostic prediction of BC (13-15). However, to our knowledge, CT-based radiomics signatures have not yet been thoroughly assessed, especially in non-pCR BC patients after NAC.
Therefore, the purpose of this study was to develop and validate a CT-based radiomics signature and nomogram to predict 5-year disease-free survival (DFS) and the response to additional chemotherapy in non-pCR BC patients, and thereby to guide the use of postoperative adjuvant chemotherapy more precisely.

Patients
This retrospective study was approved by the ethics committee of Sun Yat-sen University Cancer Center, and the informed consent requirement was waived owing to the retrospective nature of the study. We identified 462 consecutive patients with invasive breast cancer (mean age, 49 years) who received neoadjuvant chemotherapy (NAC) before surgery at our center between January 2010 and December 2016. The inclusion criteria were as follows: (a) preoperative dynamic contrast-enhanced chest CT performed <30 days before surgical resection at our institution; (b) initial unilateral breast malignancy with histologically confirmed invasive breast cancer; (c) residual breast tumor after NAC; (d) a lesion presenting as a mass on CT; (e) no previous history of other malignant neoplasms; (f) available clinicopathological characteristics and follow-up data. The exclusion criteria were as follows: (a) tumor lesions that could not be recognized on CT; (b) distant metastatic disease after six or eight cycles of preoperative chemotherapy; (c) CT images of poor quality or with large artifacts that precluded analysis. Finally, 114 female patients (mean age, 48 years; range, 30-69 years) were included in this study (Figure S1).

Clinical Factors and Follow-Up
The potential DFS-related clinical risk factors of the enrolled patients were collected, including age, menopausal status, tumor grade, vascular invasion, estrogen receptor (ER) status, progesterone receptor (PR) status, human epidermal growth factor receptor 2 (HER2) status, Ki-67 expression level, and adjuvant treatment after surgery, among others. Invasive tumors with HER2 scores of 3+ by immunohistochemistry (IHC) were defined as positive, while fluorescence in situ hybridization (FISH) was conducted to determine HER2 amplification for tumors with HER2 scores of 2+ by IHC (Table S1). The end point of our study was DFS, defined as the time from the date of surgery to the date of relapse or death (event), or to the date on which the patient was last known to be free of relapse and alive (censored). All enrolled patients were followed up for at least 3 months after surgery.
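The DFS definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the argument names, and the conversion of days to months (dividing by an average month length of 30.4375 days) are hypothetical and are not taken from the study's analysis, which was carried out in R.

```python
from datetime import date
from typing import Optional, Tuple

# Assumption: average month length used to convert days to months.
DAYS_PER_MONTH = 30.4375


def dfs_time_and_event(
    surgery: date,
    event_date: Optional[date],
    last_follow_up: date,
) -> Tuple[float, int]:
    """Return (DFS in months, event indicator).

    event_date is the date of relapse or death, or None if the patient
    was still free of relapse and alive at last_follow_up (censored).
    """
    end = event_date if event_date is not None else last_follow_up
    if end < surgery:
        raise ValueError("end date precedes the date of surgery")
    months = (end - surgery).days / DAYS_PER_MONTH
    event = 1 if event_date is not None else 0
    return months, event


# Hypothetical patients: one with a relapse, one censored at last follow-up.
relapsed = dfs_time_and_event(date(2012, 1, 1), date(2013, 1, 1), date(2015, 1, 1))
censored = dfs_time_and_event(date(2012, 1, 1), None, date(2012, 7, 1))
```

The pair of values returned for each patient is the usual input to Kaplan-Meier estimation and Cox regression.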

CT Image Acquisition and Preparation for Radiomics Analysis
The radiomics workflow is shown in Figure 1. Before receiving breast-conserving surgery or mastectomy after NAC, all patients underwent contrast-enhanced chest CT with a 64-slice spiral CT scanner (64-slice CT750 HD scan, GE Medical Systems). The acquisition parameters of the CT scan were as follows: 120 kV, 200 effective mAs, a rotation time of 0.4 or 0.5 s, detector collimation of 64 × 0.625 mm, a matrix of 512 × 512, and a thin-slice reconstruction thickness of 1.25 mm. After conventional nonenhanced CT scanning, 85-100 ml of contrast agent (iopamidol, 300 mg I/ml, Bracco) was intravenously administered at a rate of 3.5 ml/s, followed by a 30 ml saline flush, and dynamic contrast-enhanced CT scans were performed. Arterial- and venous-phase images were obtained at 30 and 60 s, respectively. Tumor regions of interest (ROIs) were semi-automatically segmented three-dimensionally on venous-phase images using 3D Slicer software. Radiomics features were extracted from the enhanced CT images using pyradiomics. Intraclass correlation coefficients (ICCs) were used to evaluate the intra- and interobserver repeatability of radiomics feature extraction. Two experienced radiologists (readers 1 and 2, with 5 and 8 years of clinical experience in reading chest CT, respectively) assessed the interobserver repeatability of the ROI segmentation-based radiomics features: reader 1 and reader 2 independently repeated the feature extraction on 30 randomly chosen patients. The radiologists were blinded to the clinical and pathological data of the patients but were told that the patients had breast carcinoma.

Development and Validation of the CT-Based Radiomics Signature and Nomogram
Radiomics features were extracted from the tumor regions of interest (ROIs) drawn on CT images by the radiologists and were computed using the pyradiomics package for Python. In total, 110 features in seven distinct categories were extracted: First Order Statistics (19 features), Shape-based (16 features), Gray Level Co-occurrence Matrix (GLCM; 24 features), Gray Level Run Length Matrix (16 features), Gray Level Size Zone Matrix (GLSZM; 16 features), Neighboring Gray Tone Difference Matrix (5 features), and Gray Level Dependence Matrix (14 features). We randomly divided patients into two groups: a training cohort (n = 76) and a validation cohort (n = 38). The characteristics of the patients in the training and validation cohorts were compared by analysis of variance for continuous variables and by chi-squared test or Fisher's exact test for categorical variables. All features were standardized using Z-scores, and univariate Cox regression analysis was first performed in the training cohort to screen for DFS-associated CT image-based features. According to the Harrell guideline, the number of events in a multivariate analysis should be at least 10-fold larger than the number of covariates included. To address this constraint in high-dimensional data, we conducted least absolute shrinkage and selection operator (LASSO) Cox regression analysis to choose the most useful prognostic radiomics features in the training cohort. The radiomics signature was constructed from the selected imaging features, and the risk score of each patient was calculated as a linear combination of the selected features weighted by their respective coefficients. The signature was then validated in the validation cohort. Patients were classified into high- or low-risk groups according to the radiomics signature, with the threshold determined by receiver operating characteristic (ROC) curve analysis.
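The Z-score standardization step can be illustrated with a short Python sketch. One assumption is made that the text does not state: the mean and standard deviation are estimated on the training cohort only and then reused for the validation cohort, a common way to avoid information leaking between cohorts. The function names and the numbers are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Tuple


def fit_zscore(train_values: List[float]) -> Tuple[float, float]:
    """Estimate the mean and sample standard deviation of one feature."""
    mu = mean(train_values)
    sd = stdev(train_values)
    if sd == 0:
        raise ValueError("feature is constant; Z-score is undefined")
    return mu, sd


def apply_zscore(values: List[float], mu: float, sd: float) -> List[float]:
    """Standardize values with previously estimated parameters."""
    return [(v - mu) / sd for v in values]


# Hypothetical values of a single radiomics feature.
train_feature = [1.0, 2.0, 3.0]
validation_feature = [2.0, 4.0]

mu, sd = fit_zscore(train_feature)
train_z = apply_zscore(train_feature, mu, sd)
validation_z = apply_zscore(validation_feature, mu, sd)
```

Standardizing in this way puts features with very different numeric ranges on a common scale before penalized regression, so that the LASSO penalty treats them comparably.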
Kaplan-Meier curves and log-rank tests were used to compare survival between the high- and low-risk groups. In the training cohort, a univariate Cox proportional hazards model was used to assess the effects of clinicopathological variables (age, menopausal status, initial tumor status, initial node status, initial ER status, initial PR status, initial HER2 status, initial Ki-67 expression level, tumor size at surgery, grade at surgery, vascular invasion at surgery, ALN status at surgery, ER status at surgery, PR status at surgery, HER2 status at surgery, Ki-67 expression at surgery, adjuvant chemotherapy, and adjuvant endocrine therapy) and the radiomics signature on DFS.
Variables that were significant in the univariate Cox proportional hazards model (P < 0.05) were included in the multivariate Cox proportional hazards model. To demonstrate the value of the radiomics signature, a radiomics nomogram was developed in the training cohort and then evaluated in the validation cohort. The radiomics nomogram combined the radiomics signature and clinical risk factors based on the multivariate Cox analysis with stepwise selection. The calibration of the radiomics nomogram was assessed with calibration curves. The area under the curve (AUC) was computed from the predicted probabilities and the actual outcomes to evaluate the predictive ability and discriminability of the model (1.0 indicates perfect discrimination; 0.5 indicates discrimination no better than random chance). Decision curve analysis (DCA) was also used to evaluate the clinical usefulness of the radiomics nomogram by quantifying the net benefit at different threshold probabilities.
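For readers unfamiliar with these two metrics, the following Python sketch shows how an AUC and a DCA net benefit can be computed for a binary outcome. It is a simplified illustration, not the analysis used in this study: it ignores censoring (the study's analysis was based on survival data and R packages), and all function names and example numbers are hypothetical. The net benefit at threshold probability pt is taken as TP/n − (FP/n) × pt/(1 − pt).

```python
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a random event case scores higher
    than a random non-event case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(probs: List[float], labels: List[int], threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels: List[int], threshold: float) -> float:
    """Net benefit of the 'treat all' strategy at the same threshold."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted risks and observed binary outcomes.
example_probs = [0.9, 0.8, 0.3, 0.2]
example_labels = [1, 0, 1, 0]

example_auc = auc(example_probs, example_labels)
nb_model = net_benefit(example_probs, example_labels, 0.25)
nb_all = net_benefit_treat_all(example_labels, 0.25)
```

The "treat none" strategy has a net benefit of zero at every threshold, so a decision curve compares the model against both that baseline and the "treat all" curve.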

Statistical Analysis
All statistical analyses were conducted with R statistical software (version 3.5.3; https://www.r-project.org/). The R packages used in our study were as follows: "glmnet," "rms," "Hmisc," "pROC," "survival," and "dca.R." The LASSO Cox regression analysis was applied with tuning of the L1-penalty parameter (λ) by 10-fold cross-validation based on the minimum criterion. A two-tailed P value <0.05 was considered significant.

Clinical Characteristics
A total of 114 patients who underwent breast-conserving surgery or mastectomy after NAC were included in our study. There were 36 events (10 local-regional recurrences, 4 contralateral breast events, and 22 distant metastases) during a mean follow-up period of 44.4 months (range, 5-93 months). The mean time to disease event was 21.2 months (range, 5-78 months). Disease events occurred in three patients during the first 6 months of follow-up, which might have resulted from residual disease. The patients' clinical characteristics are summarized in Table S1. The baseline clinical characteristics of patients were similar between the two cohorts (P > 0.05). Among the patients enrolled, 37.7% were treated with postoperative chemotherapy, while all patients received adjuvant radiotherapy. The interobserver agreement of the radiomics feature extraction between the two readers was excellent [the mean ICC value was 0.982 (range, 0.781-0.999)]. Therefore, all the results are based on the first radiologist's measurements.

Development of Radiomics Signature-Based Model
A total of 110 radiomics features were extracted from the CT images of enrolled patients. From these candidate features, we selected 13 potential DFS-associated predictors in the training cohort using univariate Cox proportional hazards regression analysis. In addition, we conducted a collinearity analysis and found collinearity among some predictors, which may affect the accuracy of traditional Cox regression analysis (Figure S2). To minimize the collinearity between variables, we applied a Cox regression model combined with the LASSO algorithm to eliminate a further nine features, yielding a final four-feature panel (Figure S3). We then calculated the risk score of the radiomics signature for every BC patient based on the values of the four remaining features weighted by their regression coefficients: risk score = original_firstorder_Kurtosis × (−5.570173e-03) + original_glcm_Correlation × (1.004042e+00) + original_glcm_MaximumProbability × (−9.648413e-01) + original_glszm_LargeAreaEmphasis × (1.643472e-08). The radiomics signature showed an AUC of 0.78 (95% CI, 0.67-0.89) in the training cohort and 0.73 (95% CI, 0.59-0.87) in the validation cohort (Figures 2A, B). The optimum cutoff value generated by the ROC curve analysis was 0.185. Patients were stratified into a low-risk group (radiomics signature <0.185) and a high-risk group (radiomics signature ≥0.185) based on this cutoff value. Furthermore, Kaplan-Meier curves showed a statistically significant difference between the two groups in both the training (P < 0.001) and validation cohorts (P = 0.003) (Figures 3A, B).
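As a worked illustration of the formula and cutoff above, the following Python sketch computes the risk score and assigns a risk group. The coefficients and the cutoff of 0.185 are those reported above; the function names and the example feature values are hypothetical, and the text does not state whether the formula expects raw or standardized feature values, so the inputs here are simply treated as whatever scale the coefficients were fitted on.

```python
from typing import Dict

# Coefficients of the four-feature radiomics signature reported in the text.
COEFFICIENTS: Dict[str, float] = {
    "original_firstorder_Kurtosis": -5.570173e-03,
    "original_glcm_Correlation": 1.004042e+00,
    "original_glcm_MaximumProbability": -9.648413e-01,
    "original_glszm_LargeAreaEmphasis": 1.643472e-08,
}

# Optimum cutoff reported in the text (from ROC curve analysis).
CUTOFF = 0.185


def risk_score(features: Dict[str, float]) -> float:
    """Linear combination of the four features weighted by their coefficients."""
    return sum(coef * features[name] for name, coef in COEFFICIENTS.items())


def risk_group(score: float) -> str:
    """High risk if the score is at or above the cutoff, otherwise low risk."""
    return "high" if score >= CUTOFF else "low"


# Hypothetical feature values for one patient (not taken from the study).
patient = {
    "original_firstorder_Kurtosis": 3.0,
    "original_glcm_Correlation": 0.5,
    "original_glcm_MaximumProbability": 0.2,
    "original_glszm_LargeAreaEmphasis": 1.0e6,
}

score = risk_score(patient)
group = risk_group(score)
```

For these made-up inputs the score is about 0.309, which lies above the cutoff, so the hypothetical patient would fall into the high-risk group.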

Performance and Validation of the Radiomics Nomogram for Individualized Prediction
The results of the univariate analysis in the training cohort are shown in Table 1, and those of the multivariate analysis in Table 2. A radiomics nomogram was developed based on the radiomics signature and clinical risk factors (Ki-67 expression at surgery and ALN status at surgery) for predicting the risk of disease in patients with non-pCR BC (Figure 4A). Calibration curves (Figures 4B, C) showed good performance in the training and validation cohorts. Compared to the radiomics signature, the radiomics nomogram showed better discrimination in the training (AUC, 0.84; 95% CI, 0.76-0.92) and validation cohorts (AUC, 0.82; 95% CI, 0.74-0.90) (Figures 4D, E). The DCA indicated that when the threshold probability for a patient was not between 19 and 27%, the nomogram showed better net benefit than the "treat all" or "treat none" strategy. The DCA for the nomogram is presented in Figure 5.

Association With Additional Chemotherapy and Clinical Outcome
For the total patient population, postoperative chemotherapy was not associated with 5-year DFS (Figure 6A); however, after stratification by our radiomics nomogram, a 45.8% improvement in 5-year DFS was observed with additional chemotherapy in the high-risk group (HR 4.264, 95% CI 1.321-7.248; P = 0.008; Figure 6B), whereas no significant improvement in 5-year DFS was observed in the low-risk group (HR 0.565, 95% CI 0.281-3.132; P = 0.071; Figure 6C).

DISCUSSION
Due to its non-invasive advantages, medical imaging is often used in disease diagnosis, treatment, and the dynamic evaluation of therapeutic effects, especially for patients with cancer (16-19). Currently, traditional image analysis is qualitative, and there are large subjective differences that limit its clinical value (9,20,21). In recent years, medical imaging has developed rapidly, especially with the advent of radiomics, which enables high-throughput extraction of information from imaging features and makes it possible to quantify differences between tissues that cannot be observed by the naked eye (12,20,22). In this study, we used high-throughput methods to extract radiomics features and develop a radiomics signature, which can be used to predict DFS and the response to postoperative chemotherapy in patients with residual breast tumors after NAC. Using our prognostic radiomics signature classifier, the high-risk group exhibited a worse 5-year DFS rate (48%) than the low-risk group (89%). We also showed that our radiomics nomogram, which combined the radiomics signature and clinical risk factors, has better prediction performance than the radiomics signature alone. However, the benefit of additional chemotherapy after surgery in these patients remains unclear. In all enrolled patients, additional chemotherapy was not associated with 5-year DFS, which is consistent with previous studies. However, our study showed that patients in the high-risk group could benefit significantly from additional chemotherapy, whereas patients stratified as low-risk did not benefit. We demonstrated that the radiomics signature, Ki-67 expression at surgery, and ALN status at surgery were significant predictors of DFS in patients with residual breast tumors after NAC. The Ki-67 expression level in residual tumor tissues is a significant risk factor and a prognostic predictor of the chemotherapy response in patients who have residual tumors after NAC. Our results concur well with those of published studies (23-27).
ALN status at surgery has been demonstrated to be a risk factor for 5-year DFS in non-pCR patients with BC, which is also similar to the results of previous studies (28-31). Interestingly, four DFS-related radiomics features were selected in the current study: one first-order feature, one GLSZM feature, and two GLCM features. The well-known radiomics feature entropy was not included. Kurtosis, a first-order feature, was negatively associated with the risk of disease in this study. Many recent studies have particularly emphasized the significance of kurtosis in colorectal, pancreatic, and breast cancers (16,32,33). Large area emphasis is a GLSZM feature and is well suited to quantifying the texture and heterogeneity of tumors because it considers the interaction between adjacent pixels (34,35). Among the GLCM features, correlation reflects the linear dependency of gray level values, and maximum probability is the occurrence of the most predominant pair of neighboring intensity values. These GLCM features reflect different aspects of the texture heterogeneity of tumors because they have different mathematical definitions.
Accordingly, we developed a nomogram based on these radiomics features for prediction of the DFS status and management of additional treatment strategies for each non-pCR patient with BC. The parameters of the nomogram can be easily obtained. For example, both Ki-67 expression at surgery and ALN status at surgery are conventional predictive factors and components of the TNM system in BC patients. In addition, the radiomics features can be extracted from breast tumor images via engineered hard-coded feature algorithms. In summary, our study demonstrated that the nomogram may serve as either a scoring system or a useful tool for chemotherapy response and prognostic prediction in non-pCR patients with BC, thus helping physicians to rapidly evaluate the risk of relapse via a simple calculation method in the clinic. Overall, our study has two strengths. First, to our knowledge, this is the first study to use a radiomics signature to predict survival and postoperative chemotherapy response in invasive BC patients who received NAC and surgery. This study found that the radiomics nomogram can predict BC patient survival with a higher C-index and better calibration than the radiomics signature. Second, because the radiomics features had different ranges, we standardized the feature values prior to the LASSO analysis, which improved the predictive efficacy of the radiomics features. However, there were also several limitations in our study. First, our study included a small number of patients. Further studies in larger populations are needed, although we used the LASSO Cox method (10-fold cross-validation) to prevent overfitting. Second, a larger multicenter database that combines genomic and radiomics parameters has the potential to improve the performance of our current radiomics nomogram.
Third, a CT-based radiomics signature was used in the study, but the soft-tissue contrast resolution of CT is lower than that of MR imaging. Finally, we delineated the ROIs over the whole tumor area and calculated the radiomics predictors semi-automatically, which were time-consuming and laborious tasks. However, we believe that our research, mainly as a proof-of-concept study, has demonstrated the potential of radiomics signatures in clinical practice. With the advent of commercially available software that offers automatic segmentation of tumors and automatic derivation of radiomics predictors, radiomics signatures are likely to be applied to daily clinical practice in the near future.
In conclusion, we observed that a predictive radiomics signature might be a potential biomarker for DFS risk stratification in invasive BC patients with non-pCR after NAC. Additionally, this study presents a radiomics nomogram, combining the radiomics signature and clinicopathological findings, that can assist in preoperative risk stratification, individualized prediction of recurrence, and evaluation of whether non-pCR patients will benefit from adjuvant chemotherapy after surgery. Therefore, our radiomics nomogram may be useful for personalized medicine and for customizing treatment strategies for BC patients with residual tumors after NAC.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the ethics committee of Sun Yat-sen University Cancer Center. The patients/participants provided their written informed consent to participate in this study.