DGM-CM6: A New Model to Predict Distant Recurrence Risk in Operable Endocrine-Responsive Breast Cancer

To investigate the prognostic value of DGM-CM6 (Distant Genetic Model-Clinical variable Model 6) for endocrine-responsive breast cancer (ERBC) patients, we analyzed 752 operable breast cancer patients treated in a Taiwan cancer center from 2005 to 2014. Among them, 490 ERBC patients (identified by the PAM50 or immunohistochemistry method) were classified by DGM-CM6 into low- and high-risk groups (score <33 and ≥33, respectively). Ten-year distant recurrence-free survival (DRFS) differed significantly between the DGM-CM6 low- and high-risk groups in both lymph node-negative (LN-) patients (P < 0.05) and lymph node-positive (LN+) patients (P < 0.05). Multivariate analysis confirmed that the DGM-CM6 risk group (high vs. low) was an independent predictor of DRFS (P = 0.005, HR: 6.76, 95% CI: 1.8–25.42) and overall survival (P = 0.01, HR: 6.06, 95% CI: 1.55–23.74). In summary, DGM-CM6 may be used to classify low- and high-risk groups for 10-year distant recurrence in both LN- and LN+ ERBC patients in the Asian population. A large-scale clinical trial is warranted.


HIGHLIGHTS
- The DGM-CM6 model is capable of predicting long-term distant recurrence risk in both node-negative and node-positive endocrine-responsive breast cancer (ERBC) patients.
- Low-risk ERBC patients identified by the DGM-CM6 panel may benefit more from endocrine therapy, since their risk of distant relapse at 10 years was <5%.
- High-risk node-positive patients identified by our model had already received adjuvant chemotherapy; prolonged endocrine treatment or the addition of other agents such as CDK 4/6 inhibitors could therefore be considered in a novel clinical trial, as these patients had a 10-year accumulated distant relapse risk of about 20%.
- No prognostic difference was observed between Luminal B and Luminal A in node-negative ERBC patients in our cohort. One potential explanation is that the PAM50 panel might not be optimal for the Asian population.

INTRODUCTION
Although endocrine-responsive breast cancer (ERBC) patients generally have a better outcome than human epidermal growth factor receptor-2 (HER2)-enriched and triple-negative breast cancer (TNBC) patients, the risk of long-term disease recurrence is unpredictable (1). To maximize the treatment effects, adjuvant chemotherapy has been recommended for high-risk ERBC patients (2). However, it is a challenge to clearly separate high- and low-risk groups for distant recurrence (DR) within the ERBC population, because outcomes are better overall than in other subtypes. Therefore, it is critically important to develop models that can accurately identify the patients who will benefit from endocrine therapy or chemotherapy, so that every patient receives appropriate treatment.
Molecular biomarkers have been very helpful for predicting recurrence-free survival and overall survival in breast cancer patients. Several commercial multi-gene assays have been successfully applied in clinical practice, including the 21-gene (Oncotype Dx) recurrence score [RS, (3)], MammaPrint (4), the 12-gene EndoPredict (5), and the PAM50 risk-of-recurrence score [ROR, (6)]. However, the performance of these panels has not been found to be optimal in predicting the risk of distant recurrence in node-positive ERBC patients (7). Therefore, the treatment remains ambiguous, especially when these patients have axillary lymph node (LN) metastasis (8-10).
Abbreviations: DGM-CM6, distant genetic model-clinical variable model 6; LVI, lymphovascular invasion; BCS, breast-conserving surgery; IHC, immunohistochemistry; ER, estrogen receptor; PR, progesterone receptor; HER2, human epidermal growth factor receptor 2.
Although the MINDACT trial claimed that MammaPrint could identify node-positive patients who could forgo adjuvant chemotherapy, only 21% of the patients enrolled in the trial were node-positive, and 52% of them had low genomic risk. Therefore, the results of this trial should be interpreted cautiously, considering the small proportion of node-positive patients in the entire study population and a potentially low benefit of a 1.5% improvement in DR-free survival (11,12). PAM50 molecular subtypes are closely associated with LN metastasis; however, almost all node-positive patients were classified as high risk (13). Although Luminal A and B are both ERBC, significant differences in clinical outcomes and chemotherapy sensitivity have been reported in several studies (14,15). Also, Kim et al. reported that the subtype discordance rate between immunohistochemistry (IHC)-based and PAM50-based classification was almost 40% (16).
Considering the above limitations, the currently available models for the prediction of long-term DR risk are unsatisfactory in operable ERBC patients, including those who received adjuvant chemotherapy. Prognostic biomarkers that integrate molecular markers with clinical variables might therefore predict recurrence more accurately. Here, we validate a previously developed prognostic panel in an Asian cohort, assessing its predictive value for 10-year distant recurrence-free survival (DRFS) in ERBC patients.

Study Design and Data Description
The study design is shown in Figure 1. A cohort of 752 breast cancer patients, treated at a free-standing cancer center in Taiwan from 2005 to 2014, was included in our retrospective study. The included patients were pN0-2 breast cancer patients who had undergone primary surgery in the form of either mastectomy or breast-conserving surgery (BCS). Patients who had received pre-operative chemotherapy or who had cN3, cT4, and/or cM1 disease were excluded. The primary study endpoint was 10-year DRFS, which was defined as the interval from breast cancer surgery until the development of DR or death from any cause (17). We defined DR as the spread of breast cancer to any part of the body apart from local and/or regional recurrence. The secondary endpoint was overall survival (OS). The protocol and informed consent documents were reviewed and approved by the institutional review board (IRB) of the hospital (IRB no. 20131001A). The baseline characteristics of the 752 patients are listed in Table S1.
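The DRFS definition above can be sketched in code. The following is a minimal illustration, not the authors' implementation: the function name `drfs_endpoint`, its arguments, the 365.25-day year, and the administrative censoring at the 10-year horizon are all assumptions introduced here.

```python
from datetime import date
from typing import Optional, Tuple


def drfs_endpoint(
    surgery: date,
    last_follow_up: date,
    distant_recurrence: Optional[date] = None,
    death: Optional[date] = None,
    horizon_years: float = 10.0,
) -> Tuple[float, bool]:
    """Return (time in years, event flag) for distant recurrence-free survival.

    An event is the first of distant recurrence or death from any cause.
    Patients without an event are censored at last follow-up. Follow-up beyond
    the horizon is censored at the horizon (an assumption of this sketch).
    """
    event_dates = [d for d in (distant_recurrence, death) if d is not None]
    if event_dates:
        end, event = min(event_dates), True
    else:
        end, event = last_follow_up, False

    years = (end - surgery).days / 365.25
    if years > horizon_years:
        return horizon_years, False
    return years, event


# Hypothetical patient: distant recurrence two years after surgery.
time_years, had_event = drfs_endpoint(
    surgery=date(2005, 1, 1),
    last_follow_up=date(2012, 1, 1),
    distant_recurrence=date(2007, 1, 1),
)
```

Local or regional recurrence alone is deliberately not an argument here, because the definition above excludes it from DR.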

Identification of ERBC Patients by IHC Staining and Microarray Profiling
ERBC patients were identified by both IHC and microarray profiling from fresh-frozen primary tumor samples. HER2 receptor and/or hormone receptor status was evaluated according to guidelines (18). Patients with ER/PR+, HER2-, grade 1-2 tumors were grouped together as the IHC Luminal A subtype; patients with ER/PR+, HER2-, grade 3 tumors were grouped as IHC Luminal B; and patients with ER/PR+, HER2+ tumors were grouped as Luminal-HER2 (19,20). The details of the RNA extraction process used for microarray profiling in our study have been previously reported (21). Specifically, raw CEL files from the Affymetrix U133 Plus 2.0 platform were pre-processed using the robust multi-array average method in the affy package of R software (22). Quantile normalization was performed to reduce potential systematic biases. Each patient was assigned to an intrinsic molecular subtype of breast cancer (Luminal A, Luminal B, HER2-enriched, Basal-like, or Normal-like) by the PAM50 method using the genefu package of R software (23,24). The pool of Luminal A/B patients from both the IHC (n = 490) and PAM50 methods (n = 404) was defined as ERBC patients (n = 499) for downstream analysis (Table S2).
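The IHC grouping rule and the pooling of the two classifications can be sketched as follows. This is an illustrative sketch only: the function names, the reading of "ER/PR+" as ER and/or PR positive, and the "non-luminal" label for receptor-negative tumors are assumptions made here, not details stated in the text.

```python
from typing import Dict, Set

LUMINAL_AB = {"Luminal A", "Luminal B"}


def ihc_subtype(er_positive: bool, pr_positive: bool,
                her2_positive: bool, grade: int) -> str:
    """Assign the IHC-based group from receptor status and tumor grade."""
    if grade not in (1, 2, 3):
        raise ValueError(f"unexpected tumor grade: {grade}")
    # Assumption: "ER/PR+" means ER and/or PR positive.
    if not (er_positive or pr_positive):
        return "non-luminal"  # hypothetical label, not used in the text
    if her2_positive:
        return "Luminal-HER2"
    return "Luminal A" if grade <= 2 else "Luminal B"


def pool_erbc(ihc: Dict[str, str], pam50: Dict[str, str]) -> Set[str]:
    """Patient IDs called Luminal A/B by IHC or by PAM50 (set union)."""
    by_ihc = {pid for pid, subtype in ihc.items() if subtype in LUMINAL_AB}
    by_pam50 = {pid for pid, subtype in pam50.items() if subtype in LUMINAL_AB}
    return by_ihc | by_pam50


# Hypothetical three-patient example.
ihc_calls = {
    "P1": ihc_subtype(True, True, False, 1),
    "P2": ihc_subtype(True, False, False, 3),
    "P3": ihc_subtype(True, True, True, 2),
}
pam50_calls = {"P1": "Luminal A", "P2": "Basal-like", "P3": "Luminal B"}
erbc_ids = pool_erbc(ihc_calls, pam50_calls)
```

Because the pool is a union, a patient called Luminal A/B by only one of the two methods still enters the ERBC set, which is why the pooled count can exceed either individual count.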

Statistical Analyses
The detailed procedure for developing the DGM-CM6 model from the training set (n = 112) and testing set (n = 46) has been published in our previous study (25). The recurrence index for distant recurrence (RI-DR) score for each patient was computed by the DGM-CM6 model. Patients with DGM-CM6 (RI-DR) scores ≥33 and <33 were defined as the high- and low-risk groups for DR, respectively (25).
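The cutoff rule can be written out explicitly to make the boundary case unambiguous. This is a minimal sketch; the names `RI_DR_CUTOFF` and `risk_group` are hypothetical, and the computation of the RI-DR score itself (described in reference 25) is not reproduced here.

```python
from collections import Counter
from typing import Dict

RI_DR_CUTOFF = 33.0  # scores at or above the cutoff are high risk


def risk_group(ri_dr_score: float, cutoff: float = RI_DR_CUTOFF) -> str:
    """Map an RI-DR score to the DGM-CM6 risk group for distant recurrence."""
    return "high" if ri_dr_score >= cutoff else "low"


# Hypothetical scores, including the boundary value 33.
scores: Dict[str, float] = {"P1": 12.5, "P2": 33.0, "P3": 47.1, "P4": 32.9}
groups = {pid: risk_group(score) for pid, score in scores.items()}
group_sizes = Counter(groups.values())
```

A score of exactly 33 falls in the high-risk group, matching the ≥33 definition above.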
The Wilcoxon rank sum test was used to evaluate the association between the DGM-CM6 score and the IHC- and PAM50-defined Luminal A/B groups. The chi-square test was used to test the association between the risk groups and clinical categorical variables. Kaplan-Meier survival analysis and the log-rank test were used to compare DRFS and OS between high- and low-risk patients. These survival comparisons were performed separately for LN-negative (LN-) and LN-positive (LN+) patients.
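For readers unfamiliar with the survival methods named above, the following is a minimal pure-Python sketch of the Kaplan-Meier estimator and the two-group log-rank test. It is a generic textbook implementation, not the software used in the study, and the function names and toy data are assumptions of this sketch.

```python
import math
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return [(event time, survival probability just after that time)]."""
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        failures = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - failures / at_risk
        curve.append((t, survival))
    return curve


def logrank_test(times_a: Sequence[float], events_a: Sequence[bool],
                 times_b: Sequence[float], events_b: Sequence[bool]
                 ) -> Tuple[float, float]:
    """Two-group log-rank test: (chi-square statistic, p-value), 1 df."""
    pooled = list(zip(times_a, events_a)) + list(zip(times_b, events_b))
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({t for t, e in pooled if e}):
        n_a = sum(1 for u in times_a if u >= t)  # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)  # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if e and u == t)
        d_b = sum(1 for u, e in zip(times_b, events_b) if e and u == t)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Toy data (years, event flag); not taken from the study cohort.
high_t, high_e = [1.0, 2.0, 3.0], [True, True, True]
low_t, low_e = [10.0, 11.0, 12.0], [True, True, True]
km_high = kaplan_meier(high_t, high_e)
chi2, p_value = logrank_test(high_t, high_e, low_t, low_e)
```

To mirror the stratified comparison described above, the same test would simply be run once on the LN- subset and once on the LN+ subset.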

Prognostic Comparison Between RI-DR Score Defined Risk Group for DR and OS
The cumulative incidence curves for the RI-DR defined high- (≥33) and low- (<33) risk groups show that patients in the high-risk group had a significantly poorer prognosis regardless of LN status in the IHC-defined ERBC population. Multivariate Cox regression analysis further confirmed that the RI-DR risk group was an independent predictor of DRFS and OS in the IHC-defined ERBC population after adjustment for clinical confounders such as age, LN status, stage, grade, and treatment pattern. As shown in Table 2, the prognosis of the high-risk group was significantly worse than that of the low-risk group for DRFS (HR = 6.76, 95% CI: 1.8-25.42, p = 0.005) and OS (HR = 6.06, 95% CI: 1.55-23.74, p = 0.01). Also, no significant association was observed between chemotherapy and the risk groups (DRFS: p = 0.163; OS: p = 0.195), which suggests that our model can predict DRFS and OS regardless of whether patients received chemotherapy (Table S3).
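The reported hazard ratios, confidence intervals, and p-values can be cross-checked against one another. The sketch below assumes the intervals are Wald intervals of the form exp(β ± 1.96·SE), which is the usual Cox regression output but is not stated in the text; the function name `wald_from_hr_ci` is hypothetical.

```python
import math
from typing import Tuple


def wald_from_hr_ci(hr: float, ci_low: float, ci_high: float,
                    z_crit: float = 1.959964) -> Tuple[float, float, float]:
    """Back-calculate (standard error, z, two-sided p) from an HR and its 95% CI.

    Assumes a Wald interval on the log scale: log(HR) +/- z_crit * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return se, z, p


# Values as reported for the high- vs. low-risk comparison.
se_drfs, z_drfs, p_drfs = wald_from_hr_ci(6.76, 1.8, 25.42)
se_os, z_os, p_os = wald_from_hr_ci(6.06, 1.55, 23.74)
```

Under this assumption, the back-calculated p-values come out close to the reported p = 0.005 for DRFS and p = 0.01 for OS, so the three quantities quoted for each endpoint are mutually consistent.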
We further compared the DR and OS risks of our risk groups with those of the PAM50 Luminal A/B groups (Figure 4). Consistent with the IHC cohort results, our RI-DR score could separate patients into high- and low-DR risk groups in both LN-negative (p < 0.0001) and LN-positive patients (p = 0.019). With regard to OS, our score could separate patients into high-risk and low-risk groups only in LN-negative patients (p = 0.0047).

DISCUSSION
Optimal treatment decisions for patients with nodal involvement remain an important goal yet a significant challenge in ERBC patients (11,26). Our results demonstrated that our DGM-CM6 panel could independently predict the prognosis of ERBC patients for both DRFS and OS after adjustment for clinical confounders including molecular subtypes, LN status, and other clinical factors, regardless of whether the patients received chemotherapy or not (Table 2). Of note, our results also highlighted that within the ERBC population, not all ER/PR+ HER2- samples are Luminal-like, since basal-like and HER2-enriched samples could also be identified (Table S2). This suggests that our model can successfully predict the DR risk in ERBC patients based on the status of three IHC biomarkers, which might be considerably cost-effective. From a clinical perspective, patients who experience no recurrence after 5 years of endocrine therapy and have a sufficiently low risk should not be recommended an extension of the endocrine therapy. Therefore, we hypothesized that the patients in our study who were predicted by the DGM-CM6 panel to be low risk in the LN- ERBC population may benefit more from endocrine therapy than chemotherapy, since the <5% risk of distant relapse at 10 years implies they may safely avoid adjuvant chemotherapy and prolonged endocrine therapy (the current cohort was treated with 5-year endocrine therapy). In contrast, high-risk patients in the LN+ ERBC population had already received adjuvant chemotherapy; prolonged endocrine treatment or the addition of other agents could therefore be considered for these RI-DR high-risk patients in a novel clinical trial, as they had a 10-year accumulated distant relapse risk of about 20% (Figure 3).
The intrinsic subtypes Luminal A and B defined by PAM50 are well known to behave differently with respect to clinical outcome and treatment sensitivity (27). Luminal B is more aggressive than Luminal A, with a greater propensity to relapse and to develop resistance to endocrine therapy (28,29). Intrinsic subtypes could provide precise information for recurrence risk prediction in early breast cancer (30). However, we found no prognostic difference between Luminal B and Luminal A in LN-negative ERBC patients in our cohort, while our DGM-CM6 panel performed better in separating high- and low-risk groups (Figure 4). One potential explanation is that the PAM50 panel may not be optimal for the Asian population, especially for low-risk ERBC patients. As a result, LN- patients in our study had a good prognosis even if they were classified as Luminal B by PAM50. This observation is consistent with previous studies reporting that Asian women had significantly reduced relative odds of other PAM50 subtypes vs. Luminal A in the prediction of short- and long-term prognostic outcomes (32,33). Moreover, the intrinsic Luminal A and Luminal B subtypes can only be derived from microarray-based data and are thus commercially expensive. Furthermore, increasing evidence of discordant results between PAM50-based intrinsic subtypes and IHC-based subtypes has been reported (16,34,35). Consequently, we validated the predictive risk value of the DGM-CM6 model in both the IHC and intrinsic subtype cohorts in order to avoid the discordance issue among different classifiers. The strength of our panel is that it has prognostic value in both IHC- and microarray-based data, thus demonstrating the clinical utility of DGM-CM6 in a practical setting.
However, some potential limitations of this study need to be noted. Firstly, we included chemotherapy-treated patients in our study, which could possibly lower the recurrence risk in these patients. However, our multivariate Cox regression analysis for DRFS and OS showed that there was no association between chemotherapy and DGM-CM6 risk groups. This result is plausible, since previous studies have shown that adjuvant chemotherapy could not stop the development of late recurrence in ERBC patients, especially in HER2-negative tumors (36,37). Secondly, menopausal status should be discussed carefully, but we were unable to obtain well-documented menstruation information in this retrospective study. Thirdly, it would have been more informative to have the PAM50-based ROR scores as the control panel in our study. However, the PAM50 intrinsic subtype classification could also have more prognostic value than pathological characteristics (38). In that large cohort study, multivariate analysis showed that both intrinsic subtypes and ROR risk classification yielded strong prognostic information in early-stage breast cancer. The final limitation of our study is that the current cutoff of the DGM-CM6 score (≥33) may not be suitable if we attempt to use it for all specific categories of patients (i.e., different subtypes, LN+ numbers). Nevertheless, tumor recurrence and treatment outcome are a product of complex interactions between tumor subtypes, the immune system, the status of lymph nodes, and tumor-stroma interactions, and do not depend solely on Luminal A or Luminal B type. Additionally, some immune therapies may play a role in the outcome and recurrence of the disease. Consequently, it is necessary to consider the role of the immune system, especially the non-specific properties and role of natural killer (NK) cells in lymph nodes, which have been found to be significant in previous studies (38-40). Therefore, standardization of a biomarker cutoff applicable to patients of all categories would not be realistic unless all training, testing, and validation sets could be unified and well-balanced for all characteristics.

CONCLUSIONS
In summary, this study demonstrated the prognostic value of the DGM-CM6 panel for making treatment decisions in women with ERBC, regardless of LN status. Importantly, our panel consistently showed good performance in both IHC- and microarray-derived ERBC candidates, thus addressing the discordance issue reported by other studies. Finally, as far as we know, DGM-CM6 is a new edition of the first generation of multi-gene expression predictive models developed for Asian breast cancer patients that combine genomic and clinicopathological information.

DATA AVAILABILITY STATEMENT
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the institutional review board (IRB) of the Koo Foundation Sun Yat-Sen Cancer Center in Taipei, Taiwan (IRB no. 20131001A), which also reviewed and approved the protocol and informed consent documents. The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
LL contributed to project design and wrote the manuscript. X-JW and Y-YM contributed to results interpretation. YZ and SC led the project, provided project design guidance and prepared the manuscript. All authors read and approved the manuscript.