Comparison of GenesWell BCT Score With Oncotype DX Recurrence Score for Risk Classification in Asian Women With Hormone Receptor-Positive, HER2-Negative Early Breast Cancer

Introduction: The GenesWell Breast Cancer Test (BCT) is a recently developed multigene assay that predicts the risk of distant recurrence in patients with early breast cancer. Here, we analyzed the concordance of the BCT score with the Oncotype DX recurrence score (RS) for risk stratification in Asian patients with pN0-N1, hormone receptor-positive, human epidermal growth factor receptor 2 (HER2)-negative breast cancer. Methods: Formalin-fixed, paraffin-embedded breast cancer tissues previously analyzed using the Oncotype DX test were assessed using the GenesWell BCT test. The risk stratification by the two tests was then compared. Results: A total of 771 patients from five institutions in Korea were analyzed. According to the BCT score, 527 (68.4%) patients were classified as low risk and 244 (31.6%) as high risk. Meanwhile, 134 (17.4%), 516 (66.9%), and 121 (15.7%) patients were categorized into the low-, intermediate-, and high-risk groups, respectively, according to the RS ranges used in the TAILORx trial. The BCT high-risk group was significantly associated with advanced lymph node status, whereas no association between RS risk groups and nodal status was observed. The concordance between the two risk stratification methods in the overall population was 71.9% when the RS low-risk and intermediate-risk groups were combined into one group. However, poor concordance was observed in patients aged ≤50 years and in those with lymph node-positive breast cancer. Conclusions: The concordance between the BCT score and the RS was low in women aged ≤50 years or with lymph node-positive breast cancer. Further studies are necessary to identify more accurate tests for predicting prognosis and chemotherapy benefit in this subpopulation.


INTRODUCTION
Several multigene expression prognostic assays have been developed to overcome the limitations of clinical variables such as tumor size and nodal status for predicting prognosis in breast cancer (1). These assays are used to predict the risk of recurrence or distant metastasis after surgery and adjuvant hormone therapy in hormone receptor-positive early breast cancer and thereby to guide treatment decisions regarding chemotherapy. MammaPrint (2) and Oncotype DX (3) are the first-generation molecular prognostic assays; additional assays such as Prosigna (4)(5)(6) and EndoPredict (7) were developed later.
Oncotype DX (Genomic Health, Redwood City, CA, USA) is the most widely used multigene assay (3); it uses quantitative reverse transcription-polymerase chain reaction (qRT-PCR) to measure the expression of 21 genes in formalin-fixed, paraffin-embedded (FFPE) tissues. The Oncotype DX recurrence score (RS) also predicts the benefit of adding chemotherapy to hormone therapy in estrogen receptor (ER)-positive breast cancer (8,9). Moreover, RS results are currently included in clinical guidelines for treatment decisions in early breast cancer (10)(11)(12). The American Joint Committee on Cancer eighth edition cancer staging system was recently revised to include this score for prognosis in breast cancer (13).
However, recent studies showed that other prognostic scores, such as the PAM50-based Prosigna risk of recurrence (ROR) score (6) and EPclin by EndoPredict (14), are more accurate than the Oncotype DX RS for predicting the risk of distant recurrence in endocrine-treated postmenopausal patients with ER-positive breast cancer. A comparison of the prognostic value of six multigene signatures, including Clinical Treatment Score, four immunohistochemical markers (IHC4), RS, ROR, Breast Cancer Index (BCI), and EPclin, in 774 postmenopausal women with ER-positive, human epidermal growth factor receptor 2 (HER2)-negative breast cancer also demonstrated that ROR, BCI, and EPclin are more prognostic for overall and late distant recurrence than RS in patients with lymph node-negative breast cancer (15). However, studies comparing Oncotype DX with other assays were performed in Western populations, and the results in Asian patients with breast cancer remain unclear.
Asian breast cancer differs from Western breast cancer in terms of age-specific incidence rates (16)(17)(18). Approximately half of breast cancer patients in Asian countries (peak age: 45-50 years) are premenopausal, whereas 15-30% of Western breast cancer patients (peak age: 55-60 years) are premenopausal (19)(20)(21). In addition, distinct biological features of Asian breast cancer include a higher prevalence of the luminal B subtype, more frequent TP53 mutation, and a more active immune microenvironment, suggesting the need to include more Asian women in clinical trials to clarify ethnic differences in breast cancer (21,22). However, most genomic algorithms used in breast cancer tests are based on data from postmenopausal women in Western countries, which raises concerns regarding their prognostic or predictive value in Asian or young breast cancer patients. Notably, recent data from the Trial Assigning Individualized Options for Treatment (TAILORx) (23) showed that there is no chemotherapy benefit in patients aged >50 years with hormone receptor-positive, HER2-negative, lymph node-negative breast cancer with a RS of 11-25, while those aged ≤50 years with a RS of 16-25 may benefit from chemotherapy. The trial results suggest that the predictive value of the RS for chemotherapy benefit, or the "number needed to treat (NNT)," may differ in Asian breast cancer patients, as this population includes a greater proportion of patients aged ≤50 years. The absolute risk reduction (ARR) and NNT were 6.5% and 15.4, respectively, for a RS of 21-25, and 1.6% and 62.5, respectively, for a RS of 16-20 (23). Meanwhile, the ARR and NNT for a RS ≥26 were 25.0% and 4.0, respectively (24). A recent study showed that tailored therapy based on Oncotype DX results could result in a net cost increase in the initial care of American breast cancer patients if women aged ≤50 years with tumors with a RS of 16-25 all chose to receive chemotherapy (25).
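The relationship between the ARR and NNT values quoted above can be checked with a short calculation. The following is a minimal sketch, assuming the ARR figures are expressed in percentage points so that NNT = 100 / ARR; the function name is hypothetical and is not part of any cited analysis.

```python
def number_needed_to_treat(arr_percent: float) -> float:
    """Return the NNT for an absolute risk reduction given in percentage points.

    Assumption: ARR is a percentage (e.g., 6.5 means 6.5%), so NNT = 100 / ARR.
    """
    if arr_percent <= 0:
        raise ValueError("ARR must be positive for the NNT to be defined")
    return 100.0 / arr_percent


# ARR values quoted in the text for each RS range
quoted_arr = [("RS 16-20", 1.6), ("RS 21-25", 6.5), ("RS >=26", 25.0)]

for label, arr in quoted_arr:
    nnt = number_needed_to_treat(arr)
    print(f"{label}: ARR {arr}% -> NNT {nnt:.1f}")
```

Under this assumption, the computed values reproduce the NNT figures cited in the text, which illustrates why a small ARR in the RS 16-20 range translates into a large number of patients treated per recurrence avoided.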
The GenesWell Breast Cancer Test (BCT) (Gencurix, Inc., Seoul, Korea) is a molecular prognostic assay that predicts the risk of 10-year distant metastasis in patients with pathologic N0 or N1 status (pN0-N1), hormone receptor-positive, HER2-negative breast cancer (26). This test is a qRT-PCR-based assay that combines the relative expression of six prognostic genes with two clinical variables and, like Oncotype DX, uses FFPE tumor tissues. The ability of this assay to predict chemotherapy benefit was also recently demonstrated in Asian breast cancer patients (27). Here, we aimed to assess the agreement in risk classification between the BCT score and the RS in a large sample of Asian breast cancer patients from multiple institutions.

Patients and Tissue Samples
FFPE tumor blocks were obtained from patients who met the following criteria: hormone receptor-positive early breast cancer; curative resection of the primary tumor between 2010 and 2017 at any of five institutions in Korea (Samsung Medical Center, Asan Medical Center, Korea University Guro Hospital, and Gangnam Severance Hospital in Seoul, and National Cancer Institute in Gyeonggi-do); and a reportable RS. FFPE tumor tissues not eligible for the GenesWell BCT test and cases without sufficient tumor or clinical information were excluded. Hormone receptor (ER or progesterone receptor [PR]) and HER2 status were determined at local laboratories. The staining of ER or PR by immunohistochemistry (IHC) was scored using the semi-quantitative Allred score (AS) with a maximum score of 8, and AS >2 was considered positive, as described previously (28,29). HER2 status was measured using IHC, fluorescence in situ hybridization (FISH), or silver-enhanced in situ hybridization (SISH). According to the American Society of Clinical Oncology/College of American Pathologists guidelines, HER2 positivity was defined as an intensity of 3+ by IHC, or as a gene amplification ratio of ≥2.0 or an average HER2 copy number ≥6 by FISH or SISH (30).

Oncotype DX and BCT Tests
Samples were delivered to Genomic Health for Oncotype DX testing prior to the study. Tissue samples were prepared following the pathology guidelines of Oncotype DX. The RS results were determined by Genomic Health, as previously described (3).
Samples previously analyzed using the Oncotype DX test were used for the GenesWell BCT test. RNA was extracted from FFPE tissues, and samples containing sufficient residual RNA were subjected to qRT-PCR as previously described (26). The BCT score was calculated using two clinical variables (tumor size and nodal status) in combination with the relative expression of the six prognostic genes (UBE2C, TOP2A, RRM2, FOXM1, MKI67, and BTN3A2) (26). The expression of ESR1, PGR, and ERBB2 was also quantified relative to the three reference genes (CTBP1, CUL1, and UBQLN1).

Categorization of Risk Groups
Patients were categorized into BCT high-risk and low-risk groups according to the BCT scoring criteria reported previously (26). Briefly, patients with a BCT score <4 were classified as low risk, and those with a BCT score ≥4 were classified as high risk. For the Oncotype DX, two different RS ranges were used to classify patients. First, patients were grouped into low-risk (RS <18), intermediate-risk (RS 18-30), and high-risk (RS ≥31) groups using the originally validated cut-off (called clinical cut-off) (3). Second, patients were classified according to the RS ranges used in the TAILORx (called TAILORx cut-off) as low-risk (RS <11), intermediate-risk (RS 11-25), and high-risk (RS ≥26) groups (24,31). Clinical risk was determined using the modified version of Adjuvant! Online reported in the Microarray in Node-Negative Disease May Avoid Chemotherapy (MINDACT) trial, as previously described (27).
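The cut-offs described above can be expressed as a small classification routine. This is an illustrative sketch only: the function names and the "clinical"/"tailorx" labels are hypothetical, and the intermediate-risk range is taken to be whatever lies between the stated low-risk and high-risk boundaries.

```python
def bct_risk(bct_score: float) -> str:
    """Classify a BCT score: <4 is low risk, >=4 is high risk."""
    return "high" if bct_score >= 4 else "low"


# (upper bound of low risk, lower bound of high risk) for each RS cut-off scheme
RS_CUTOFFS = {
    "clinical": (18, 31),  # low: RS <18, high: RS >=31
    "tailorx": (11, 26),   # low: RS <11, high: RS >=26
}


def rs_risk(rs: int, cutoff: str = "tailorx") -> str:
    """Classify a recurrence score as low, intermediate, or high risk."""
    low_upper, high_lower = RS_CUTOFFS[cutoff]
    if rs < low_upper:
        return "low"
    if rs >= high_lower:
        return "high"
    return "intermediate"


# Example: the same RS can fall into different groups under the two schemes
print(rs_risk(28, "clinical"), rs_risk(28, "tailorx"))
```

Keeping both schemes in one lookup table makes it explicit that the same RS value may be assigned to different risk groups depending on the cut-off used.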

Statistical Analysis
The association between clinicopathological parameters and the BCT score or the RS was analyzed using the Chi-square test. The Chi-square test was also used to compare the distribution of each score between subgroups. The Jonckheere-Terpstra test was used to determine trends in the association between gene expression and risk scores (32,33). Differences were considered statistically significant at P < 0.05. All statistical analyses were performed using R 3.2.0 (http://r-project.org).
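For readers unfamiliar with the Jonckheere-Terpstra test, the statistic can be sketched as follows. This is not the R code used in the study; it is a minimal pure-Python illustration with hypothetical toy data, and the z-score uses the standard large-sample approximation without a tie correction, which is an assumption that holds only when the data contain few or no ties.

```python
from math import sqrt


def jonckheere_terpstra(groups):
    """Return (J, z) for groups listed in the hypothesized increasing order.

    J sums, over every pair of groups (earlier, later), the number of
    observation pairs in which the later-group value is larger; ties count 0.5.
    z is the normal approximation assuming no ties.
    """
    j = 0.0
    for i in range(len(groups)):
        for k in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[k]:
                    if x < y:
                        j += 1.0
                    elif x == y:
                        j += 0.5
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    mean = (n * n - sum(s * s for s in sizes)) / 4.0
    variance = (n * n * (2 * n + 3) - sum(s * s * (2 * s + 3) for s in sizes)) / 72.0
    z = (j - mean) / sqrt(variance)
    return j, z


# Hypothetical expression values in three ordered risk groups
toy_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
j_stat, z_score = jonckheere_terpstra(toy_groups)
print(j_stat, z_score)
```

A large positive z indicates an increasing trend across the ordered groups, and a large negative z indicates a decreasing trend.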

Patient Characteristics
The GenesWell BCT test was used to analyze 795 FFPE tissue samples from patients with pN0-N1, hormone receptor-positive, HER2-negative breast cancer with available RS results, and the BCT score was calculated for 771 patients. Sample availability is described in Supplementary Figure 1. The clinical characteristics of the patients included in the study are summarized in Table 1. All patients were Asians. The median age was 47 years (range, 23-79 years). A total of 66.7% and 33.3% of the patients were aged ≤50 years and >50 years, respectively. Most of the tumors were ductal carcinoma (85.1%), pN0 (80.3%), histologic grade 2 or 3 (82.2%), and nuclear grade 2 or 3 (91.8%).
In the classification of patients according to the BCT score, 68.4% (n = 527) of patients were included in the BCT low-risk group, whereas 31.6% (n = 244) were in the BCT high-risk group (Table 1 and Figure 1A). The proportion of BCT high-risk patients was higher in the node-positive subgroup (53.9%) than in the node-negative subgroup (26.1%) (Figures 1B,C). Patients classified into the BCT high-risk group had significantly larger tumors (P < 0.001), more advanced pN status (P < 0.001), more advanced histologic grade (P < 0.001), and higher nuclear grade (P < 0.001) than those in the BCT low-risk group. No significant differences in age, PR status, or histological type were observed between the two risk groups (Table 1).

RS-Based Risk Classification
Patients were re-classified as low risk, intermediate risk, and high risk according to the RS results. The most frequent RS range was 11-15 (27.9%), followed by 18-25 (27.1%) (Figure 1A). The RS distribution was similar between the lymph node-negative and node-positive subgroups (P = 0.341) (Figures 1B,C). However, a significant difference in the RS distribution according to age was observed in each nodal subgroup (P = 0.020 for the lymph node-negative and P = 0.035 for the node-positive subgroup) (Figure 2). The proportion of patients classified into the high-risk group according to the RS (8.9% using the clinical cut-off and 15.7% using the TAILORx cut-off) was lower than that of patients classified according to the BCT score (31.6%). In contrast to the BCT high-risk group, the RS high-risk group was not significantly associated with advanced pN status. Negative PR status was significantly correlated with a high RS (P < 0.001) (Supplementary Table 1).

Concordance Between the BCT Score and the RS
The concordance in risk stratification between the BCT score and the RS was analyzed using the RS ranges of TAILORx. The overall concordance between the two risk classifications was 71.9% when the RS low-risk and intermediate-risk groups were combined into one group (non-high-risk group, RS 0-25) (Table 2). Of 527 patients in the BCT low-risk group, 480 (91.9%) were classified as non-high risk according to the RS. Subgroup analysis according to nodal status showed that the concordance between the two scores was different in the lymph node-negative and node-positive subgroups. The overall concordance was higher in the lymph node-negative subgroup (76.6%) than in the node-positive subgroup (52.6%) (Table 2).
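The concordance measure used here is the percentage of patients assigned to the same binary category by both tests. The sketch below illustrates that calculation under the stated cut-offs (BCT score ≥4 and RS ≥26 as high risk); the function name and the four example patients are hypothetical and are not study data.

```python
def concordance_percent(bct_scores, recurrence_scores):
    """Percentage of patients for whom BCT (>=4) and RS (>=26) agree.

    RS low and intermediate risk are collapsed into one non-high-risk
    group (RS 0-25), so both classifications are binary.
    """
    pairs = list(zip(bct_scores, recurrence_scores))
    if not pairs:
        raise ValueError("at least one patient is required")
    agreements = sum((bct >= 4) == (rs >= 26) for bct, rs in pairs)
    return 100.0 * agreements / len(pairs)


# Hypothetical patients: (BCT score, RS)
example_bct = [2.1, 5.0, 3.5, 4.2]
example_rs = [10, 30, 27, 18]
print(concordance_percent(example_bct, example_rs))
```

Because this is simple percent agreement, it does not adjust for agreement expected by chance, which should be kept in mind when comparing subgroups with different high-risk proportions.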
We also assessed the concordance between the two scores according to age: ≤50 years and >50 years. Based on recent findings from TAILORx on the benefits of chemotherapy for patients with a midrange RS (11-25) (23), patients were categorized into chemobenefit and non-chemobenefit groups using different RS ranges for each age subgroup. In patients aged ≤50 years, those with RS 0-15 and RS ≥16 were categorized into non-chemobenefit and chemobenefit groups, respectively, whereas in patients aged >50 years, the RS ranges used for the classification into non-chemobenefit and chemobenefit groups were RS 0-25 and RS ≥26, respectively. The overall concordance was higher in women aged >50 years (72.8%) than in those aged ≤50 years (52.9%) (Table 2). However, in each nodal subgroup, the concordance results differed between patients aged ≤50 years and those aged >50 years. In patients with lymph node-negative breast cancer, the concordance was higher in those aged >50 years (77.5%) than in those aged ≤50 years (53.2%) (Table 2). By contrast, in the lymph node-positive subgroup, the concordance was similar between patients aged >50 years (52.1%) and those aged ≤50 years (51.9%) (Table 2). The highest concordance between the two scores was observed in patients aged >50 years with lymph node-negative breast cancer.
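The age-dependent grouping described above amounts to a single threshold that changes at age 50. A minimal sketch, with a hypothetical function name, is:

```python
def rs_chemobenefit(rs: int, age: int) -> bool:
    """Return True if the patient falls in the chemobenefit group.

    Age <=50 years: RS >=16 is chemobenefit (RS 0-15 is non-chemobenefit).
    Age >50 years:  RS >=26 is chemobenefit (RS 0-25 is non-chemobenefit).
    """
    threshold = 16 if age <= 50 else 26
    return rs >= threshold


# The same RS of 20 is grouped differently depending on age
print(rs_chemobenefit(20, age=45), rs_chemobenefit(20, age=60))
```

This makes clear why an RS in the 16-25 range drives much of the age-related difference in concordance: such patients are in the chemobenefit group only if they are aged ≤50 years.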

Comparison of Clinical Risk by Modified Adjuvant! Online With the BCT Score and the RS
The clinical risk of patients was examined using the modified Adjuvant! Online, and the clinical risk classification was compared with that obtained using the BCT score or the RS. Overall, 409 (53.0%) and 362 (47.0%) patients were categorized as clinical low risk and high risk, respectively (Figure 3A). Among patients in the clinical low-risk group, 11.5% and 9.8% were categorized as BCT high risk and RS high risk (≥26), respectively. Among patients in the clinical high-risk group, 45.6% and 77.6% were classified as BCT low risk and RS non-high risk (0-25), respectively. The clinical risk classification differed according to nodal status. The proportion of patients categorized as clinical high risk was higher in the lymph node-positive subgroup (85.5%) than in the node-negative subgroup (37.5%) (Figures 3B,C). The difference between the clinical risk and the risk stratification using the two tests was greater in the lymph node-positive subgroup than in the node-negative subgroup.
Of note, a recent secondary analysis of the TAILORx trial on the integration of clinical risk with the RS showed that the RS ranges predicting chemotherapy benefit differ according to clinical risk in young women aged ≤50 years (34). Clinical low-risk patients with RS 0-20 and RS ≥21 were categorized into non-chemobenefit and chemobenefit groups, respectively, whereas in the clinical high-risk group, the RS ranges used for the classification into non-chemobenefit and chemobenefit groups were RS 0-15 and RS ≥16, respectively. Based on these findings, we further assessed the concordance between the BCT score and the RS in young patients aged ≤50 years. The overall concordance between the two risk classifications was 66.3% (341/514), and a higher concordance was observed in the lymph node-negative subgroup (69.3% [284/410]) than in the node-positive subgroup (54.8% [57/104]) (Table 3). Figure 4 shows the discordant results between the clinical risk and the risk classification using the two tests according to age within each nodal subgroup. In both nodal subgroups, the proportion of patients with discordant results between the clinical risk and the risk by BCT score (i.e., either clinical low risk and BCT high risk or clinical high risk and BCT low risk) was similar across age groups. By contrast, the proportion of patients with discordant results between the clinical risk and the RS risk (i.e., either clinical low risk and RS chemobenefit or clinical high risk and RS non-chemobenefit) differed according to age. The RS categorized a higher proportion of patients into the chemobenefit group among clinical low-risk patients aged ≤50 years (21.2% [55/259] in the lymph node-negative subgroup and 12.5% [2/16] in the node-positive subgroup) than among those aged >50 years (10.2% [13/128]  The risk stratification using the two tests in clinical high- or low-risk patients was different in specific subpopulations. 
In patients aged ≤50 years within the lymph node-negative subgroup (n = 259), 21.2% of clinical low-risk patients were categorized into the chemobenefit group according to the RS, whereas 12.7% of patients were categorized as BCT high risk (Figure 4A). Among clinical high-risk patients aged >50 years in the lymph node-positive subgroup (n = 42), 33.3% and 85.7% were classified as BCT low risk and non-chemobenefit, respectively, according to the RS (Figure 4D).
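The clinical risk-integrated grouping for patients aged ≤50 years, and the percentages reported with their underlying counts, can be reproduced with a few lines. This is an illustrative sketch with hypothetical function names; it uses only the counts stated in the text and is not the analysis code of the study.

```python
def rs_chemobenefit_young(rs: int, clinical_high_risk: bool) -> bool:
    """Chemobenefit grouping for patients aged <=50 years by clinical risk.

    Clinical low risk:  RS >=21 is chemobenefit (RS 0-20 is non-chemobenefit).
    Clinical high risk: RS >=16 is chemobenefit (RS 0-15 is non-chemobenefit).
    """
    threshold = 16 if clinical_high_risk else 21
    return rs >= threshold


def percent(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100.0 * numerator / denominator, 1)


# Concordant / total counts reported for patients aged <=50 years
reported_counts = {
    "overall": (341, 514),
    "node-negative": (284, 410),
    "node-positive": (57, 104),
}

for subgroup, (concordant, total) in reported_counts.items():
    print(subgroup, percent(concordant, total))
```

The node-negative and node-positive counts sum to the overall counts (284 + 57 = 341 and 410 + 104 = 514), which provides a simple internal consistency check on the reported subgroup figures.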
The prognostic value of the two scores was difficult to compare because of the short follow-up period. However, seven patients developed distant metastasis after surgery during the follow-up period in the present study. Both the BCT score and the RS categorized four of these patients as high risk (Supplementary Table 2).

Correlation of ER/PR/HER2 Expression With the BCT Score
The association of the two scores with the gene expression of ESR1, PGR, and ERBB2 was assessed. Consistent with the RS algorithm including ESR1 and PGR expression, there was a statistically significant trend toward lower ESR1 and PGR expression among patients with a higher RS (Jonckheere-Terpstra test, P < 0.001) (Figure 5A). Similarly, PGR expression showed a decreasing trend in correlation with the BCT score (P = 0.046) (Figure 5B). However, ESR1 expression increased as the BCT score increased (P < 0.001). ERBB2 expression showed a decreasing trend as the RS increased (P = 0.029), whereas no significant association between ERBB2 expression and the BCT score was observed. We also evaluated the correlation of the two scores with ER and PR expression by IHC. A negative correlation of ER (P = 0.002) and PR expression (P < 0.001) with the RS was observed (Figure 5C). There was no significant association between ER expression and the BCT score, whereas the BCT score showed a negative correlation with PR expression (P = 0.002) (Figure 5D).

Correlation of the RS With BCT Prognostic Genes
The correlation between the expression of the six prognostic genes included in the BCT score and the RS was also examined. There was a statistically significant trend toward higher expression of the five proliferation-related genes (UBE2C, TOP2A, RRM2, FOXM1, and MKI67) among patients with a higher RS (Jonckheere-Terpstra test, P < 0.001) (Supplementary Figure 2). Although the expression of the immune response-related gene BTN3A2 was negatively associated with the BCT score, it showed an increasing trend in correlation with the RS (P = 0.027).

DISCUSSION
The present study is the first to compare the BCT score and the RS for the risk classification of Asian patients with pN0-N1, hormone receptor-positive, HER2-negative breast cancer. The study is notable because of the inclusion of a large population of Asian patients from several institutions. The present results showed a moderate concordance of 71.9% between the two scores for risk stratification using the RS ranges reported in TAILORx. The discrepancy in the risk classification between the BCT score and the RS may be attributable to the different gene sets and algorithms used to calculate the scores. Moreover, the BCT score algorithm includes clinical factors (tumor size and nodal status), which are not included in the RS. When the RS risk group distribution in this study was compared with that in previous studies, a similar distribution was observed. In the present study, 105 (17.0%), 411 (66.4%), and 103 (16.6%) patients in the lymph node-negative subgroup were classified as low risk, intermediate risk, and high risk, respectively, using the TAILORx cut-off (Supplementary Table 1), comparable to the distribution reported in TAILORx (23). The pooled RS risk group distribution from several studies was 52.6% low risk, 35.9% intermediate risk, and 11.5% high risk when RS risk groups were defined using the original clinical cut-off (35). These results are also similar to our findings.
The results showed that the agreement between the BCT score and the RS differed according to nodal status and age. Better concordance was found in the lymph node-negative subgroup than in the node-positive subgroup and in patients aged >50 years than in those aged ≤50 years. Accordingly, the highest concordance between the two scores for risk classification was observed in patients aged >50 years with lymph node-negative breast cancer. This was related to the differences in risk assignment by the BCT score or the RS according to nodal status or age. The poor concordance in the lymph node-positive subgroup may be associated with the different risk assignment by the BCT score between the two nodal subgroups. The proportion of patients classified as high risk according to the BCT was higher in lymph node-positive than in node-negative patients, whereas the RS yielded a similar pattern of risk assignment between the two subgroups. Given that advanced nodal status is a strong unfavorable prognostic factor (36,37), it is not surprising that the proportion of patients categorized as BCT high risk was higher in the lymph node-positive subgroup than in the node-negative subgroup. By contrast, the distribution of RS ranges differed between the two age subgroups, whereas the BCT score distribution was similar in each age subgroup. This may explain the large difference in risk stratification by the two risk scores in women aged ≤50 years.
Following the previous TAILORx results, a recent secondary analysis of the TAILORx trial further found that clinical risk stratification provided additional prognostic information for hormone receptor-positive, HER2-negative, lymph node-negative breast cancer patients aged ≤50 years with RS 16-25 (34). Importantly, the study showed that there was no benefit from chemotherapy for women aged ≤50 years with RS 16-20 and at clinical low risk, whereas patients with RS 16-25 and at clinical high risk do benefit from chemotherapy. Based on these results, we categorized patients aged ≤50 years into non-chemobenefit and chemobenefit groups using different RS ranges according to clinical risk. Patients with RS 0-20 and RS ≥21 were categorized into non-chemobenefit and chemobenefit groups, respectively, in the clinical low-risk group, whereas in the clinical high-risk group, the RS ranges used for the classification of non-chemobenefit and chemobenefit groups were RS 0-15 and RS ≥16, respectively. We then assessed the concordance in risk stratification between the two tests. Similar to the agreement between the two risk classifications not considering clinical risk, the concordance in patients aged ≤50 years was lower than that in patients aged >50 years. The agreement between clinical risk and risk stratification using the two tests varied depending on age. In the subgroup analysis by age in each nodal subgroup, the proportion of patients with discordant results between clinical risk and RS risk was different between patients aged ≤50 years and those aged >50 years. The risk stratification using the two tests in clinical high- or low-risk patients was different in specific subpopulations, including patients aged ≤50 years with lymph node-negative breast cancer and patients aged >50 years with lymph node-positive breast cancer. These results raise the question of which risk stratification is more appropriate in these subpopulations. 
Moreover, these results suggest the need for further studies to identify a more accurate risk score for predicting the risk of recurrence or chemotherapy benefit in Asian breast cancer patients aged ≤50 years.
Because the clinical data were based on a short follow-up period, a direct comparison of the prognostic and predictive values of the BCT score with those of the RS was not possible in this study. Therefore, the results are not sufficient to determine which test is more accurate for predicting the risk of recurrence or chemotherapy benefit in hormone receptor-positive, HER2-negative early breast cancer. However, the BCT high-risk group was significantly associated with larger tumor size and advanced nodal status, whereas the RS showed no significant relationship with nodal status. Moreover, in a recent study that compared the prognostic value of six multigene signatures in postmenopausal patients with ER-positive, HER2-negative breast cancer, combined genomic and clinical models such as ROR and EPclin were more prognostic for late distant recurrence than other molecular signatures in lymph node-positive patients (15). These findings suggest that the BCT score, which is based on combined gene expression and clinical variables, is likely to have a better prognostic value than the RS in lymph node-positive patients.

CONCLUSIONS
The present results showed a moderate concordance in risk assignment between the two scores, whereas the concordance was lower in patients aged ≤50 years or those with lymph node-positive disease. Further studies are necessary to directly compare the prognostic and predictive values of the two tests in Asian breast cancer patients aged ≤50 years.

DATA AVAILABILITY
All datasets generated for this study are included in the manuscript and/or the Supplementary Files.

ETHICS STATEMENT
The study was approved by the review board of five institutions (Samsung Medical Center, Asan Medical Center, Korea University Guro Hospital, Gangnam Severance Hospital in Seoul, and National Cancer Institute in Gyeonggi-do) in Korea and was performed in accordance with the Declaration of Helsinki. Because the study was retrospective in nature, the requirement for informed consent was waived.

AUTHOR CONTRIBUTIONS
YKS and GG conceived the study and participated in its design. JEL, JJ, SUW, SBL, SL, Y-LC, and YK were involved in data acquisition. MJK and JH drafted the manuscript. MJK, JEL, JJ, SUW, JH, GG, and YKS analyzed and interpreted the data. JH performed statistical analyses. BK, J-EK, YM, and KS provided administrative, technical, or material support. JEL, JJ, SUW, GG, and YKS participated in critical revision of the manuscript with respect to important intellectual content. YKS supervised the study. All authors read and approved the final manuscript.

FUNDING
We are very grateful for the financial support of the Research Institute of Pharmaceutical Sciences, Seoul National University College of Pharmacy.