Validation of the Prognostic Stage of American Joint Committee on Cancer Eighth Edition Staging Manual in Invasive Lobular Carcinoma Compared to Invasive Ductal Carcinoma and Proposal of a Novel Score System

Purpose: The objective of this study was to evaluate the American Joint Committee on Cancer (AJCC) pathological prognostic stage among patients with invasive ductal carcinoma (IDC) and invasive lobular carcinoma (ILC) and to propose a modified score system if necessary. Methods: Women diagnosed with IDC and ILC during 2010–2015 in the Surveillance, Epidemiology, and End Results (SEER) database were retrospectively identified. Disease-specific survival (DSS) and overall survival (OS) were estimated by the Kaplan–Meier method. The predictive performance of the different staging systems was evaluated using Harrell's concordance index (C-index) and the Akaike information criterion (AIC). Multivariate Cox models were fitted to build improved score systems. Results: A total of 184,541 female patients were included in the final analyses, with a median follow-up of 30.0 months. In the IDC cohort, the pathological prognostic stage (C-index, 0.8281; AIC, 110274.5) was superior to the anatomic stage (C-index, 0.8125; AIC, 112537.0; P < 0.001 for C-index) in risk stratification with respect to DSS. In the ILC cohort, the prognostic stage (C-index, 0.8281; AIC, 7124.423) did not outperform the anatomic stage (C-index, 0.8324; AIC, 7144.818; P = 0.748 for C-index) with respect to DSS. Similar results were observed with respect to OS. The score system defined by anatomic stage plus grade plus estrogen receptor and progesterone receptor (AS+GEP) allowed for better staging (C-index, 0.8085; AIC, 7178.448) of ILC patients. Conclusion: Compared with the anatomic stage, the pathological prognostic stage provided more accurate stratification for patients with IDC, but not for patients with ILC. The AS+GEP score system may fit ILC tumors better.


INTRODUCTION
Tumor staging is critical to risk stratification and prognosis prediction in breast cancer. Since its first publication in 1977, the American Joint Committee on Cancer (AJCC) Cancer Staging Manual has been periodically revised and updated to improve its accuracy in stratifying patient outcomes (1). Historically, the standardized classification system was based solely on the anatomic extent of the primary breast tumor, lymph nodes, and metastasis (TNM) (1). With a better understanding of tumor biology, the importance of adding biological factors as a complement to the conventional staging system has been recognized (2)(3)(4)(5). Therefore, the AJCC 8th edition staging manual introduced the prognostic stage system (PS), which combines biomarkers, including estrogen receptor (ER) and progesterone receptor (PR) expression, human epidermal growth factor receptor 2 (HER2) status, tumor grade, and multigene assays when available, with the TNM classification, while maintaining the TNM-based anatomic stage system (AS) (6). After further analysis based on the National Cancer Database (NCDB), the AJCC Breast Expert Panel provided an updated version of the breast staging manual to further refine patient stratification (7). The PS has previously been validated in invasive breast cancer and shown to be superior to the AS (8)(9)(10)(11). However, the prognostic value of the PS in different histological subtypes of breast cancer has not yet been evaluated.
Invasive ductal carcinoma (IDC) and invasive lobular carcinoma (ILC) are the two most common histological types of invasive breast cancer, with IDC occurring in about three fourths of patients and ILC accounting for approximately 10-12% of all cases (12)(13)(14)(15). Previous studies have reported that the clinical and biological characteristics of IDC and ILC differ. Compared with patients with IDC, patients with ILC are generally older at diagnosis and have larger tumors, lower tumor grade, and more frequent lymph node involvement (12,14,16). Regarding biomarkers, ILC is more likely to be ER/PR-positive and HER2-negative (12,14). From the treatment perspective, ILC has been reported to be less sensitive to chemotherapy than IDC (13,17), even in the genomic intermediate/high risk group (18). Despite these distinctive differences, studies focused specifically on ILC remain scarce. Until now, the prognostic value of the PS has not been evaluated exclusively in this histological type of breast cancer.
The objective of our study was to assess and compare the predictive performance of the AS and the PS in both the IDC and ILC cohorts and, if the current PS did not perform well in the ILC subtype, to propose a modified prognostic staging score system.

Data Source and Study Cohort
This retrospective study was conducted using data from the Surveillance, Epidemiology, and End Results (SEER) database, which collects data from 18 population-based cancer registries representing approximately 28% of the US population.
Patients meeting the following inclusion criteria were identified as potentially eligible: (1) female; (2) year of diagnosis from 2010 to 2015; (3) histologically confirmed breast cancer as the primary and only malignant tumor; (4) histological subtype of IDC (8500/3) or ILC (8520/3) according to the International Classification of Diseases for Oncology, Third Edition (ICD-O-3); (5) surgical treatment with mastectomy or breast-conserving surgery (BCS). Patients without available information on biomarkers, including tumor grade, ER, PR, and HER2 status, and patients diagnosed by death certificate or autopsy only were further excluded.
Data retrieved from the SEER database included the following: age at diagnosis; race; histological subtype; anatomic features, including tumor size and lymph node involvement; biomarkers, including tumor grade, ER, PR, and HER2 status; and treatment information such as surgical procedure, chemotherapy, and radiation. Patients were assigned to stages according to the AS and the PS in the AJCC 8th edition staging manual (7). The AS was defined by the traditional TNM classification, while the PS was defined by the TNM classification plus additional biomarkers, including tumor grade, ER, PR, and HER2 status. The updated version of the staging manual divides the PS into a clinical prognostic stage and a pathological prognostic stage. In this study, the pathological prognostic stage was applied, and PS refers to the pathological prognostic stage throughout.

Statistical Analysis
The demographic and tumor characteristics were compared between patients with IDC and ILC using Pearson's Chi-square test, or Fisher's exact test when necessary. Disease-specific survival (DSS) was calculated from the time of diagnosis to the time of death from breast cancer. Overall survival (OS) was calculated from the time of diagnosis to the time of death from any cause. Survival was estimated by the Kaplan-Meier method and compared by the log-rank test. The Cox proportional hazards model was used to analyze the univariate and multivariate association of each potential prognostic factor with DSS and OS, and to calculate hazard ratios (HRs) and 95% confidence intervals (CIs). The predictive performance of the different staging systems was quantified and compared using Harrell's concordance index (C-index) and the Akaike information criterion (AIC), both calculated from Cox models adjusted for age at diagnosis, race, surgery type, and receipt of chemotherapy and radiation. A higher C-index indicates better discriminatory ability of a staging system (19). A lower AIC indicates a more effective model for predicting outcomes (20).
A two-tailed P < 0.05 was considered statistically significant. All statistical analyses were conducted using Stata (version 14.0; StataCorp, College Station, TX, USA).
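The two performance measures named above can be illustrated with a minimal Python sketch. It is not the Stata code used in the study: the function names, the toy follow-up data, and the log-likelihood value are hypothetical, and the handling of tied risk scores (counted as half concordant) is one common convention assumed here. In the study, the risk score would be the linear predictor of the adjusted Cox model and the AIC would be computed from that model's partial log-likelihood.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of usable pairs in which the subject with the
    higher predicted risk is the one who fails first."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when subject i has an observed event
            # strictly before subject j's event or censoring time.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # assumed convention for tied scores
    return concordant / usable


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical toy data: follow-up in months, event flag (1 = death from
# breast cancer, 0 = censored), and a model-derived risk score.
toy_times = [5, 10, 15, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [3, 2, 2, 1, 0]

toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
toy_aic = aic(log_likelihood=-3560.2, n_params=5)  # hypothetical values
print(toy_c, toy_aic)
```

On these toy data, 7.5 of the 8 usable pairs are concordant, giving a C-index of 0.9375. A model that ranks patients no better than chance would score about 0.5.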

Model Building
In line with other published studies (3,5), DSS was chosen as the clinical endpoint for creating the staging score systems. The AS was used as the reference stage from which the novel scoring system was derived. Univariate analyses were conducted to evaluate the association between DSS and potential prognostic factors, including tumor grade, ER status, PR status, and HER2 status. Cox models based on the AS were fitted to assess the prognostic significance of adding other candidate factors. Only factors associated with DSS in univariate analyses (P < 0.05) could be included in the multivariate Cox models. Accordingly, the first model was based on the AS alone. The second model incorporated the AS and tumor grade. The third model included the AS, tumor grade, ER status, and PR status. HER2 status was excluded because it was not significantly associated with DSS, as explained further in the results section. Scoring systems were created from the multivariate results of the three Cox models. Scores were assigned to each independent prognostic factor for DSS (P < 0.05). For binary variables, the comparison group with a significant impact on DSS was assigned one point. For ordinal variables, comparison groups with a significant impact on DSS were assigned one point for an HR between 1.01 and 4, two points for an HR between 4.01 and 8, three points for an HR between 8.01 and 12, and four points for an HR over 12. The final score was obtained by summing the scores for all independent predictors of DSS. In total, three score systems were created. The first score system was based solely on the AS. The second score system included the AS and tumor grade (AS+G). The third score system involved the AS, tumor grade, ER status, and PR status (AS+GEP). Predictive performance was quantified and compared using the C-index and AIC (19,20).
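The point-assignment rule described above can be sketched in Python as follows. This is an illustration only: the function names and the example hazard ratios are hypothetical and are not the values reported in Table 5. The sketch also assumes that a factor level with an HR of 1 or below receives no points, a case the rule does not address.

```python
def assign_points(hr, p_value, ordinal=True, alpha=0.05):
    """Map one factor level's hazard ratio to score points.

    Binary factors: 1 point if the comparison group is significant.
    Ordinal factors: 1 point for HR up to 4, 2 for HR up to 8,
    3 for HR up to 12, and 4 for HR over 12.
    """
    # Assumption: non-significant levels and HR <= 1 contribute no points.
    if p_value >= alpha or hr <= 1.0:
        return 0
    if not ordinal:
        return 1
    if hr <= 4:
        return 1
    if hr <= 8:
        return 2
    if hr <= 12:
        return 3
    return 4


def total_score(levels):
    """Sum the points over all factor levels that apply to one patient."""
    return sum(assign_points(**level) for level in levels)


# Hypothetical patient; the HRs and P values below are illustrative only.
example_patient = [
    {"hr": 9.5, "p_value": 0.001, "ordinal": True},   # anatomic stage level
    {"hr": 2.1, "p_value": 0.001, "ordinal": True},   # tumor grade level
    {"hr": 1.8, "p_value": 0.010, "ordinal": False},  # ER-negative
    {"hr": 1.5, "p_value": 0.200, "ordinal": False},  # not significant
]
example_total = total_score(example_patient)
print(example_total)
```

With these made-up inputs the patient receives 3 + 1 + 1 + 0 = 5 points. In the AS+GEP system, the summed score defines the score group used for stratification.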

Clinical and Biological Features
A total of 201,075 patients in the SEER database met the eligibility criteria. Among 180,652 patients diagnosed with IDC, 10,413 (5.7%) patients with unknown ER, PR, or HER2 status and 4,155 (2.3%) patients without tumor grade information were further excluded. Among 20,423 patients diagnosed with ILC, 970 (4.7%) patients with unknown ER, PR, or HER2 status and 996 (4.9%) patients without tumor grade information were further excluded. A total of 184,541 female patients were included in the final study. The IDC cohort consisted of 166,084 (89.9%) patients, while the ILC cohort consisted of 18,457 (10.1%) patients. The median age of the whole population was 60 years (range 18-98). The demographic and clinicopathological characteristics and treatment disposition of each cohort are summarized in Table 1.
Distinct differences in clinicopathological features were observed between the IDC and ILC cohorts. The percentage of patients aged 60 and younger was significantly higher in the IDC cohort than in the ILC cohort (53.3 vs. 42.6%, P < 0.001). Patients with ILC were more likely than those with IDC to undergo mastectomy rather than BCS (49.6 vs. 39.1%, P < 0.001). Significant differences were observed in the pT and pN stage distributions between patients with IDC and ILC (P < 0.001): patients with ILC had larger tumors and more lymph node involvement. With regard to biomarkers, ER-positive and PR-positive tumors were more common in the ILC cohort than in the IDC cohort (ER: 98.4 vs. 81.0%, P < 0.001; PR:
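The cohort counts in the first paragraph of this subsection can be checked with a few lines of Python. The counts are taken from the text; the dictionary layout and variable names are assumptions made for this sketch.

```python
# Eligible patients and exclusions per histological subtype, as reported.
cohort_flow = {
    "IDC": {"eligible": 180652, "unknown_receptor": 10413, "unknown_grade": 4155},
    "ILC": {"eligible": 20423, "unknown_receptor": 970, "unknown_grade": 996},
}

# Final cohort size = eligible minus both exclusion groups.
final_counts = {
    subtype: c["eligible"] - c["unknown_receptor"] - c["unknown_grade"]
    for subtype, c in cohort_flow.items()
}
total_eligible = sum(c["eligible"] for c in cohort_flow.values())
total_included = sum(final_counts.values())

print(final_counts, total_eligible, total_included)
```

The eligible counts sum to 201,075, and the final cohorts of 166,084 (IDC) and 18,457 (ILC) patients sum to the 184,541 patients analyzed.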

Stage Distribution and Migration
Patients in both cohorts were restaged according to the AS and the PS proposed in the AJCC 8th edition staging manual. The distributions of stages under the AS and the PS are listed in Table 3.

Survival Outcomes and Comparison of Predictive Performance
In this study, the median follow-up duration was 30.0 months (range, 0-71 months). The estimated 4-year DSS and OS by stage and histological subtype are summarized in Table 2, and the corresponding survival curves are shown in Figure 1. In univariate analyses, DSS differed significantly among stage groups under both the AS and the PS in both cohorts (all P < 0.001, Table 2). Similar results were observed for OS. Cox proportional hazards regression models adjusted for age, race, surgery type, and receipt of chemotherapy and radiation were used for subsequent statistical analyses. In multivariate analyses using stage IA as the reference, the differences in DSS and OS among stage groups remained significant in both cohorts (all P < 0.001, Table 4).
In the IDC cohort, the C-index was 0.8454 for the PS vs. 0.8125 for the AS, and the AIC was 110274.5 for the PS vs. 112537.0 for the AS, according to the Cox model using DSS as the endpoint (Figure 1). The PS had a significantly higher C-index (P < 0.001) and a lower AIC than the AS, indicating that the PS was a more effective model for predicting prognosis among patients with IDC. Similar results were seen for OS (Figure 1).
In the ILC cohort, the C-index was 0.8281 for the PS vs. 0.8324 for the AS according to the Cox model using DSS as the endpoint. There was no significant difference in C-index between the AS and the PS (P = 0.748). Moreover, the AIC was similar between the AS and the PS (7144.8 vs. 7124.4), indicating that the PS was not superior to the AS in predicting prognosis. Similar results were seen for OS (Figure 1).
Likewise, in subgroup analyses excluding patients who received chemotherapy, and in analyses restricted to patients with ER-positive and HER2-negative tumors, the PS provided better risk stratification than the AS, with a significantly higher C-index and lower AIC, in the IDC cohort but not in the ILC cohort (Supplement Tables 1, 2).
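As a reminder of how a 4-year survival estimate is obtained from censored follow-up, the following minimal Python sketch implements the Kaplan-Meier product-limit estimator. It is an illustration, not the study's Stata code: the function names and the six-patient toy data set are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate)."""
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve, horizon):
    """Read the step function at a given horizon (same unit as the times)."""
    estimate = 1.0
    for t, value in curve:
        if t <= horizon:
            estimate = value
    return estimate


# Hypothetical follow-up in months; 1 = death from breast cancer, 0 = censored.
toy_months = [12, 24, 30, 40, 48, 60]
toy_status = [1, 0, 1, 0, 0, 0]

toy_curve = kaplan_meier(toy_months, toy_status)
toy_4yr_dss = survival_at(toy_curve, horizon=48)
print(toy_4yr_dss)
```

In this toy example the estimate drops to 5/6 at 12 months and to 5/6 × 3/4 = 0.625 at 30 months, so the 4-year (48-month) DSS is 62.5%. Censored patients leave the risk set without lowering the curve.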

New Score System for ILC
Revised score systems were developed for ILC tumors to optimize risk stratification among patients with ILC.
The results of univariate and multivariate analyses of potential prognostic factors associated with DSS are summarized in Table 3. In univariate analyses, HER2 status was not related to DSS (P = 0.253), while the AS, tumor grade, ER status, and PR status were all prognostic factors for DSS in the ILC cohort (all P < 0.001). Therefore, multivariate models including these prognostic factors were constructed to assess the prognostic value of each factor and to determine score assignment. The first Cox model was based on the AS. The second Cox model combined the AS and tumor grade. The third Cox model incorporated the AS, tumor grade, ER status, and PR status. In multivariate analyses, patients with stage IIA, IIB, IIIA, IIIB, and IIIC disease had worse DSS than stage IA patients. Tumor grade, ER status, and PR status were all independently associated with DSS in each model that included them (all P < 0.001). Accordingly, three score systems were established, and scores were assigned to these independent predictors based on hazard ratios as described in the Methods. The detailed point assignments for the independent predictors of DSS are shown in Table 5.
The first score system was based solely on the AS. The second score system included the AS and tumor grade (AS+G). The third score system involved the AS, tumor grade, ER status, and PR status (AS+GEP). Figure 2 shows the DSS curves for each score system. Under all three score systems, DSS differed significantly among patients in different score groups (all P < 0.001). The AS+GEP score system had a higher C-index (0.8085 vs. 0.7925, P = 0.002) and a lower AIC (7178.448 vs. 7247.481) than the AS score system, indicating that integrating tumor grade, ER status, and PR status with the AS could improve the stratification ability of the score system. The estimated 4-year DSS outcomes for the ILC cohort by AS+GEP score are listed in Table 6. Sensitivity analyses among patients without chemotherapy and among patients with ER-positive and HER2-negative tumors also showed that the AS+GEP score system was superior to the AS score, with a lower AIC and a higher C-index, although the difference in C-index was not statistically significant (Supplement Tables 3, 4).
Figure legend: AJCC, American Joint Committee on Cancer; AS, anatomic stage; PS, prognostic stage. Percent frequency in the boxes represents the distribution of prognostic stages within the same anatomic stage (e.g., among patients with IDC, 91.02% of anatomic stage IA patients remained stage IA and 8.98% were upstaged to stage IB when the prognostic stage system was applied). Red boxes represent patients with an upstaged prognostic stage, green boxes represent those who were downstaged after the prognostic stage system was applied, and blue boxes represent those whose stage was unchanged.

DISCUSSION
With the development of tumor biology research, it is well acknowledged that biomarkers can provide prognostic information beyond tumor size and lymph node status (21)(22)(23)(24). Accordingly, the AJCC 8th edition staging manual introduced ER, PR, HER2 status, and tumor grade into the staging system to refine risk stratification. Our study was conducted to validate and evaluate the pathological prognostic staging system in patients with IDC and ILC, the two most common histological types of invasive breast cancer.
Previous studies have validated the superiority of the PS over the AS in predicting survival (8)(9)(10)(11). Weiss et al. reported that the PS provided more accurate stratification than the AS in cohorts from both MD Anderson Cancer Center and the California Cancer Registry (10). Wang    (8). However, previous series mainly compared the AS and the PS in invasive breast cancer populations dominated by IDC, and none compared them across histological subtypes. The potential impact of histological subtype on the predictive value of the new staging system remained to be investigated. To our knowledge, this study is the first large population-based report to validate the prognostic value of the PS from the AJCC 8th edition staging manual in both IDC and ILC cohorts. As described in previous series (14)(15)(16), distinctive differences in tumor features, treatment options, and recurrence patterns have been observed between patients with IDC and patients with ILC. It was therefore important to analyze the prognostic value of the PS in the two histological subtypes separately. In concordance with previously published studies (9,25), the PS was superior to the AS in providing risk stratification information among patients with IDC. However, the PS did not outperform the AS in predicting prognosis among patients with ILC according to our analyses.
Several explanations are possible. To begin with, the disparity between IDC and ILC in the distribution of clinicopathological features may have contributed to the divergent predictive performance of the AS and the PS. In line with previous series (14), our data showed that ILC was associated with a heavier tumor burden at diagnosis, lower tumor grade, and a higher percentage of hormone receptor (HR)-positive and HER2-negative tumors compared with IDC. Additionally, biomarkers may carry different prognostic weight in IDC and ILC, which may influence the predictive performance of the PS. It has been reported that tumor grade similarly affected the   (26,27). The magnitude of benefit from adjuvant letrozole was also shown to be greater in ILC in analyses of the Breast International Group (BIG) 1-98 population (27). The different responses of ILC and IDC to systemic therapy may to some extent affect the ability of the PS to stratify patients.
As the PS proposed in the AJCC 8th edition staging manual failed to show superiority to the AS in risk stratification among ILC patients, this study established a new point-based risk score system specific to ILC tumors to refine the staging system. The major finding was that the AS+GEP score system, consisting of the AS, tumor grade, ER status, and PR status, had the highest C-index and lowest AIC, indicating that a score system including biomarkers allowed for more refined patient classification in the ILC population than one based merely on anatomic factors. According to the AS+GEP score system, 68.8% of patients had a score of 0-2, with a corresponding 4-year DSS > 97%. Our analyses indicated that the PS could not improve risk stratification beyond the AS after downstaging 50.5% of patients and upstaging 0.6% of patients. Unlike the PS, the novel score system left out HER2 status because it was not significantly associated with DSS in our analyses, and this might lead to more rational stage migration. Moreover, the AS+GEP score system is more concise and easier to use in clinical practice than the PS. Sensitivity analyses among patients without chemotherapy and among patients with ER-positive and HER2-negative tumors further supported the superiority of the AS+GEP score system, with a lower AIC and a higher C-index than the AS score system, although the difference in C-index was not statistically significant. Because the AS+GEP score system incorporates ER status, its ability to stratify risk might be slightly weakened when analyses are restricted to ER-positive and HER2-negative patients. In addition, the non-significantly higher C-index of the AS+GEP score system compared with the AS among patients without chemotherapy might suggest that its superior predictive performance was possibly due to better risk stratification of higher-risk patients.
The current study has several limitations. One limitation lay in the lack of Oncotype DX recurrence score (RS) data. The PS incorporates the RS into the staging system and downstages patients with T1-2N0M0, ER-positive, and HER2-negative tumors to stage IA when the RS is < 11, because these patients were observed to have exceptional survival outcomes (7). As in other published studies validating the PS, our analyses did not include the RS in the PS because RS data were unavailable. Another limitation was that patients receiving neoadjuvant systemic therapy could not be excluded from the study cohort because of the insufficient treatment information provided by the SEER database. However, subgroup analyses excluding patients who received chemotherapy were conducted to reduce bias, and similar results were observed, which further supported our main finding. The relatively short follow-up was also a major limitation of our study. Because HER2 status was not recorded in the SEER database until 2010, patient selection was restricted to 2010-2015, which resulted in limited follow-up. In particular, ILC is characterized by a higher likelihood of late recurrence than IDC, so the median follow-up of 30.0 months might be inadequate for survival analyses in the ILC cohort. Further studies with longer follow-up are needed to reach more robust conclusions. Moreover, information about the receipt of anti-HER2 therapy among HER2-positive patients is not provided in the SEER database, which constituted another limitation of our study. However, a great majority of patients with HER2-positive tumors may have received anti-HER2 therapy because only patients treated between 2010 and 2015 were included in the analyses. Other limitations are those inherent in retrospective analyses.

CONCLUSION
In conclusion, the current study showed that the PS was superior to the AS in risk stratification among patients with IDC, while it failed to outperform the AS among patients with ILC. Among the risk score systems designed specifically for ILC tumors, the AS+GEP score system provided the most precise prognostic information. Further studies should strive to refine the staging system for patients with specific breast cancer subtypes.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. These data can be found here: National Cancer Institute's SEER program (https://seer.cancer.gov/).

AUTHOR CONTRIBUTIONS
SD, JW, and LZ: conceptualization. CL, WeilC, and DL: methodology. SD and JW: formal analysis, investigation, and writing-original draft preparation. LZ and YZ: writing-review and editing. LZ: funding acquisition. WeigC, YL, KS, and LZ: supervision. All authors contributed to the article and approved the submitted version.

FUNDING
This study was supported by National Natural Science Foundation of China (Grant No. 81572581) and Technology Innovation Act Plan of Shanghai Municipal Science and Technology Commission (Grant No. 16411966900). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.