Prognostic Value of the New Prostate Cancer International Society of Urological Pathology Grade Groups

Gleason grading is the best independent predictor for prostate cancer (PCa) progression. Recently, a new PCa grading system has been introduced by the International Society of Urological Pathology (ISUP) and is recommended by the World Health Organization (WHO). Subsequent studies observed more accurate and simplified grade stratification with the new system. The aim of this study was to compare the prognostic value of the new Grade Groups with that of the former Gleason Grading and to determine whether re-definition of Gleason pattern 4 might reduce upgrading from prostate biopsy to radical prostatectomy (RP) specimen. A cohort of men undergoing RP from 2002 to 2015 at the Hospital of Goeppingen (Goeppingen, Germany) was used for this study. In total, 339 pre-operative prostatic biopsies and corresponding RP specimens, as well as an additional 203 RP specimens, were re-reviewed for Grade Groups according to the ISUP. Biochemical recurrence-free survival (BFS) after surgery was used as the endpoint to analyze prognostic significance. Other clinicopathological data included TNM-stage and pre-operative PSA level. Kaplan–Meier analysis showed that both the former Gleason Grading and the ISUP Grade Groups stratified patients by risk, and the stratification was statistically significant in the log-rank test (p < 0.001). Both grading systems significantly correlated with TNM-stage and pre-operative PSA level (p < 0.001). A higher tumor grade in the RP specimen than in the corresponding pre-operative biopsy was observed in 44.0% and 34.5% of cases considering the former Gleason Grading and the ISUP Grade Groups, respectively. Both the former Gleason Grading and the ISUP Grade Groups predict survival when applied to tumors in prostatic biopsies as well as RP specimens. This is the first validation study on a large representative German community-based cohort to compare the former Gleason Grading with the recently introduced ISUP Grade Groups. Our data indicate that the ISUP Grade Groups do not improve the predictive value of PCa grading and might be less sensitive in distinguishing tumors with 3 + 4 and 4 + 3 pattern on RP specimens. However, the Grade Group system less frequently results in upgrading from biopsy to the corresponding RP specimen, indicating a lower risk of missing potentially aggressive tumors not represented on biopsies.

Keywords: prostate cancer, Gleason score, International Society of Urological Pathology Grade Groups, prognostic biomarker, cancer grading

Introduction
Prostate cancer (PCa) is the most common cancer type among men worldwide, accounting for more than 20% of all newly diagnosed cancers (1). Patients show a highly variable course of disease, resulting in a major challenge for clinical management (2). Consequently, it is of utmost importance to stratify patients with early-stage disease into risk groups that predict the probability of the tumor remaining indolent or progressing to an aggressive form of PCa. In addition to clinical stage and serum PSA level, Gleason grading of the tumor is a powerful prognostic marker at the time of diagnosis and has a major impact on therapy decisions (3). A limitation of Gleason grading is upgrading from biopsy to radical prostatectomy (RP) specimens, most often in lower grade tumors, which is associated with worse patient outcome (4). In addition, considerable interobserver variability limits the reproducibility of PCa grading in a subset of cases (5).
Since its introduction (6), the Gleason Grading system has undergone several revisions in order to improve reproducibility and prognostic value (7). Most recently, a new grading system has been proposed by the International Society of Urological Pathology (ISUP) (8) and integrated into the 2016 edition of the WHO Classification of Tumours of the Urinary System and Male Genital Organs (9). Major modifications of the ISUP Grade Groups compared with the last update in 2010 include the definition of five distinct Grade Groups based on the Gleason score (Grade Group 1 = Gleason score ≤6, Grade Group 2 = Gleason score 3 + 4 = 7, Grade Group 3 = Gleason score 4 + 3 = 7, Grade Group 4 = Gleason score 8, Grade Group 5 = Gleason score ≥9), as well as modified morphological criteria for Gleason pattern 4 (9). Consequently, various growth patterns, including ill-formed, fused, cribriform, and glomeruloid glands, are considered Gleason grade 4 tumors. In consideration of a multi-institutional validation study (10), the ISUP Grade Groups emerged as a more accurate and simplified classification to stratify tumors than the current system (9).
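The score-based part of the Grade Group definition given above can be written as a small lookup. The following Python sketch is illustrative only: the function name `isup_grade_group` and its arguments are assumptions introduced here, and the sketch does not capture the modified morphological criteria for Gleason pattern 4, which require histological assessment.

```python
def isup_grade_group(primary: int, secondary: int) -> int:
    """Map a Gleason score (primary + secondary pattern) to Grade Group 1-5.

    Follows only the score-based mapping quoted in the text:
    <=6 -> 1, 3+4 -> 2, 4+3 -> 3, 8 -> 4, >=9 -> 5.
    """
    for pattern in (primary, secondary):
        if not 1 <= pattern <= 5:
            raise ValueError("Gleason patterns must be between 1 and 5")

    score = primary + secondary
    if score <= 6:
        return 1
    if score == 7:
        # The text defines Grade Groups 2 and 3 only for 3 + 4 and 4 + 3.
        if (primary, secondary) == (3, 4):
            return 2
        if (primary, secondary) == (4, 3):
            return 3
        raise ValueError("score 7 is only defined here as 3 + 4 or 4 + 3")
    if score == 8:
        return 4
    return 5  # score 9 or 10
```

Keeping the primary and secondary pattern as separate arguments is what allows the sketch to separate 3 + 4 from 4 + 3, which a summed score of 7 alone cannot do.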
A number of recently published studies independently validated the ISUP Grade Groups as prognostic marker for biochemical recurrence (BR) as well as disease-specific death of patients (11–14). While some observed prognostic benefits of the new classification, others reported no significant difference to the former Gleason Grading (12,15). It has to be considered that the mentioned studies vary regarding end points for survival analysis and treatment strategies of patients, resulting in limited comparability.
The aim of our study was to compare the prognostic value as well as the frequency of upgrading from diagnostic biopsy to RP specimen between the former Gleason Grading and the ISUP Grade Groups. Diagnostic biopsies and RP specimens were re-reviewed and graded according to the new classification.

Materials and Methods
A cohort including 339 pre-operative prostatic biopsies and corresponding RP specimens, as well as an additional 203 RP specimens, from patients who were treated between 2002 and 2015 at the Hospital of Goeppingen (Goeppingen, Germany) was used for the present study. Only biopsies from patients who subsequently underwent RP were included in survival analyses. All RP specimens and biopsies were initially graded by pathologists of the Hospital of Goeppingen and afterward re-graded centrally by an expert GU pathologist according to the current ISUP grading system. BR was defined as a postoperative PSA increase to ≥0.2 ng/ml. Patients' characteristics are summarized in Table 1.
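As an illustration of the BR definition, the following Python sketch derives the time of BR from a series of postoperative PSA measurements. The function name, the `(months, psa)` input layout, and the choice to flag the first measurement at or above the threshold are assumptions made for this example; the text does not state whether a confirmatory measurement was required.

```python
from typing import Iterable, Optional, Tuple

BR_THRESHOLD_NG_ML = 0.2  # threshold stated in the text


def time_to_biochemical_recurrence(
    psa_series: Iterable[Tuple[float, float]],
) -> Optional[float]:
    """Return the follow-up time (months) of the first PSA >= 0.2 ng/ml.

    psa_series holds (months_after_surgery, psa_ng_ml) pairs in any order.
    Returns None if the threshold is never reached (no BR observed).
    Assumption: a single measurement at or above the threshold counts as BR.
    """
    for months, psa in sorted(psa_series):
        if psa >= BR_THRESHOLD_NG_ML:
            return months
    return None
```

A patient for whom the function returns `None` would enter the survival analysis as censored at the last follow-up.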
Ethical approval for using human material in this study was obtained from the Internal Review Board of the University Hospital of Bonn (264/11). The study participants were anonymized before their specimens were included in this retrospective study cohort.
Chi-square tests were used for comparison of BR between the former Gleason Grading and the ISUP Grade Groups as categorical variables. To analyze univariable differences in BFS between the former Gleason Grading and the ISUP Grade Groups, the log-rank test was used. Kaplan-Meier curves illustrate BFS after treatment stratified by the former Gleason Grading and the ISUP Grade Groups. Multivariable Cox proportional hazards models were fitted to identify independent prognostic factors. Grading on biopsies was adjusted for the log of pre-operative PSA (≤10 vs. >10 ng/ml), and grading on RP specimens was adjusted for the log of pre-operative PSA (≤10 vs. >10 ng/ml), pathological T-stage (pT2, pT3a, pT3b, pT4), lymph node status (pN0, pN1), and surgical margin status (pR0, pR1). Gleason Score 3 + 3 and Grade Group 1 were used as reference groups for hazard ratios.
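For readers less familiar with the Kaplan-Meier method, the product-limit estimate underlying such curves can be sketched in a few lines of Python. This is a generic textbook implementation, not the software used in this study; the function name `kaplan_meier` and its inputs are assumptions for illustration, with `events[i]` set to True when BR was observed and False when the patient was censored.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float], events: Sequence[bool]
) -> List[Tuple[float, float]]:
    """Product-limit estimate of the survival function.

    times  : follow-up time per patient (e.g., months after RP)
    events : True if BR occurred at that time, False if censored
    Returns (time, estimated BFS probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []

    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        n_at_risk -= n_leaving
    return curve
```

Applying the function separately to each grade category yields one step curve per group; the value of a curve at 60 months corresponds to a 5-year BFS rate.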
Study design, patient characteristics, methods, and statistical analysis as well as data presentation and discussion have been performed according to the REMARK (REporting recommendations for tumor MARKer prognostic studies) guidelines (16).

Results
Biochemical recurrence-free survival was used as the end point to compare the prognostic value of the different grading systems. The frequency of BR at any time point after treatment increases with both rising former Gleason Grading and ISUP Grade Groups as assessed on RP specimens as well as diagnostic biopsies (Table 2). The rate of BR of tumors graded as 3 + 3 on RP (9.2%) as well as on biopsies (21.4%) was higher than that of tumors graded as Grade Group 1 (5.9% at RP, 13.7% at biopsy). In RP specimens, we observed a lower frequency of BR in Grade Group 4 tumors (42.9%) than in Grade Group 3 tumors (53.7%); the rate in Grade Group 4 was nevertheless clearly higher than in Grade Group 2 (23.1%) and lower than in Grade Group 5 (71.9%) tumors (Table 2). Statistical analysis revealed a significant association between both the former Gleason Grading and the ISUP Grade Groups and the risk for BR (Chi-square, p < 0.001) (Table 2).
Kaplan-Meier curves illustrate reduced cumulative BFS time in tumors with higher former Gleason Grading and ISUP Grade Groups at both RP and diagnostic biopsy, allowing significant risk stratification of patients (Figures 1A,B and 2A,B). Concordantly, higher former Gleason Grade and ISUP Grade Group are associated with lower 5-year BFS rates of patients (Figures 1C and 2C). The [...] (Figures 1B,C).
Using former Gleason Score 3 + 3 and ISUP Grade Group 1 as reference groups in multivariable Cox analysis, RP specimens with former Gleason Score (ISUP Grade Group) 3 + 4 (2), 4 + 3 (3), 8 (4), and ≥9 (5) are associated with a 1.78 (4.08), 4.81 (10.25), 6.53 (7.81), and 6.57 (12.93) times increased hazard for BR, respectively (Table 3). When determined on diagnostic biopsies, former Gleason Score (ISUP Grade Group) 3 + 4 (2), 4 + 3 (3), 8 (4), and ≥9 (5) is associated with a 2.07 (3.25), 3.55 (3.98), 4.55 (4.93), and 7.11 (9.40) times increased hazard for BR, respectively (Table 4). In multivariate analyses, RP specimens were adjusted for lymph node status, surgical margin status, T-stage, and pre-operative PSA (Table 3), and biopsies for pre-operative PSA (Table 4). The 1.78 times increased hazard for BR of RP specimens graded as 3 + 4 by the former Gleason Grading relative to 3 + 3 tumors was not statistically significant (p = 0.058) (Table 3). Collectively, hazard ratios for BR are higher for tumors graded by the ISUP Grade Groups compared to the former Gleason Grading.
Overall, in nine patients, disease recurrence appeared as local tumor recurrence or development of distant metastases (assigned as clinical recurrence), partially after diagnosis of BR. Only six patients died from PCa. Associations between the different grading systems and clinical recurrence and disease-specific death are listed in Table 2.
Both the former Gleason Grading and the ISUP Grade Groups significantly correlate with T-stage, lymph node status and preoperative PSA level (Chi-square p < 0.001).
Upgrading from diagnostic biopsy to RP specimen occurred in 34.5% of cases considering the ISUP Grade Groups, and more often considering the former Gleason Grading (44.0%) (Table 5). Independent of the grading system, the vast majority of upgraded cases were low-grade tumors. Frequency and distribution of upgrading are listed in Table 5.
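The upgrading frequency reported here is the share of paired cases in which the RP grade exceeds the biopsy grade. A minimal Python sketch of that calculation follows; the function names and the encoding of grades as ordinal integers (for example, Grade Groups 1 to 5, or ranks 1 to 5 for former Gleason Scores 3 + 3, 3 + 4, 4 + 3, 8, and ≥9) are assumptions for illustration and do not reproduce the study data.

```python
from collections import Counter
from typing import Iterable, List, Tuple

GradePair = Tuple[int, int]  # (biopsy_grade, rp_grade) as ordinal ranks


def upgrading_rate(pairs: Iterable[GradePair]) -> float:
    """Fraction of paired cases with a higher grade at RP than at biopsy."""
    cases: List[GradePair] = list(pairs)
    if not cases:
        raise ValueError("at least one biopsy/RP pair is required")
    upgraded = sum(1 for biopsy, rp in cases if rp > biopsy)
    return upgraded / len(cases)


def upgrading_distribution(pairs: Iterable[GradePair]) -> Counter:
    """Count upgraded cases per (biopsy_grade, rp_grade) transition."""
    return Counter((biopsy, rp) for biopsy, rp in pairs if rp > biopsy)
```

Running both functions once with the former Gleason ranks and once with the Grade Groups of the same paired cases gives the two upgrading frequencies and their distributions side by side.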

Discussion
Collectively, our findings confirm that the recently introduced ISUP Grade Groups independently predict BR after treatment when applied to both RP specimens and diagnostic biopsies. Thus, these data support previous suggestions of the ISUP meeting 2014 to include ISUP Grade Groups in pathology reports (9).
Patients with low-risk disease are consistently distinguished from those at higher risk, as tumors on diagnostic biopsies with ISUP Grade Group 2, 3, 4, and 5 exhibited a 3.25, 3.98, 4.93, and 9.40 times increased hazard for BR compared to ISUP Grade Group 1 tumors, respectively (Table 4). Importantly, the therapy decision is based on the grading of diagnostic biopsies, PSA blood level, clinical tumor stage, and individual factors. With the objective of avoiding over-treatment of indolent PCa, alternative regimens such as active surveillance and watchful waiting are increasingly applied worldwide (17). Among others, criteria for active surveillance include Gleason Score ≤6 in biopsy specimens to select patients with potential low-risk disease (18). Based on our results showing a 5-year BFS rate of 88.8% in ISUP Grade Group 1 biopsies, the recently introduced ISUP Grade Groups are sensitive markers for identifying low-risk patients who should not undergo radical treatment approaches. In our evaluation, the ISUP Grade Groups emerged as even more sensitive than the former Gleason Grading (5-year BFS rate of 80.6% in former Gleason Score 3 + 3 biopsies) (Figure 2C).
Various growth patterns, including ill-formed, fused, cribriform, and glomeruloid glands, are considered Gleason grade 4 tumors in the ISUP Grade Groups (9). In general, diagnosis of Gleason pattern 4 is associated with the highest interobserver variability among pathologists. Evaluating the interobserver reproducibility of individual Gleason pattern 4, [...] (12), reporting a 4-year BFS rate of 77% for ISUP Grade Group 4 on biopsy, and a slightly lower rate for ISUP Grade Group 3 tumors (74%). Obviously, this result derives from grading on biopsies associated with upgrading to RP, and the discrepancy is marginal. However, the overall data give evidence that separating Gleason pattern 3 from pattern 4 and reporting its proportion remain challenging.
Multiple studies revealed upgrading from diagnostic biopsy to corresponding RP specimen, reporting incidences between 14 and 51% with a mean of 36% (20). This has emerged as a major issue in recent years because Gleason grading on diagnostic biopsy has a significant impact on the therapy decision at the time of diagnosis, when patients are selected for surgery, radiation, or active surveillance. Upgrading indicates potential undertreatment of patients with assumed low-risk disease and is associated with worse outcome (21). Several updates of the Gleason grading system, such as reporting the most common and highest Gleason pattern on biopsy and re-defining pattern 4, as well as increasing the number of biopsies, have been undertaken in order to improve its prediction accuracy and reduce upgrading (20). We observed upgrading from biopsy to corresponding RP specimen in 44.0% of cases considering the former Gleason Grading, and less commonly after re-grading according to the ISUP Grade Groups (34.5%) (Table 5). In accordance with published data, the vast majority of upgrading occurred from tumors with former Gleason Score ≤6 and ISUP Grade Group 1, most often to former Gleason Score 3 + 4 and ISUP Grade Group 2 at RP (Table 5). A similar observation has been reported by a recently published study showing most frequent upgrades from biopsy ISUP Grade Group 1 to RP ISUP Grade Group 2 (22). Collectively, our results give evidence that the modified definition of pattern 4 reduces upgrading and is thus associated with improved predictive accuracy and a lower risk of undertreatment.
Since patients with highly aggressive disease are underrepresented in our study, the correlation between grading and important endpoints such as development of metastasis and PCa-specific death is limited. As discussed above, ISUP Grade Group 4 was diagnosed in 7.7% of all RP specimens, limiting statistical significance and informative value within this group. Furthermore, former Gleason Grading of biopsies and RP specimens was performed by pathologists from different institutes, which might have contributed to the higher incidence of upgrading.
To conclude, our data support the previously accepted ISUP Grade Groups according to the ISUP meeting 2014 as an independent prognostic marker for PCa. Diagnosis of Gleason pattern 4 and distinguishing Grade Group 3 from Grade Group 4 remain challenging, considering the limitations of this study and general interobserver variability. On RP specimens, we did not observe prognostic benefits of the ISUP Grade Groups compared to the former Gleason Grading. However, in our study the incidence of upgrading from biopsy to corresponding RP was lower when using the ISUP Grade Groups, giving evidence that the ISUP Grade Groups might improve predictive accuracy as assigned on diagnostic biopsies.

Ethics Statement
Ethical approval for using human material in the present study was obtained from the Internal Review Board of the University Hospital of Bonn (264/11). The study participants were anonymized before their specimens were included in this retrospective study cohort.

Author Contributions
SP, RK, VL, and AO designed the study approach. SH, FS, CK, JR-I, FB, and AO performed microscopic and histopathologic investigation. MR, AO, and MH performed statistics. SP, RK, VL, AM, SD, JK, AO, and MH interpreted data. SP and AO wrote the manuscript. All authors reviewed and approved the final manuscript.