Development and Validation of a Prognostic Nomogram for Predicting Cancer-Specific Survival in Patients With Lymph Node Positive Bladder Cancer: A Study Based on SEER Database

Purpose: To construct a prognostic model to predict cancer-specific survival (CSS) in patients with lymph node-positive bladder cancer. Patients and Methods: We enrolled 2,050 patients diagnosed with lymph node-positive bladder cancer from the Surveillance, Epidemiology, and End Results (SEER) database (2004–2015). All patients were randomly split into a development cohort (n = 1,438) and a validation cohort (n = 612) at a ratio of 7:3. Univariate and multivariate Cox regression analyses were performed to identify prognostic factors. A nomogram predicting CSS was established based on the results of the multivariate Cox analysis. Its performance was evaluated by calibration curves, receiver operating characteristic (ROC) curves, and the concordance index (C-index). Internal validation was performed in the validation cohort. The Kaplan–Meier method with the log-rank test was applied to compare the different risk groups. Results: The nomogram incorporated summary stage, tumor size, chemotherapy, regional nodes examined, and positive lymph nodes. The C-index of the nomogram was 0.716 (0.707–0.725) in the development cohort and 0.691 (0.689–0.693) in the validation cohort. The AUC of the nomogram was 0.803 for 3-year and 0.854 for 5-year CSS in the development cohort, and 0.773 for 3-year and 0.809 for 5-year CSS in the validation cohort. Calibration plots for 3-year and 5-year CSS showed good concordance. Significant differences were observed between the high, medium, and low risk groups (P <0.001). Conclusions: We have established a prognostic nomogram that provides an accurate individualized probability of cancer-specific survival in patients with lymph node-positive bladder cancer. The nomogram could contribute to patient counseling, follow-up scheduling, and selection of treatment.


INTRODUCTION
Bladder cancer (BC) is a common malignancy globally, with an estimated 500,000 new cases and 200,000 deaths worldwide in 2018 (1,2). Bladder cancer is also a severe and heterogeneous disease with a poor prognosis, especially in patients with lymph node-positive disease (3). A retrospective study showed that approximately 25-30% of BC patients undergoing radical cystectomy were found to be lymph node-positive on pathologic examination, and the disease-free survival rate in these patients was only 25% (4). Several retrospective studies have confirmed higher recurrence and poorer survival in node-positive patients than in node-negative patients (4)(5)(6)(7). For example, one study reported that up to 70-80% of node-positive patients experienced disease recurrence, compared with only 30% of patients with pathologically negative nodes.
In recent years, some urologists have sought to stratify patients with lymph node metastasis because a few studies suggested that a subset of node-positive patients is still potentially curable (8). Jensen reported a better prognosis in patients with a single positive node than in those with multiple positive nodes (9). Likewise, according to some retrospective studies, longer overall survival (OS) and cancer-specific survival (CSS) were seen in patients staged N1 than in patients with more extensive node involvement (10). All these studies aimed to stratify node-positive patients carefully and to identify those with a better prognosis so that they could receive more suitable treatment. The eighth edition of the American Joint Committee on Cancer (AJCC) TNM system, which divides node-positive patients into N1, N2, and N3 stages, is widely used for a simple evaluation of prognosis (2). However, its limitations are limited accuracy and the omission of vital tumor characteristics such as the number of positive nodes. A few studies suggested that the number of positive nodes is a more promising predictor of outcome than the conventional TNM system for node-positive BC patients (3,11). Thus, it is imperative to build an accurate model to evaluate the prognosis of node-positive BC patients.
A nomogram is a visual and reliable statistical prediction tool that is widely used to provide tailored individual prognostic information. Nomograms are composed of fundamental variables such as demographics, tumor characteristics, and treatment features (12). Rink constructed a nomogram that included gender, T stage, margin status, LN density, and adjuvant chemotherapy to predict recurrence and cancer-specific survival in patients with a single lymph node metastasis (13). In addition, a nomogram integrating multiple molecular markers was constructed to assess disease recurrence and cancer-specific mortality in BC patients with locally advanced and node-positive disease (14). However, these models did not achieve high accuracy (C-index 0.63 and 0.66 for Rink's model) and incorporated variables that are not easily available. Moreover, the models were not specifically designed for all bladder cancer patients with positive lymph nodes. To our knowledge, this is the first study to construct a prognostic nomogram to predict cancer-specific survival (CSS) in all node-positive patients.
In this study, we identified node-positive patients and collected all available information from the Surveillance, Epidemiology, and End Results (SEER) database from 2004 to 2015. We aimed to establish a prognostic nomogram that incorporates significant factors to estimate CSS and to guide treatment decisions for node-positive patients. In addition, the performance of the nomogram was evaluated, and its applicability was assessed by internal validation.

Variables Defined and End Point
The variables in the selected cohorts included demographic characteristics (age, sex, race, marital status), tumor characteristics (tumor size, grade, histology, T stage, N stage, summary stage), treatment information (chemotherapy and radiotherapy), and other variables (regional nodes examined and positive lymph nodes). The primary endpoint in this study was cancer-specific mortality (CSM), defined as death from bladder cancer.
For convenience of analysis, we processed some variables from the SEER database. Some continuous variables, namely age, tumor size, regional nodes examined, and positive lymph nodes, were transformed into categorical variables: age (<60, 60-70, 70-80, >80); tumor size (<3 cm, ≥3 cm); positive lymph nodes (1, 2-10, >10). Sex was categorized as male or female, and race as white, black, or other (American Indian/Alaska Native and Asian/Pacific Islander). Marital status was defined as married; separated, divorced, or widowed (SDW); or single. Our study was restricted to the common histologies, transitional cell carcinoma (TCC) and papillary transitional cell carcinoma (PTCC). Grades I and II were combined because of the small sample size. Detailed TNM information was recorded according to the sixth edition of the AJCC staging system.
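The categorization of continuous variables described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, and the handling of boundary values (for example, whether a 70-year-old falls in 60-70 or 70-80) is an assumption, because the text does not specify it. Regional nodes examined is omitted because its cut-offs are not given.

```python
def categorize_age(age_years):
    """Map age to the four groups named in the text.

    Assumption: upper bounds are inclusive (70 -> "60-70", 80 -> "70-80").
    """
    if age_years < 60:
        return "<60"
    if age_years <= 70:
        return "60-70"
    if age_years <= 80:
        return "70-80"
    return ">80"


def categorize_tumor_size(size_cm):
    """Map tumor size in centimeters to the two groups named in the text."""
    return "<3 cm" if size_cm < 3 else ">=3 cm"


def categorize_positive_nodes(n_positive):
    """Map the number of positive lymph nodes to 1, 2-10, or >10."""
    if n_positive < 1:
        raise ValueError("node-positive patients have at least one positive node")
    if n_positive == 1:
        return "1"
    if n_positive <= 10:
        return "2-10"
    return ">10"
```

Keeping each cut-off in one small function makes the boundary assumptions explicit and easy to change if the actual coding rules differ.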

Statistical Analysis
We randomly split the study population into development and validation cohorts at a ratio of 7:3. The Student's t-test and Chi-square test were performed for continuous and categorical variables, respectively, to compare the baseline characteristics of patients in the two groups. Categorical variables were presented as frequencies and proportions, while continuous variables were presented as mean ± standard deviation (SD). In the development cohort, univariate Cox regression analysis was applied to identify potentially significant prognostic factors. Variables with a P-value below 0.05 were then entered into the multivariate Cox proportional hazards regression model. All results are shown as hazard ratios (HR) with 95% confidence intervals (95% CI).
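A random split of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function name, the seed, and the use of integer patient identifiers are hypothetical. Because the reported cohort sizes (1,438 and 612) are not an exact 70% cut of 2,050, the development size is passed explicitly rather than derived from the ratio.

```python
import random


def split_cohort(patient_ids, n_development, seed=0):
    """Randomly partition patient IDs into development and validation cohorts.

    The seed is a hypothetical choice that makes the split reproducible.
    """
    ids = list(patient_ids)
    if not 0 < n_development < len(ids):
        raise ValueError("n_development must be between 1 and len(ids) - 1")
    rng = random.Random(seed)
    rng.shuffle(ids)  # in-place random permutation
    return ids[:n_development], ids[n_development:]


# Placeholder IDs 0..2049 stand in for the 2,050 enrolled patients.
development, validation = split_cohort(range(2050), n_development=1438, seed=0)
```

Fixing the seed lets the same partition be regenerated later, which matters when the validation cohort must stay untouched during model building.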
The nomogram incorporated the variables selected from the multivariate Cox model, with a significance threshold of P = 0.05. The nomogram was built for visual prediction of 3- and 5-year survival probability in the development cohort. We used Harrell's concordance index (C-index) and receiver operating characteristic (ROC) curves with the calculated area under the curve (AUC) to assess the performance of the model. Moreover, the agreement between predicted and actual 3- and 5-year survival was evaluated with calibration plots, generated with the rms package in RStudio. Patients in the development cohort were divided into three risk groups based on their total points. The Kaplan-Meier method with the log-rank test was applied to analyze the differences in CSS between the three risk groups. SPSS 22.0 (IBM Corp, Armonk, NY) and R version 3.6.3 (https://cran.r-project.org/bin/windows/base/old/3.6.3) were used for all statistical analyses.
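To make the C-index concrete, the sketch below computes Harrell's concordance index directly from its definition. It is an illustration only: the analysis in this study was run in R, the function name is hypothetical, and pairs with tied survival times are simply skipped, which is a simplification of what statistical packages do.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times:       observed follow-up times
    events:      1 if the event (cancer-specific death) occurred, 0 if censored
    risk_scores: model output where a higher value means a higher predicted risk

    A pair (i, j) is comparable when patient i had the event strictly before
    patient j's observed time. It is concordant when patient i also has the
    higher risk score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the C-index values in this study should be read.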

Characteristics of Study Population
In total, 2,050 patients with lymph node-positive bladder cancer were enrolled in our study; 1,438 patients (70%) were assigned to the development cohort and 612 patients (30%) to the validation cohort. Baseline demographic and clinicopathological characteristics of the study population are shown in Table 1. There was a statistically significant difference between the development and validation cohorts in grade (P = 0.013), and patients in the development cohort had a higher proportion of distant stage (43.9% vs 30.6%, P <0.001). No statistically significant differences were observed in the other variables between the two groups. The 3- and 5-year CSS rates were 43.17% (n = 885) and 37.56% (n = 770) in the total cohort, and 43.6% (n = 627) and 37.83% (n = 544) in the development cohort, respectively. The mean survival time was 34.16, 35.12, and 31.9 months in the total cohort, development cohort, and validation cohort, respectively.
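The reported rates are consistent with simple proportions, that is, each count divided by the size of its cohort. The short check below reproduces them; the helper name is hypothetical, and the interpretation of the rates as crude proportions is inferred from the arithmetic rather than stated in the text.

```python
def rate_percent(count, total, ndigits=2):
    """Return count / total as a percentage rounded to ndigits decimals."""
    return round(100 * count / total, ndigits)


# Total cohort (n = 2,050)
css_3y_total = rate_percent(885, 2050)
css_5y_total = rate_percent(770, 2050)

# Development cohort (n = 1,438)
css_3y_dev = rate_percent(627, 1438)
css_5y_dev = rate_percent(544, 1438)
```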

Prognostic Factors of Node-Positive Patients in Development Cohort
Ultimately, five factors, namely summary stage, tumor size, chemotherapy, regional nodes examined, and positive lymph nodes, were selected from the multivariate Cox model (Table 2).

Prognostic Nomogram for CSS
A nomogram predicting the 3- and 5-year CSS of node-positive patients was built from the Cox regression model (Figure 1). Each variable in the nomogram was assigned a score from 0 to 100 based on its contribution to the model (Table 3). Each patient obtains a total score by adding the scores for every variable. The nomogram showed that summary stage was the largest contributor to the prognostic model of CSS.
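The additive scoring step can be sketched as a table lookup followed by a sum. The point values below are placeholders chosen only to exercise the code: they are not the values in Table 3 and imply nothing about the direction or size of any effect. The function and variable names are hypothetical, and regional nodes examined is left out because its categories are not given in the text.

```python
# Placeholder point table. These numbers are NOT taken from Table 3.
EXAMPLE_POINTS = {
    "summary_stage": {"regional": 0, "distant": 100},
    "tumor_size": {"<3 cm": 0, ">=3 cm": 20},
    "chemotherapy": {"yes": 0, "no": 40},
    "positive_lymph_nodes": {"1": 0, "2-10": 15, ">10": 35},
}


def total_points(patient, point_table):
    """Sum the points assigned to the patient's category for each variable."""
    total = 0
    for variable, points_by_category in point_table.items():
        category = patient[variable]
        if category not in points_by_category:
            raise ValueError(f"unknown category {category!r} for {variable}")
        total += points_by_category[category]
    return total


# Hypothetical patient used only to demonstrate the lookup.
example_patient = {
    "summary_stage": "distant",
    "tumor_size": ">=3 cm",
    "chemotherapy": "yes",
    "positive_lymph_nodes": "2-10",
}
example_total = total_points(example_patient, EXAMPLE_POINTS)  # 100 + 20 + 0 + 15
```

In a real nomogram the total is then read against the probability axes to obtain the 3- and 5-year estimates; that mapping depends on the fitted model and is not reproduced here.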

Validation of the Nomogram
The C-index of the nomogram for CSS was 0.716 (0.707-0.725) in the development cohort, which was significantly higher than the 0.605 of the TNM system (P <0.05). The discriminative ability of the nomogram was also evaluated by ROC curves. The AUC of the nomogram was significantly higher than that of the TNM system for both 3-year (0.803 vs 0.675) and 5-year (0.854 vs 0.669) CSS prediction (all P <0.05) (Figures 3A, B). The calibration plots of the development cohort for 3-year and 5-year CSS demonstrated good agreement between actual observations and predicted outcomes (Figures 2A, B). All these results suggested better performance of our model than of the traditional TNM system. In addition, internal validation of the nomogram was performed in the validation cohort to evaluate its applicability. The C-index of the nomogram was 0.691 (0.689-0.693), and the AUC was 0.773 and 0.809 for 3-year and 5-year CSS, respectively (Figures 3C, D). The calibration curves of the validation cohort showed good agreement between nomogram predictions and actual outcomes, especially for 5-year prediction. The results of internal validation suggested that the nomogram had satisfactory applicability for node-positive patients (Figures 2C, D).

Survival Curve for Nomogram
Each variable in the nomogram was assigned a score based on its contribution to CSS, and the total score was mapped to the corresponding 3-year and 5-year cancer-specific mortality probabilities. The lymph node-positive patients were divided into three risk subgroups according to total points: low risk, >198; medium risk, 148-198; high risk, <148. As shown in Figure 4, significant differences in CSS were observed between the three risk subgroups (P <0.001).
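The grouping rule stated above can be written as a small function. The cut-offs and their direction are taken exactly as given in the text (more than 198 points is low risk, fewer than 148 is high risk); the function name is hypothetical, and assigning the boundary values 148 and 198 to the medium group follows from the stated range 148-198.

```python
def risk_group(total_points, lower_cutoff=148, upper_cutoff=198):
    """Assign a risk group from total nomogram points, using the cut-offs in the text."""
    if total_points > upper_cutoff:
        return "low"
    if total_points < lower_cutoff:
        return "high"
    return "medium"
```

For example, a patient with 160 total points would be placed in the medium risk group under this rule.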

DISCUSSION
Lymph node-positive bladder cancer is considered a severe stage associated with high recurrence and mortality rates (8,14). However, a subset of patients with node metastasis can still be cured after active treatment (13). In addition, with advances in the treatment of bladder cancer, many novel treatments such as extended lymph node dissection, neoadjuvant chemotherapy, and targeted molecular therapy have been proposed, making a better prognosis possible for some node-positive patients (8,14,15). However, prognostic stratification for node-positive patients is still lacking.
Therefore, an accurate and suitable predictive model for patients with lymph node metastasis is urgently needed. This study comprehensively explored the effect of all factors available in the SEER database on CSS in node-positive patients. We constructed and internally validated a relatively accurate and discriminating nomogram for the prediction of CSS by incorporating variables from the multivariate Cox model. This approach produced a relatively simple tool that incorporates only the significant variables associated with survival outcome without sacrificing accuracy. The final survival nomogram yielded predictions whose accuracy far exceeded that of individual predictors. Another advantage of a nomogram over a standard multivariate regression model is that it provides the individual probability of a survival outcome at specific time points instead of a relative risk. The use of Harrell's concordance index, a global measure of model accuracy, to evaluate the nomogram is a further advantage over conventional Cox regression models (12,16,17,18). Furthermore, risk groups can be constructed based on the nomogram points, and individual patient counseling and follow-up scheduling can be tailored to each risk group (16).
We compared our nomogram with the traditional AJCC TNM classification by the C-index and AUC. The results showed that our model obtained a greater C-index and AUC than the TNM system in the development cohort. Bruins et al. retrospectively enrolled 146 node-positive patients to evaluate the TNM system and found no differences in overall survival or disease-free survival (DFS) between patients staged N1-3 (19). Meanwhile, Jensen constructed a nomogram based on 381 pN1 patients that included gender, T stage, margin status, LN density, and adjuvant chemotherapy. However, its focus on pN1 patients and its exclusion of patients who received neoadjuvant chemotherapy limited its applicability to all node-positive patients. Moreover, the C-indices of the model were 0.66 and 0.63, which does not seem sufficient for an accurate model (9). A nomogram combining multiple molecular markers, namely p53, pRB, p21, and p27, was applied to predict recurrence and cancer-specific survival (CSS) in pT3-4 or node-positive patients (14). Nevertheless, adding the molecular markers to the model failed to significantly improve the performance of outcome prediction (3.9% for recurrence, 4.3% for CSS) (20). Moreover, the application of molecular markers is still limited on account of their uncertain effect and high cost. The nomogram in this study showed good clinical performance in CSS prediction, and the variables it incorporates are relatively easily accessible in most hospitals. In detail, the good discriminative ability and accuracy of the nomogram were confirmed by the relatively high C-index and 3-year and 5-year AUC in the development and validation cohorts. The calibration curves also showed good consistency between the nomogram predictions and the actual outcomes.
This novel nomogram for CSS prediction incorporated five factors: summary stage, positive regional nodes, tumor size, regional nodes examined, and chemotherapy. Studies suggested that the number of positive nodes is a more promising predictor of outcome in node-positive patients than the conventional TNM system (3). In addition, some researchers found significant differences in disease outcome between patients with one positive node and those with more (9,13,21). A retrospective study of 244 node-positive patients found that the 10-year disease-free survival rate in patients with eight or fewer positive nodes was significantly higher than in those with more than eight positive nodes. The number of positive nodes has thus been shown to be strongly associated with prognosis in node-positive patients. Furthermore, receiving chemotherapy was a protective factor for node-positive patients. Several retrospective studies of node-positive bladder cancer patients observed higher overall survival and cancer-specific survival rates in patients who received chemotherapy than in those who did not (9,15,21). Therefore, chemotherapy might be a suitable and meaningful treatment for node-positive patients. Several advantages of this study are worth noting. First, to our knowledge, it is the first study to develop a prognostic nomogram for the prediction of CSS in all bladder cancer patients with lymph node-positive disease. Second, the number of patients in this study (n = 2,050) was relatively large and sufficient to construct a prognostic nomogram with good performance. Finally, the variables in the nomogram are easily available in most hospitals, and the nomogram showed good applicability. In addition, we divided the study population into three risk groups based on the prognostic nomogram, which makes it easier to identify patients with worse survival outcomes.
Nevertheless, some limitations of this study should be noted. First, this is a retrospective study based on the SEER database, so its results were inevitably influenced by selection bias. In addition, we excluded patients with unknown variable information, which was also a significant source of selection bias. Second, the SEER database itself has limitations. For example, the SEER database collects a large amount of patient information from multiple regions and hospitals, and it seems impossible to balance the differences in treatment and pathological evaluation standards. Moreover, some vital factors for node-positive patients, such as chemotherapy drugs and the course of radiotherapy, are lacking in the SEER database.
In addition, novel treatments such as targeted therapy are a growing field, and more research is needed to verify their effect (8). Finally, although internal validation was performed in the validation cohort, this validation method is not ideal because the patients in the development and validation cohorts came from the same database. Therefore, a large prospective clinical trial is needed for external validation.

CONCLUSION
This study based on the SEER database identified several demographic, lymph node, and therapeutic features that were significantly associated with the cancer-specific survival of patients with lymph node-positive bladder cancer. A prognostic nomogram was constructed and validated to predict the individualized probability of cancer-specific survival at 3 and 5 years. The nomogram could contribute to patient counseling, follow-up scheduling, and selection of treatment. Nonetheless, external and prospective validation is needed before wide application.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding authors.

ETHICS STATEMENT
The data from SEER is publicly available and de-identified. This study was approved by the institutional of the First Affiliated Hospital of Nanchang University.