Novel Nomograms Individually Predicting Overall Survival of Non-metastatic Colon Cancer Patients

Background: This study aimed to develop an effective prognostic nomogram for predicting overall survival in patients with non-metastatic colon cancer. Methods: The Surveillance, Epidemiology, and End Results (SEER) program was used to analyze patients who underwent surgical therapy (25,350 for training, 10,860 for validation). Nomograms were built from multivariate analysis in the training cohort and were compared to current American Joint Committee on Cancer (AJCC) classifications. Discrimination and model-fitting were assessed using areas under the receiver-operating characteristic curves (AUCs), Akaike information criteria (AICs), and calibration curves. The clinical benefit was measured using decision curve analyses (DCAs). The validation cohort was used to validate the results. Results: Nomogram 1 included age, gender, histological grade, T stage, number of retrieved lymph nodes, tumor size, and N stage. Nomogram 2 included age, gender, histological grade, T stage, number of retrieved lymph nodes, tumor size, and number of positive lymph nodes. The prognostic discrimination of nomogram 1 (AUC, 0.729, 95% CI, 0.723–0.736) was better than that of nomogram 2 (AUC, 0.704, 95% CI, 0.698–0.710, p < 0.001) for five-year overall survival in the training cohort. Nomogram 1 (AIC, 137,319) also showed superior model-fitting compared to nomogram 2 (AIC, 137,453). Similarly, nomogram 1 was better than the AJCC 6th and 8th TNM classifications. DCA revealed that nomogram 1 had a greater net benefit than the other models. These findings were confirmed in the validation cohort. Conclusions: The proposed nomogram 1 was a better prognostic prediction model, with better discrimination and superior model-fitting for patients with non-metastatic colon cancer, and might prove to be clinically helpful.


INTRODUCTION
Colon cancer is the third most commonly diagnosed cancer among both males and females in the United States (1). Although some progress has been made in the therapy of colon cancer in the past decades (2,3), local recurrence and distant metastasis remain a challenge for clinicians (4). Accurate survival prediction is critical for postoperative treatment decisions and surveillance. Prognostic tools for colon cancer patients are therefore critical for helping medical professionals counsel and advise patients on their treatment options.
The American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) staging system is the current gold standard for risk assessment (5). It is the most basic and common staging system for evaluating prognosis for colon cancer patients undergoing surgery. For colon cancer patients without distant metastasis, the TNM stage is determined by two factors: the depth of invasion into the intestinal wall and the number of locoregional positive lymph nodes (6). The system assumes that patients within each stage group have homogeneous outcomes; in fact, because of variations in clinicopathological features and tumor biology, outcomes within a group can differ considerably (7). In addition, the categorical nature of the TNM staging scheme forces continuous variables into categories, which might further limit prediction accuracy (8). It is increasingly recognized that, in addition to the TNM staging system, other clinical factors may contribute significantly to individual predictions of prognosis, such as age, histological type, degree of differentiation, systemic inflammation, and nutritional status (9, 10).
A nomogram is an effective tool for visualizing regression models used to quantify individual risk by including multiple important prognostic factors. It has been shown to achieve a good predictive performance in a variety of cancers. Previous studies in pancreatic cancer (11), uveal melanoma (12), hepatocellular carcinoma (13), and intrahepatic cholangiocarcinoma (14) have shown that nomograms could improve predictive accuracy and provide patients and physicians with a more comprehensive outcome measure when making treatment-related decisions.
In previous studies focusing on colorectal cancer, nomograms were applied to predict overall survival, disease-related survival prognosis, risk of recurrence and metastasis, as well as adjuvant chemotherapy. In the development of the nomograms, some studies used the number of positive lymph nodes as a continuous variable (8, 15-18), while others used it as a categorical variable (19-23). However, few studies have applied both the AJCC N stage and the number of positive lymph nodes (categorical and continuous variables, respectively) to develop nomograms and to compare their discrimination, model-fitting, and net benefit in predicting overall prognosis.
This study aimed to create a prognostic model of colon cancer based on independent prognostic factors identified by Cox proportional-hazards models. The predictive utility of the nomogram was further compared to the AJCC 6th and 8th TNM staging systems (24). This nomogram is expected to provide more personalized prognostic predictions that will help clinicians and patients make better treatment choices.

Statistical Analyses
Baseline clinical characteristics of the training and validation cohorts were compared with Student's t-tests or Mann-Whitney U tests. Survival curves were depicted using the Kaplan-Meier method with log-rank tests. Nomograms were developed based on prognostic factors identified by multivariate Cox proportional hazards models.
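As an illustration of the Kaplan-Meier method named above, the following is a minimal sketch of the product-limit estimator in plain Python. The function name `kaplan_meier` and the toy follow-up data are assumptions introduced here for illustration only; they are not taken from the study, whose analyses were run in SPSS, MedCalc, and R.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time of each patient (for example, in months)
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the conditional survival at time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients at time t are removed after the step.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data (not SEER data): eight patients, four observed deaths.
toy_times = [6, 12, 12, 24, 36, 36, 48, 60]
toy_events = [1, 0, 1, 1, 0, 1, 0, 0]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"t = {t:>2} months, estimated survival = {s:.3f}")
```

Patients censored at a given time are counted as at risk for deaths occurring at that same time, which is the usual convention for this estimator.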
The predictive discrimination of the nomograms and current AJCC TNM classifications was assessed using areas under the receiver-operating characteristic curves (AUCs). The Hanley and McNeil test was then used to compare the AUCs. The Akaike information criterion (AIC) (26) and calibration curves (27) were applied to evaluate the model-fitting of the nomogram predictions. Higher AUCs indicated better discrimination and lower AICs indicated superior model-fitting. The calibration curves were assessed by reviewing the predicted versus actual probabilities. A perfectly calibrated model would yield a calibration curve in which observed and predicted probabilities fall along the 45-degree line. In addition, clinical benefit was measured using decision curve analyses (DCAs) (28,29).
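To make the AUC comparison concrete, here is a minimal sketch of the Hanley and McNeil standard error of an AUC and the z-test for the difference between two AUCs measured on the same patients. The function names, the toy AUCs, and the group sizes are assumptions for illustration. The correlation `r` between the two AUCs is normally looked up from the Hanley and McNeil table; here it is simply a parameter, and the study's own value is not reported in this text.

```python
import math


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC (Hanley and McNeil, 1982).

    n_pos : number of patients with the event (e.g. died within five years)
    n_neg : number of patients without the event
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def compare_aucs(auc1, auc2, n_pos, n_neg, r=0.0):
    """Two-sided z-test for two AUCs from the same cases (Hanley and McNeil, 1983).

    r is the correlation between the two AUC estimates; r = 0 treats them
    as independent, which is conservative for positively correlated models.
    """
    se1 = hanley_mcneil_se(auc1, n_pos, n_neg)
    se2 = hanley_mcneil_se(auc2, n_pos, n_neg)
    z = (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * r * se1 * se2)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p_value


# Hypothetical toy inputs (not the study's data): 100 events, 100 non-events.
z_indep, p_indep = compare_aucs(0.80, 0.75, n_pos=100, n_neg=100)
z_corr, p_corr = compare_aucs(0.80, 0.75, n_pos=100, n_neg=100, r=0.5)
print(f"independent: z = {z_indep:.2f}, p = {p_indep:.3f}")
print(f"r = 0.5:     z = {z_corr:.2f}, p = {p_corr:.3f}")
```

Because both models are scored on the same patients, a positive `r` shrinks the standard error of the difference, so the same AUC gap yields a larger z statistic than under independence.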
All data were analyzed using the SPSS 22.0 statistical package (SPSS Inc., Chicago, IL, USA), MedCalc (Version 15.2, Ostend, Belgium), and R version 3.5.6 (http://www.r-project.org/). All tests were two-sided and p-values < 0.05 were considered statistically significant. The authors signed a data use agreement with SEER. The approval of an institutional review board was not required as the SEER database holds publicly available deidentified data.

Clinicopathologic Characteristics
The numbers of patients excluded at each step of the patient selection process are shown in Figure 1. Patients were categorized according to the AJCC 6th and 8th TNM staging systems. Demographic and clinical characteristics of the training and validation cohorts are shown in Table 1. Baseline characteristics of the validation cohort were similar to those of the training cohort (Student's t-test or Mann-Whitney U test, p > 0.05 for all).

Nomogram 1
Based on univariate analysis, statistically significant factors including age, gender, histological grade, T stage, number of retrieved lymph nodes, and tumor size were identified as independent prognostic factors and were included in multivariate analysis of Cox proportional hazards models together with the N stage. Significant factors in the multivariate analysis were further incorporated into nomogram 1 (Table 3, Figure 2A). The results were similar in the validation cohort (Supplementary Table S2).
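A nomogram turns the fitted Cox coefficients into a points scale. The sketch below shows one common construction, in which the predictor with the widest effect range spans 0 to 100 points and every other predictor is scaled relative to it. The coefficients and value ranges are hypothetical placeholders, not the fitted values from Table 3, and `make_point_scale` is a name introduced here for illustration.

```python
def make_point_scale(betas, ranges):
    """Build a function mapping (predictor, value) to nomogram points.

    betas  : {name: Cox regression coefficient}  (hypothetical in this sketch)
    ranges : {name: (minimum, maximum)} of each predictor
    """
    # Effect range of each predictor on the linear-predictor scale.
    spans = {k: abs(betas[k]) * (ranges[k][1] - ranges[k][0]) for k in betas}
    widest = max(spans.values())

    def points(name, value):
        low, high = ranges[name]
        # The lowest-risk end of the range scores zero points.
        reference = low if betas[name] >= 0 else high
        return 100.0 * betas[name] * (value - reference) / widest

    return points


# Hypothetical coefficients for illustration only (NOT the study's estimates).
example_betas = {"age": 0.04, "retrieved_nodes": -0.01}
example_ranges = {"age": (20, 90), "retrieved_nodes": (0, 60)}
points = make_point_scale(example_betas, example_ranges)

total = points("age", 70) + points("retrieved_nodes", 12)
print(f"total points for a 70-year-old with 12 retrieved nodes: {total:.1f}")
```

The total points are a rescaled linear predictor, so a higher total corresponds to a higher predicted hazard; reading off a three- or five-year survival probability additionally requires the baseline survival from the fitted model.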

Nomogram 2
According to univariate analysis, statistically significant factors including age, gender, histological grade, T stage, number of retrieved lymph nodes, and tumor size were included in multivariate analysis of Cox proportional hazards models together with the number of positive lymph nodes. Significant factors in the multivariate analysis were further incorporated into nomogram 2 (Table 3, Figure 2B). The results were similar in the validation cohort (Supplementary Table S2).
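The two nomograms differ only in how nodal involvement enters the model: as the categorical N stage (nomogram 1) or as the raw count of positive lymph nodes (nomogram 2). The sketch below maps a count to the AJCC 8th edition N subcategory, to show how the categorical variable groups different counts together. It assumes the count alone determines the category and therefore omits N1c (tumor deposits without positive nodes); whether the study used subcategories or the coarser N0/N1/N2 grouping is not stated in this text.

```python
def ajcc8_n_category(positive_nodes):
    """Map a positive regional lymph node count to an AJCC 8th edition N category.

    Assumption: N1c (tumor deposits with no positive nodes) is ignored,
    because it cannot be derived from the node count.
    """
    if positive_nodes < 0:
        raise ValueError("positive_nodes must be non-negative")
    if positive_nodes == 0:
        return "N0"
    if positive_nodes == 1:
        return "N1a"
    if positive_nodes <= 3:
        return "N1b"
    if positive_nodes <= 6:
        return "N2a"
    return "N2b"


# Counts of 7 and 30 fall into the same category once the variable is categorized.
for count in (0, 1, 3, 4, 7, 30):
    print(count, ajcc8_n_category(count))
```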

Three- and Five-Year Overall Survival of AJCC 6th and 8th Staging Systems
In the training cohort analyzed with the AJCC 6th TNM staging system, the three- and five-year overall survival (OS) of stages IIA, IIB, and IIIA was 91.1 and 85.3%, 79.4 and 70.2%, and 92.5 and 87.8%, respectively. The prognosis of stage IIIA was better than that of stages IIA and IIB (log-rank test, p < 0.05 for all, Figure 3A). The results were similar in the validation cohort (Figure 3C).
In the 8th TNM staging system, the three- and five-year OS of stages IIA, IIB, IIC, and IIIA was 91.1 and 85.3%, 81.7 and 72.6%, 76.7 and 67.3%, and 92.4 and 87.4%, respectively. The prognosis of stage IIIA was better than that of stages IIA, IIB, and IIC (log-rank test, p < 0.05 for all, Figure 3B). The results were similar in the validation cohort (Figure 3D).
In the training cohort, the prognostic discrimination of nomogram 1 was better than that of nomogram 2 (Hanley and McNeil test, all p < 0.001, Table 4) for three- and five-year OS. Nomogram 1 also showed superior model-fitting compared to nomogram 2 according to the AICs and calibration curves (Table 4, Figures 5A-D). Similar findings were obtained in the validation cohort.
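For readers less familiar with the AIC, the sketch below states the definition and applies it to the two training-cohort AIC values reported in the abstract (137,319 for nomogram 1 and 137,453 for nomogram 2). The helper `aic` and the relative-likelihood interpretation are standard but are added here for illustration; for a Cox model the log-likelihood is the partial log-likelihood.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# AIC values as reported in the abstract for the training cohort.
aic_nomogram_1 = 137_319
aic_nomogram_2 = 137_453

delta = aic_nomogram_2 - aic_nomogram_1
# Relative likelihood of the higher-AIC model with respect to the lower-AIC one.
relative_likelihood = math.exp(-delta / 2)

print(f"AIC difference (nomogram 2 minus nomogram 1): {delta}")
print(f"relative likelihood of nomogram 2: {relative_likelihood:.3g}")
```

Both nomograms are fitted to the same patients and outcome, which is what makes their AIC values directly comparable.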

Comparison of Predictive Performance Between AJCC 6th and 8th TNM Staging Systems
In the training cohort, the AUCs for the AJCC 8th TNM staging system for three- and five-year OS were 0.703 (95% CI, 0.697-0.709) and 0.695 (95% CI, 0.689-0.701), respectively, with an AIC for OS of 138,404 (Table 4, Figures 4E,F). The AUCs for the AJCC 6th TNM staging system at three- and five-year OS [...] (Figures 4G,H).
In the training cohort, the prognostic discrimination of the AJCC 8th classification was better than that of the AJCC 6th classification (Hanley and McNeil test, all p < 0.001, Table 4) for three- and five-year OS. The AJCC 8th classification also showed superior model-fitting compared to the AJCC 6th classification according to the AICs (Table 4). Similar findings were obtained in the validation cohort.

Comparison of Predictive Performance Between Nomogram 1 and AJCC 8th TNM Staging Systems
In the training cohort, the prognostic discrimination of nomogram 1 was better than that of the AJCC 8th classification (Hanley and McNeil test, all p < 0.001, Table 4) for three- and five-year OS. Nomogram 1 also showed superior model-fitting compared to the AJCC 8th classification according to the AICs (Table 4). Similar findings were obtained in the validation cohort.

Comparison of Clinical Usefulness Between Nomograms and AJCC TNM Staging System Using Decision Curve Analyses
In the decision curve analyses (DCAs) for both the training and validation cohorts, nomogram 1 showed greater net benefit across wider ranges of threshold probabilities than nomogram 2 and the AJCC 6th and 8th TNM staging systems for predicting three- and five-year OS in colon cancer patients (Figures 6A-D). These results represent a superior estimation of decision outcomes at higher threshold probability levels.
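The net benefit plotted in a decision curve can be computed as sketched below. This is a simplified illustration assuming a binary outcome (for example, death within five years) with no censoring, which is not necessarily how the study handled survival data. The function names and the toy risks and outcomes are hypothetical.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    risks     : predicted event probabilities from a model
    outcomes  : 1 if the event occurred, 0 otherwise
    threshold : threshold probability, strictly between 0 and 1
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    flagged = [(r >= threshold, y) for r, y in zip(risks, outcomes)]
    true_pos = sum(1 for hit, y in flagged if hit and y == 1)
    false_pos = sum(1 for hit, y in flagged if hit and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data (not from the study).
toy_risks = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1]
toy_outcomes = [1, 1, 0, 0, 0, 1]
nb_model = net_benefit(toy_risks, toy_outcomes, threshold=0.5)
nb_all = net_benefit_treat_all(toy_outcomes, threshold=0.5)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}, treat none: 0.000")
```

A decision curve repeats this calculation over a range of thresholds for each model and plots the results against the treat-all and treat-none strategies.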

DISCUSSION
Accurate predictions of the prognosis for colon cancer patients are critical for further postoperative treatment and follow-up planning. Traditionally, the survival outcome of postoperative colon cancer patients is predicted based on the AJCC TNM staging system. Since the 1940s, the TNM staging system has [...] The staging system was revised every six to eight years and, until recently, it has been considered the most comprehensive tool for prognostic prediction and grouping of colon cancer patients. However, when the AJCC 6th TNM staging system was released in 2002 (24), its accuracy was questioned because the survival rate of patients with stage IIIA was superior to that of patients with stage IIB colon cancer (30). The AJCC 7th edition, released in 2010 (31), staged patients more accurately than the AJCC 6th edition and improved prognostic prediction. However, the AJCC 7th edition did not eliminate the survival discrepancies between stages II and IIIA colon cancers. The AJCC 8th edition (6), released in 2017, showed no changes in stages I-III compared to the AJCC 7th TNM staging system. A similar issue was observed in the current study, where the AJCC 6th staging system did not satisfactorily stratify patients between stages II and III: the prognosis of patients with stage IIIA disease was better than that of patients with stage II disease. The 8th TNM staging system is more granular than the 6th TNM classification, but it still does not adequately stratify patients between stages II and III. In this study, prognostic nomograms based on the results of the Cox proportional-hazards model were developed and validated to predict survival probabilities in patients undergoing surgery for non-metastatic colon cancer. Compared to the 6th and 8th editions of the AJCC staging system, which are based on the depth of infiltration and the number of positive lymph nodes, a nomogram can integrate various prognostic factors to make more personalized predictions for patients.
Age, gender, histological grade, T stage, number of retrieved lymph nodes, tumor size, N stage, and number of positive lymph nodes were integrated into the nomograms. Many researchers have also shown that these clinicopathological factors are associated with the prognosis of colon cancer patients (32). It should also be noted that the number of retrieved lymph nodes was an independent factor in the prognosis of colon cancer. Many previous studies have shown that it is also an independent prognostic factor for many other malignancies and that a larger number of removed lymph nodes is associated with better survival (33,34). Perhaps the most important reason is that more extensive lymph node removal makes it less likely that positive lymph nodes are missed, which allows more precise staging.
In the establishment of nomograms, some researchers have used the number of positive lymph nodes as a continuous variable (8, 15-18), while others used it as a categorical variable (19-23). Few researchers have used both the AJCC N stage and the number of positive lymph nodes as variables to develop nomograms and to compare their accuracy in predicting prognosis. The present study developed nomograms using these variables and evaluated their accuracy in predicting prognosis. The results showed that the nomogram including the AJCC 8th N stage had better survival prediction accuracy than the nomogram including the number of positive lymph nodes. The nomogram incorporates clinically common pathological factors and provides a more personalized prognostic prediction than the AJCC staging systems. In addition, nomograms offer better clinical benefit, and other researchers have reported the same result in other oncology studies. Through this novel and easy-to-implement scoring system, personalized survival prognosis predictions after surgery can be easily obtained.
Identifying colon cancer patients with different survival risks based on the nomograms may have an impact on further treatment or follow-up plans.
Using the SEER (25) data allows reasonable conclusions consistent with general clinical practice to be drawn from a large sample of colon cancer patients, which is impossible to achieve in a single-institution study. However, this study had some limitations that should be noted. First, although the SEER database is regularly checked for discrepancies and its reported accuracy is 98%, the possibility of incorrect coding or erroneous data still exists. In addition, other potentially prognostic factors, including lymphatic vessel invasion, margin status, surgical procedures, postoperative complications, laboratory indices, and chemotherapy data, were not used. Additional well-established predictors should be incorporated to improve model performance. Besides, the current study was limited by its retrospective nature, although it was based on a large database. Furthermore, the current study was based on the SEER program, a Western database (25,35), and further cohorts from Eastern countries are still needed to validate our findings.

CONCLUSIONS
In summary, this study developed a prognostic nomogram for patients with non-metastatic colon cancer. The nomogram improves the estimates provided by the current AJCC 8th TNM staging system and can more accurately estimate the survival rate for individual patients after surgery. It might be useful for medical professionals to develop further treatment options and long-term follow-up plans for patients undergoing colon cancer surgery.

DATA AVAILABILITY STATEMENT
The datasets analyzed in the current study are available in the SEER database (https://seer.cancer.gov/), from which the eligible cases were extracted. The data are also available from the corresponding author on reasonable request.

ETHICS STATEMENT
Ethical approval was not needed because third-party data from the SEER database were used.

AUTHOR CONTRIBUTIONS
J-PP, C-DZ, and D-QD conceived and designed the study. J-PP and YL analyzed the data. J-PP, C-DZ, CZ, and K-ZW wrote the paper. D-QD and Z-MZ reviewed the draft. All authors read and approved the final manuscript.