A Proposal to Reflect Survival Difference and Modify the Staging System for Lung Adenocarcinoma and Squamous Cell Carcinoma: Based on the Machine Learning

Objective: To propose modifications to refine prognostication over anatomic extent of the current tumor, node, and metastasis (TNM) staging system of non-small cell lung cancer (NSCLC) for a better distinction, and reflect survival differences of lung adenocarcinoma and squamous cell carcinoma. Study Design: Three large cohorts were included in this study. The training cohort consisted of 124,788 patients in the Surveillance, Epidemiology, and End Results (SEER) database (2006–2015). The validation cohort consisted of 4,247 patients from the Zhongshan Hospital, Fudan University (FDZSH; 2005–2014), and People's Hospital, Peking University (PKUPH; 2000–2017). The algorithm generated a hierarchical clustering model based on the unsupervised learning for survival data using Kaplan-Meier curves and log-rank test statistics for recursive partitioning and selection of the principal groupings. Results: In the modified staging system, adenocarcinoma cases are usually at a lower stage than the squamous cell carcinoma cases of the same TNM, reflecting a better outcome of adenocarcinoma than that of squamous cell carcinoma. The C-index of the modified staging system was significantly superior to that of the staging system [SEER cohort: 0.722, 95% CI, (0.721–0.723) vs. 0.643, 95% CI, (0.640–0.647); FDZSH cohort: 0.720, 95% CI, (0.709–0.731) vs. 0.519, 95% CI, (0.450–0.586); and PKUPH cohort: 0.730, 95% CI, (0.705–0.735) vs. 0.728, 95% CI, (0.703–0.753)]. Conclusion: Survival differences between lung adenocarcinoma and squamous cell carcinoma have been reflected accurately and reliably in the modified staging system based on the machine learning. It may refine prognostication over anatomic extent.

INTRODUCTION Non-small cell lung cancer (NSCLC) is one of the most commonly diagnosed and leading causes of cancer death among both men and women worldwide (1)(2)(3). The survival duration underscores the importance of an accurate method to properly predict the prognoses of NSCLC patients to better manage this disease. The American Joint Committee on Cancer (AJCC) and the Union for International Cancer Control (UICC) staging systems of lung cancer using tumor, node, and metastasis (TNM) classification at the time of diagnosis and management, is the most frequently used predictor of survival and indicator of therapeutic strategies planning for NSCLC. The 8th edition of the TNM staging system of NSCLC was published by the International Association for the Study of Lung Cancer (IASLC) in January 2017, and it has been recommended to replace the 7th version (4). Modifying, the 8th TNM classification system for newly NSCLC introduced changes to the classification in both the T and M categories, as well as in the overall stage grouping (4). The upgrading and updating improved the discriminatory ability between adjacent subgroups. However, a heterogeneous aggregate of survival of adenocarcinoma and squamous cell carcinoma has not been discriminated like staging system of esophageal cancer. Recently, several studies have suggested that different prognoses may exist in patients with the same stage of adenocarcinoma and squamous cell carcinoma (5-8). Importantly, the difference between squamous cell carcinoma and adenocarcinoma in prognosis and survival has been evaluated in many studies and should not be ignored in a staging system when developing a more accurate discriminatory ability and prognostic performance in clinical practice (9)(10)(11); especially, the prognoses of some patients in the same substage are different, when some cases with different sub-stages have similar prognoses. To further solve these problems, we aimed to propose modifications to refine prognostication over anatomic extent of the current TNM staging system for NSCLC, by considering the heterogeneity of survival and basing on the machine learning method.

Selection and Description of Participants
The training cohort of patients with NSCLC was from the Surveillance, Epidemiology, and End Results (SEER) database (2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015) of the National Cancer Institute. Only patients with microscopically confirmed squamous cell carcinoma or adenocarcinoma (ICO-O-3 histology/behavior codes 8,050-8,089 and 8,140-8,389, respectively) were included (12). Patients with other variants of lung cancer, such as large cell carcinoma and small cell lung cancer, were excluded. Patients without follow-up information were excluded. Patients who received chemotherapy before surgery (yp cases), or underwent resection for a recurrent lung cancer (r-stage cases) were not considered. All patients included in this study were artificially restaged according to the definitions of the 8th TNM staging system, based on the available clinical and pathological data, both in the SEER database and two validation cohorts.  [2000][2001][2002][2003][2004][2005][2006][2007][2008][2009][2010][2011][2012][2013][2014][2015][2016][2017]. All patients received surgical treatment alone or combined with chemotherapy alone or with radiotherapy. In this study, there were no human subjects involved and only de-identified data were used, thus, ethical review and informed consent were waived by the institutional review board of Zhongshan Hospital, Fudan University. For the analysis of TNM categories presented, all patients were identified via histological and pathology diagnoses of NSCLC and cases with missing staging information or survival status were excluded. Patients were examined every 6 months during the first 2 years and annually thereafter. A physical examination, chest computed tomography scan, and abdominal ultrasound were included in the follow-up protocol. Bone scintigraphy and brain magnetic resonance imaging were performed when relative symptoms appeared.

Statistics
Cancer specific survival (CSS) was defined as the period from the day of diagnosis to the day of death specified by the cancer or related complications. Survival duration was measured from the date of initial diagnosis for clinically staged tumors and from the date of surgery for pathologically staged tumors until the date of death due to the cancer or the date of the last follow-up and calculated by the Kaplan-Meier method. The algorithm generates a hierarchical clustering model based on the unsupervised learning for survival data using the distance matrix of survival curves, which calculated by the χ 2 value of log-rank test with the assumption of patients of each group were equal and infinite, for recursive partitioning and selection of the principal groupings (13) (https://cran.r-project.org/web/views/ Cluster.html). The calculation formula was as follows and the relevant values are calculated. For each time i, let a i and b i be the accumulative survival rate at the period i after diagnosis in the two groups, respectively.
The concordance index (C-index) was used to assess the discriminatory powers of the two staging systems, and the survival calibration curve was calculated to evaluate the calibration of the 8th IASLC staging system and the modified system (14,15). The analysis was implemented using the statistical package R, version 3.4.3 (R Project for Statistical Computing, TUNA Team, Tsinghua University) and Graphpad Prism 7 (GraphPad Software, Inc., San Diego, CA). A p < 0.05 was statistically significant, and all tests were two-sided.  Table 1. In the SEER, FDZSH, and PKUPH cohorts, the proportion of male patients was higher than that of female patients. Consistently, most patients had tumors located at the upper lobe, and there were similar proportions of patients having adenocarcinoma and squamous cell carcinoma in the SEER, FDZSH, and PKUPH cohorts. More than half of the patients in the SEER cohort had moderately differentiated or poorly differentiated tumors. At the same time, the differentiation of the tumors in the FDZSH and PKUPH cohorts was similar to that in the SEER cohort. The 3-year CSS rate of the SEER cohort was 36.7% and the 5-year CSS rate was 29.1%. The 3-year CSS and 5-year CSS rates of the FDZSH s and PKUPH cohorts were 79.0 and 69.2%, and 82.8 and 74.0%, respectively.

Modification of the TNM 8th Staging System
To identify whether patients' data from the SEER cohort for NSCLC was appropriate and accurate, we analyzed the survival of the patients in each stage by the Kaplan-Meier method based on the TNM 8th staging system. Overall, the 5-year CSS rates of stage I to IV patients, were 63.5, 39.  Table 2). Similarly, the HRs for the comparisons among sub-stages were statistically significant ( Table 2), and the 5-year CSS rates of sub-stage IA to IV patients were 66.7, 51.3, 39.8, 38.9, 26.7, 15.0, 13.1, and 5.2%, respectively ( Figure 1C). In contrast, we found that discrimination of survival curves of sub-stages was unsatisfactory in the current 8th TNM staging system, especially in the sub-stage of IIA and IIB (5year CSS rate: 39.8% vs. 38.9%, HR = 0.9687, p = 0.3093) and IIIA and IIIB (5-year CSS rate: 15.0% vs. 13.1%, HR = 0.8931, p = 0.0005; Table 2).
Furthermore, we calculated the survival data of patients with adenocarcinoma or squamous cell carcinoma separately in the SEER cohort, which we had identified, for recursive partitioning and selection of the principal groupings, based on the hierarchical clustering model. Comparing with adenocarcinoma cases, patients with squamous cell carcinoma in the same sub-stage would usually have worse prognoses. For instance, stage Ib patients with adenocarcinoma may carry a similar prognosis as patients with squamous cell carcinoma in stage IA (5-year CSS rate: 56.3% vs. 56.1%, HR = 1.0160, p = 0.6091), and a prognosis between adenocarcinoma cases in stage IIb and squamous cell carcinoma cases in stage IB was similar as well (5-year CSS rate: 43.5% vs. 43.6%, HR = 0.9531, p = 0.1487). Similar results were also found among other sub-stages ( Figure 1C).
Thus, by maintaining the T, N, and M definitions of the current staging system, we regrouped the stages and sub-stages, and proposed a modified stage of the TNN staging system for NSCLC by the unsupervised learning result from the SEER cohort (Figure 2). Definitions of the 7 and 8th editions of AJCC/UICC TNM staging system and the modified staging system were showed in Figure 2.
The 5-year CSS rates of the modified stage I to IV, were 64.1, 34.5, 12.6, and 3.7%, and of the modified sub-stage Ia to IV, were 70.5, 55.0, 40.2, 29.1, 16.7, 12.7, 8.2, and 3.7%, respectively (Figures 1B,D). HRs for comparisons between the modified stage I and stage II, stage II and stage III, and stage III and stage IV, were 0.4003 [p < 0.0001, 95% CI, (0.3903 to 0.4105)], 0.492 [p < 0.0001, 95% CI, (0.4815 to 0.5027)], and 0.6286 [p < 0.0001, 95% CI, (0.6178 to 0.6395)], respectively ( Table 2). Similar findings were also observed in comparisons among the modified substage groups ( Table 2). After the modification, the proportion of patients in stage Ia to IV is compared with the former, shows that the rationality and proportionality. In the modified staging system, more satisfactory discrimination of survival curves of sub-stages was shown, and similar results were detected in the FDUZH and PKUPH cohorts ( Table 2).

Comparison of Survival Outcomes Based on the Current and Modified 8th TNM Staging Systems
Comparing survival curves using the current TNM 8th staging system, the modified staging system indicated improved discrimination of survival curves for all cohorts from the SEER, FDZSH, and PKUPH databases (Figures 1, 3, 4). Accordingly, HRs for the comparisons between stage I and stage II, stage II and stage III, and stage III and stage IV improved substantially in the modified staging system ( Table 2). However, according to the modified staging system, the 5-year CSS rates of stage I to IV patients from the FDZSH cohort were 84.7, 52.9, 26.7, and 14.3%, respectively, and patients from the PKUPH cohort were 87.6, 60.5, 45.0, and 20.2%, respectively (Figures 3B, 4B) Table 2). The modified staging system showed superior discrimination and standardization of survival. Similar results of the 5-year CSS rates and HRs were also identified for the analyses among sub-stages, according to the current staging system, and modified staging system (Figures 3C,D, 4C,D; Table 2).

Discrimination and Calibration Ability of the Current and Modified 8th TNM Staging Systems
C-indices of different staging systems for NSCLC are presented. The C-index of the modified staging system was significantly superior to that of the 8th staging system in all three cohorts.
[SEER cohort: 0.722, 95% CI, (0.721-0.723) vs. 0.643, 95% CI,  Figures 1-4, which showed good agreement with the actual observations for 3-, and 5-year CSS. Thus, both in the discrimination test and calibration test of our modified staging system, the results showed superior predictive ability agreement with the actual observations for 3-, and 5-year CSS.

DISCUSSION
In our study, we used the unsupervised learning method by the deep learning to create a hierarchical clustering model for recursive partitioning and selection of the principal groupings, based on a large study cohort; therefore, the TNM stages with similar survival could be classified as the same group as much as possible. Based on the SEER database, we calculated and rebuilt a modified staging system according to the lung adenocarcinoma and squamous cell carcinoma data. Patients from the SEER database and FDZSH and PKUPH cohorts were   then used to validate the reliability of our modified model, with results indicating that our modified staging system was more accurate in predicting the prognoses of patients with lung adenocarcinoma and squamous cell carcinoma. Likewise, the prognoses of different sub-stages with adenocarcinoma or squamous cell carcinoma differences were better discriminated in our modified staging system. We believe that our method using machine learning for modifying staging could have a positive impact on the effectiveness of prognostic estimation and benefit the staging systems of other cancers, and not only that of NSCLC. It seemed that survival prediction could be improved by machine learning.
For the last 40 years, the AJCC/UICC TNM Staging System of NSCLC has been regarded as the most precise model for the prognostic classification of patients with lung cancer and was well accepted in clinical practice. However, the new edition AJCC/UICC staging system may not be able to resolve the existing controversy regarding the differential survival and prognosis in the same stage of lung squamous cell carcinoma and lung adenocarcinoma and there are still problems of subjective ways for staging in AJCC/UICC TNM staging system. Demonstratively, the discrimination of prognoses of sub-stages, particularly in the sub-stage IIA and IIB, was unsatisfactory in the current TNM staging system, as shown in our analysis.  Frontiers in Oncology | www.frontiersin.org A retrospective study in a large-scale Japanese cohort has identified significant differences in survivals between patients with adenocarcinoma and squamous cell carcinoma with 5year survival rates of 78% in adenocarcinoma patients and 63% in squamous cell carcinoma patients (6). In particular, squamous cell carcinoma patients with stage I disease showed a significantly worse outcome than did adenocarcinoma patients (p < 0.0001), which indicated that different management and prognosis may exist in these patients. Evidence also suggests that lung adenocarcinoma and lung squamous cell carcinoma differ in the composition of genes and molecular characteristics (5, 16), such as EGFR gene mutations. It is noteworthy that outcomes are dynamic and change progressively with the new therapies, surgical, and radiotherapy techniques. The outcomes for patients treated in 2006, with only chemotherapy as the standard of care in advanced disease is not the same that in 2015 with targeted therapies or immunotherapy available. With the increased use of epidermal growth factor receptor-tyrosine kinase inhibitors, the survival rate of lung adenocarcinoma patients has improved substantially (17). However, few effective therapeutic targets for squamous cell carcinoma have been discovered (18)(19)(20)(21)(22).
As stated previously, the problems of subjective ways for staging and difference between lung adenocarcinoma patients and lung squamous cell carcinoma patients in survival cannot be ignored, like the current staging system of esophageal carcinoma, especially for surgeons. Therefore, we proposed to recalculation the TNM staging system for NSCLC, by considering survival differences of adenocarcinoma and squamous cell carcinoma and basing on the machine learning for the survival data for hierarchical clustering, which could have higher prognosis prediction and clinical guidance value for patients with NSCLC. Our results showed that the modified staging system was superior to the current TNM staging system in accuracy and reliability of predicting the prognosis of NSCLC.
Comparing the current TNM staging system, survival curves using the modified staging system were more sufficiently separated among sub-stages. In our new modified staging system, cases of patients with T1N1M0 adenocarcinoma will now be classified as stage IB, reflecting their better outcomes than those of cases involving tumors that remain in stage IIB. Similarly, the category T1N2M0, T2N1M0, and T3N0M0 of adenocarcinoma will move from IIB or IIIA to IIA. In addition, cases of T4N0M0 of adenocarcinoma, T3-4N1M0 of adenocarcinoma, and T2N2M0 of adenocarcinoma will now be classified as IIB, not IIIA, as was the case previously. In addition, T3-4N2M0 of adenocarcinoma and T2-4N3M0 of adenocarcinoma would also shift from IIIC to IIIB, IIIB to IIIA, and from IIIA to IIB. However, cases of T1-2N0M0 of squamous cell carcinoma would now move from IA to IB, IB to IIA, and from IIA to IIB, which reflect their worse outcome than that of cases involving tumors that remain in the original stage. Similarly, the results of the different survival rates of patients with lung adenocarcinoma and squamous cell carcinoma in the same TNM stage have been shown in several studies, which provides strong support for our modified staging system (7,23).
In summary, compared with adenocarcinoma cases, squamous cell carcinoma cases would usually have been at a higher stage than adenocarcinoma cases of the same TNM. However, in the modified staging system, a worse outcome of squamous cell carcinoma than that of adenocarcinoma was noted. It is noticeable that these differences of survival and prognosis are often overlooked in clinical practice. Importantly, clinicians should undertake a comprehensive evaluation of patients with different histological data when they make clinical decisions, especially surgeons. Our results indicated that some cases of T1 with N0 disease but category M1 of both adenocarcinoma and squamous cell carcinoma also shifted from IV to IIIB or IIIC, and similarly, some cases of T2 with N2 disease but category M1 of adenocarcinoma moved from IV to IIIC, which was also noticeable. To a certain extent, we suggested that our results may indicate that compared with other M1 stage types, oligometastasis and M1a metastasis may have a better prognosis, which has been reported and improved in several studies these years (24,25).
Inevitably, this study had several limitations. Our modified staging system was calculated and rebuilt based on the SEER database. Although this analysis included a large study cohort from the SEER database, which was population-based and provided detailed information regarding the patients, the prognosis of similar patients from other countries or ethnicities may be different from our cases. We regretted that we did not have access to the available data of driver oncogenes in the stage IV and data from systemic or radical therapies in early stage. In our study, a new unsupervised learning method was applied, which could provide a more accurate and reliable modified staging system, provided that a wide range of data could be analyzed. Second, the numbers of patients with stage III and stage IV in the FDZSH and PKUPH cohorts were small, because these patients did not receive surgery, which might have reduced the discrimination of the modified staging system to stage III and IV in these two cohorts, while C-indices still showed better predictive ability and discrimination of the modified staging system in stage III and IV. Importantly, we have to admit that the validation cohorts from Fudan and Peking university incorporate only patients who underwent surgical resection, whereas a staging system needs to be applicable to patients managed both surgically and non-surgically. Third, our modified staging system had instructional significance in the differentiation of prognoses; however, it is unclear whether this modified staging system would have better value in clinical practice. Thus, it is necessary to confirm our results using a large multi-institutional database and with multi-center large sample studies. Although our study proposed to reflect the differences of patients with NSCLC according to their different histological data, patients with large cell carcinoma were not considered, because of its low incidence, controversy of WHO classification, and unclear prognosis31. Finally, incorrect coding or erroneous data may have existed in the SEER database, and this source of error would be difficult to identify.

CONCLUSION
The problems of staging and difference between lung adenocarcinoma and squamous cell carcinoma patients in survival should not be ignored when developing a more accurate discriminatory ability and prognostic performance in surgical practice. Differences of survival and more accurate and reliable prognosis in patients have been identified, which may refine prognostication over anatomic extent of TNM staging system. Staging system could be recalculated and improved by machine learning, which could have a positive impact on the effectiveness of prognostic estimation in the next edition TNM stage.

DATA AVAILABILITY
The datasets generated for this study are available on request to the corresponding author.

AUTHOR CONTRIBUTIONS
ML and CZ: substantial contributions to the conception or design of the work, or the acquisition, analysis, or interpretation of data for the work. ML, CZ, and MF: drafting the work or revising it critically for important intellectual content. ML, QW, JW, XS, WJ, YS, XY, CZ, and MF: provide approval for publication of the content. ML, QW, CZ, and MF: agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

ACKNOWLEDGMENTS
We would like to thank International Science Editing Co. for editing the language.