Development of a Survival Prognostic Model for Non-small Cell Lung Cancer

Lung cancer is a leading cause of cancer-related death, and >80% of lung cancer diagnoses are non-small-cell lung cancer (NSCLC). However, prognosis can vary significantly even when current staging and prognostic indices are used. In the present study, we calculated a prognostic index for predicting overall survival (OS) in NSCLC patients. The data of 545 NSCLC patients were retrospectively reviewed. Univariate and multivariate Cox proportional hazards regression analyses were performed to evaluate the prognostic value of clinicopathological factors. Age (hazard ratio [HR] = 1.25, 95% confidence interval [CI] = 1.02–1.54), TNM stage (III, HR = 1.64, 95% CI = 1.08–2.48; IV, HR = 2.33, 95% CI = 1.48–3.69), lung lobectomy (HR = 1.96, 95% CI = 1.45–2.66), chemotherapy (HR = 1.42, 95% CI = 1.15–1.74), and pretreatment hemoglobin level (HR = 1.61, 95% CI = 1.28–2.02) were independent prognosticators. A prognostic index for NSCLC (PInscl, 0–6 points) was calculated based on age (≥65 years, 1 point), tumor-node-metastasis (TNM) stage (III, 1 point; IV, 2 points), lung lobectomy (no, 1 point), chemotherapy (no, 1 point), and pretreatment hemoglobin level (low, 1 point). In comparison with the “PInscl = 0” subgroup (survival time = 2.71 ± 1.86 years), the “PInscl = 2” subgroup (survival time = 1.86 ± 1.24 years), “PInscl = 3” subgroup (survival time = 1.45 ± 1.07 years), “PInscl = 4” subgroup (survival time = 1.17 ± 1.06 years), “PInscl = 5” subgroup (survival time = 0.81 ± 0.78 years), and “PInscl = 6” subgroup (survival time = 0.65 ± 0.56 years) exhibited significantly shorter survival times. Kaplan-Meier survival analysis showed that patients with higher PInscl scores had poorer OS than those with lower scores (log-rank test: χ² = 155.82, P < 0.0001). The area under the curve of PInscl for predicting the 1-year OS was 0.73 (95% CI = 0.69–0.77, P < 0.001), and the PInscl had a better diagnostic performance than the Karnofsky performance status or TNM stage (P < 0.01).
In conclusion, the PInscl, which is calculated from age, TNM stage, lung lobectomy, chemotherapy, and pretreatment hemoglobin level, significantly predicted OS in NSCLC patients.

Abbreviations: NSCLC, non-small cell lung cancer; TNM, tumor-node-metastasis; EGFR, epidermal growth factor receptor; ALK, anaplastic lymphoma kinase; ROS1, c-ros oncogene 1; BRAF, v-raf murine sarcoma viral oncogene homolog B1; OS, overall survival; GPS, Glasgow prognostic score; ROC, receiver operating characteristic; KPS, Karnofsky performance status; LPHb, low pretreatment hemoglobin; NPHb, high pretreatment hemoglobin; HR, hazard ratio; CI, confidence interval; AUC, area under the curve; NPV, negative predictive value; PPV, positive predictive value; IASLC, International Association for the Study of Lung Cancer.

A systematic review (7) of 887 articles and our previous study (8) revealed that there are 169 different clinical and laboratory parameters (including pretreatment hemoglobin and carcinoembryonic antigen levels, performance status, sex, weight, metastases, etc.) and molecular prognostic factors that affect survival in NSCLC patients. However, these clinical and laboratory parameters are inconsistent and not commonly used in clinical practice or trial design. Further, assessing molecular prognostic factors such as EGFR, ALK, ROS1, BRAF, and p53 mutations is not only time-consuming but also expensive. Therefore, a practical prognostic model for predicting overall survival (OS) in NSCLC patients is needed. Many prognostic models incorporating various parameters have been reported. These models include the Glasgow prognostic score (GPS) (9), modified GPS (9), laboratory prognostic index (10), and advanced lung cancer inflammation index (11), all of which use serum parameters assessed in routine laboratory tests, but not clinical parameters. Further, Blanchon et al. assessed the prognostic ability of multiple variables, including age, sex, performance status, histological type, and TNM stage, and developed a validated prognostic index (12) in which performance status and TNM stage played major roles.
In the present study, we retrospectively reviewed data from 545 NSCLC patients and calculated a prognostic index (PInscl) for predicting OS in NSCLC patients based on age, TNM stage, lung lobectomy, chemotherapy, and pretreatment hemoglobin levels. The prognostic value of the PInscl was evaluated with receiver operating characteristic (ROC) curve analysis and compared with those of the Karnofsky performance status (KPS) and TNM stage.

Patients
All case records of patients with lung cancer admitted to the Huaihe Hospital of Henan University (Henan, China) from May 1, 2010 to June 30, 2017 were analyzed. The inclusion criteria were: (1) NSCLC newly diagnosed at the Huaihe Hospital; (2) histologically or cytologically confirmed NSCLC; and (3) staged according to the TNM staging system (13). The exclusion criteria were: (1) small cell lung cancer; (2) insufficient clinical data; (3) insufficient laboratory data; (4) clinical evidence of active infection or inflammation; (5) hematological disease; and (6) pulmonary embolism, acute myocardial infarction, or cerebrovascular accident within 1 month of diagnosis. After excluding 191 ineligible patients, 545 patients with NSCLC were selected for the present study (Figure 1). This study was carried out in accordance with the recommendations of the Medical Ethics Committee of Huaihe Hospital, Henan University. The protocol was approved by the Medical Ethics Committee of Huaihe Hospital. All subjects gave written informed consent in accordance with the Declaration of Helsinki.
Data were retrospectively collected from the patients' case records, including demographic information (age, sex, cigarette smoking, alcohol consumption, and family history of cancer), date of diagnosis and death (obtained from the patients' medical records, local death registration departments, and telephone follow-ups), cancer stage at the time of diagnosis (according to the 8th Edition of the TNM Classification for Lung Cancer) (13), KPS score (≥80 indicated that the patient was able to live and work with mild symptoms or signs and <80 indicated that the patient was unable to live and work normally) (14), therapeutic method (obtained from the patients' medical records), and pretreatment hemoglobin levels [<120 g/L was defined as low pretreatment hemoglobin (LPHb) in men and <110 g/L was defined as LPHb in women according to the normal reference range of hemoglobin in the Chinese population].
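The sex-specific hemoglobin cutoffs described above can be written as a small rule. The following is a minimal sketch, not the authors' code; the function name `is_low_pretreatment_hb` and the string labels for sex are hypothetical choices made for illustration.

```python
def is_low_pretreatment_hb(hb_g_per_l: float, sex: str) -> bool:
    """Return True if pretreatment hemoglobin counts as LPHb.

    Cutoffs follow the text: <120 g/L in men, <110 g/L in women.
    The labels "male"/"female" are an assumption of this sketch.
    """
    cutoffs = {"male": 120.0, "female": 110.0}  # g/L
    key = sex.strip().lower()
    if key not in cutoffs:
        raise ValueError(f"unknown sex label: {sex!r}")
    # Strictly below the cutoff is low; a value equal to the cutoff is not.
    return hb_g_per_l < cutoffs[key]


if __name__ == "__main__":
    print(is_low_pretreatment_hb(118.0, "male"))    # below the 120 g/L cutoff
    print(is_low_pretreatment_hb(118.0, "female"))  # above the 110 g/L cutoff
```

The same hemoglobin value can therefore fall into different groups for men and women, which is why sex has to be passed alongside the measurement.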

Follow-Up
Patients with NSCLC were followed from the date of diagnosis to the date of death or June 25, 2017, whichever came first. OS for each patient was defined as the number of days from the date of diagnosis to the date of death or final follow-up. Person-years were calculated for each subject. Treatments were initiated upon diagnosis and the treatment methods were not mutually exclusive; a patient may have undergone lobectomy, chemotherapy, and radiation simultaneously.
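The follow-up rule above (death or June 25, 2017, whichever came first) can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, and the conversion of days to person-years by dividing by 365.25 is an assumption, since the text does not state how person-years were derived.

```python
from datetime import date
from typing import Optional, Tuple

# End of follow-up as stated in the text.
FOLLOW_UP_END = date(2017, 6, 25)


def overall_survival(diagnosis: date, death: Optional[date]) -> Tuple[int, bool]:
    """Return (OS in days, event observed).

    A death on or before the end of follow-up is an observed event;
    otherwise the patient is censored at the end of follow-up.
    """
    if diagnosis > FOLLOW_UP_END:
        raise ValueError("diagnosis date is after the end of follow-up")
    if death is not None and death <= FOLLOW_UP_END:
        return (death - diagnosis).days, True
    return (FOLLOW_UP_END - diagnosis).days, False


def person_years(os_days: int) -> float:
    """Convert days to person-years (365.25 days per year is an assumption)."""
    return os_days / 365.25


if __name__ == "__main__":
    days, died = overall_survival(date(2015, 1, 1), date(2015, 1, 31))
    print(days, died, round(person_years(days), 3))
```

Returning the event flag together with the duration keeps censored patients distinguishable from deaths, which the Cox and Kaplan-Meier analyses both require.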
LPHb were categorized into the reference group and the observed group, and hazard ratios (HR) with 95% confidence intervals (CI) were calculated to estimate associations between the observed factors and OS in patients with NSCLC. After discarding the factors that were not significant in the multivariate analysis, the final Cox model included age, TNM stage, lung lobectomy, chemotherapy, and pretreatment hemoglobin. Interaction effects between pairs of prognostic factors were tested using multivariate analysis. For each included variable, the proportional hazards assumption was assessed using the Schoenfeld and scaled Schoenfeld residuals.
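For readers less familiar with Cox regression output, the sketch below shows the standard relationship between a regression coefficient, its standard error, and the reported HR with a 95% CI (a Wald interval on the log scale). It is an assumption that the reported intervals were of this kind; the text does not say. The function names are hypothetical, and the only numbers used are the stage IV estimates reported in this article.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def hr_with_ci(beta: float, se: float) -> tuple:
    """Return (HR, lower, upper) from a Cox coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))


def se_from_ci(lower: float, upper: float) -> float:
    """Recover the standard error of log(HR) from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


if __name__ == "__main__":
    # Stage IV in this article: HR = 2.33, 95% CI = 1.08-2.48 is stage III,
    # so the stage IV interval 1.48-3.69 is used here.
    se = se_from_ci(1.48, 3.69)
    hr, lo, hi = hr_with_ci(math.log(2.33), se)
    print(round(hr, 2), round(lo, 2), round(hi, 2))
```

The interval is symmetric around log(HR) rather than around the HR itself, which is why the upper bound sits further from the point estimate than the lower bound.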
We developed a PInscl that included age, TNM stage, lung lobectomy, chemotherapy, and pretreatment hemoglobin based on the results of the final Cox model. Age ≥ 65 years, TNM stage III, not undergoing lung lobectomy, not receiving chemotherapy, and having LPHb were given 1 point; TNM stage IV was given 2 points. The minimum PInscl score was 0 and the maximum PInscl score was 6 (Supplementary Table 1). The OS, HR, and 95% CI were calculated for each PInscl score. Associations between PInscl score and OS were evaluated using the Peto-Peto-Prentice test. Survival curves were generated using the Kaplan-Meier method, and the log-rank test was used to examine differences in OS between patients with different PInscl scores.
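The point assignment described above can be expressed as a short function. This is a minimal sketch rather than the authors' implementation: the function name and argument names are hypothetical, and assigning 0 points to stages I and II is inferred from the text, which only lists points for stages III and IV (Supplementary Table 1 is not reproduced here).

```python
def pinscl_score(age_years: float, tnm_stage: str, lobectomy: bool,
                 chemotherapy: bool, low_hb: bool) -> int:
    """Compute the PInscl (0-6 points) from the five factors in the text."""
    # Stages I and II scoring 0 is an assumption inferred from the text.
    stage_points = {"I": 0, "II": 0, "III": 1, "IV": 2}
    stage = tnm_stage.strip().upper()
    if stage not in stage_points:
        raise ValueError(f"unknown TNM stage: {tnm_stage!r}")

    score = stage_points[stage]
    score += 1 if age_years >= 65 else 0   # age >= 65 years
    score += 0 if lobectomy else 1         # no lung lobectomy
    score += 0 if chemotherapy else 1      # no chemotherapy
    score += 1 if low_hb else 0            # low pretreatment hemoglobin
    return score


if __name__ == "__main__":
    # A 70-year-old stage IV patient with no lobectomy, no chemotherapy,
    # and low hemoglobin reaches the maximum score.
    print(pinscl_score(70, "IV", False, False, True))
```

Because every factor contributes at most 1 point except stage IV, the score can be tallied at the bedside without a calculator.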
The discriminatory ability of the PInscl score was tested by assessing the area under the ROC curve (AUC). Further, the AUC of PInscl was compared with those of the KPS and TNM staging using the DeLong test (15). In addition, we calculated the sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV) of the prognostic score.
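The quantities named above can be illustrated with a small, dependency-free sketch. The AUC is computed here through its rank interpretation (the probability that a randomly chosen patient with the event has a higher score than one without, counting ties as one half), and the four diagnostic metrics come from a 2 × 2 table at a chosen cutoff. The function names, the cutoff, and the toy data are illustrative assumptions and are not study data; the DeLong comparison itself is not implemented.

```python
def auc(scores, labels):
    """AUC as the fraction of (event, non-event) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _ratio(num, den):
    # Undefined ratios (empty denominator) are reported as NaN.
    return num / den if den else float("nan")


def diagnostic_metrics(scores, labels, cutoff):
    """Sensitivity, specificity, PPV, and NPV for 'score >= cutoff' as positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


if __name__ == "__main__":
    # Toy example: label 1 = died within 1 year, 0 = survived 1 year.
    toy_scores = [0, 1, 2, 3, 4, 5]
    toy_labels = [0, 0, 1, 0, 1, 1]
    print(auc(toy_scores, toy_labels))
    print(diagnostic_metrics(toy_scores, toy_labels, cutoff=3))
```

Unlike the AUC, the four metrics depend on the cutoff, so they should always be reported together with the threshold that was used.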
All statistical analyses were performed using Stata software version 13 (Stata Corporation, College Station, TX, USA). P < 0.05 was considered to indicate a statistically significant difference for all analyses.

DISCUSSION
The results of the present study highlighted the importance of prognostic models in estimating prognosis in NSCLC patients. Our prognostic model, the PInscl, was based on age, TNM stage, lung lobectomy, chemotherapy, and pretreatment hemoglobin level. The PInscl had a statistically significant discriminative ability to predict OS. Further, the PInscl had a statistically better diagnostic performance than the KPS score or the TNM stage for 1- to 5-year OS (Table 5). This might be because the PInscl included other factors, making it more comprehensive and sensitive.
In previous studies, age has been recognized as a prognostic factor for NSCLC using cut-off values of 80, 75, 70, and even 50 years (16)(17)(18)(19). In the present study, age <65 years was associated with a longer survival time in both univariate (HR = 1.42, 95% CI = 1.18-1.73) and multivariate (HR = 1.23, 95% CI = 1.00-1.52) analyses. We also analyzed age as a continuous variable, but it was not significantly correlated with OS. The TNM staging system, which classifies cancer according to the size and extension of the primary tumor, its lymphatic involvement, and the presence of metastases, is frequently used in clinical practice to predict prognosis (20). Its reliability has been fully established through the IASLC (International Association for the Study of Lung Cancer) study (21). In our present study, stage III (HR = 1.64, 95% CI = 1.08-2.48) and stage IV (HR = 2.33, 95% CI = 1.48-3.69) disease were indicative of a poorer prognosis (Table 3). However, as the coefficient of the TNM stage was not more than two times those of other factors in multivariate analysis (data not shown), we did not emphasize it in our model, as Blanchon et al. did (12).
Anemia is linked to prognosis, and hemoglobin has long been recognized as a prognostic factor for NSCLC patients (22)(23)(24)(25). We found that hemoglobin <120 g/L in men and <110 g/L in women was associated with a shorter OS (HR = 1.62, 95% CI = 1.29-2.03).
In many cases, lung lobectomy is still the most effective treatment method for NSCLC (26). The impact of minimally invasive lobectomy and thoracotomy lobectomy on survival has also been assessed (27). However, whether lobectomy is performed depends on the clinical situation of the individual NSCLC patient (28). In the present study, surgical resection was not recommended for stage IV patients. Therefore, although we found that lung lobectomy was an independent prognostic factor for NSCLC patients, we cannot say whether a physical condition suitable for lobectomy, lobectomy itself, or both contributed favorably to OS. Regardless, lung lobectomy was an independent prognostic factor in the model.
Chemotherapy is another major treatment method for NSCLC (29), and more chemotherapies have become clinically available (30). We found that chemotherapy was an independent prognostic factor in both univariate and multivariate analysis. This result was in line with those of previous studies (8,31). However, patients received both cisplatin- and paclitaxel-based chemotherapies, and we did not divide the patients into subgroups, which may have affected the results. Chemotherapy, particularly cisplatin-based adjuvant chemotherapy, might also improve survival among patients with completely resected NSCLC (32). Although we could not exclude its potential long-term influence, we did not find a significant synergistic effect of chemotherapy and lung lobectomy (data not shown).
This study has several strengths. First, the PInscl can be simply calculated and used in almost all NSCLC patients. Data on age, TNM stage, lung lobectomy, chemotherapy, and pretreatment hemoglobin are easy to obtain and do not require exhaustive testing or complicated biological examination. Second, it is practicable. OS can be predicted simply from the PInscl score, which is meaningful for patients, their families, and clinicians. ROC curve analysis showed that the PInscl score was a fairly predictive index and was more sensitive than the KPS score and TNM stage.
However, the study also has limitations. First, selection bias may be a concern due to the monocentric design of the study and the absence of random sampling, even though exhaustive inclusion of consecutive cases over 5 years should alleviate the bias. Second, the discriminative power of the PInscl was not assessed in a population with features different from that in which it was derived. Third, the model does not include mutational information (e.g., EGFR/ALK mutations). Fourth, the lack of a validation cohort might weaken the power of the present study. Therefore, whether the PInscl can be extrapolated to other NSCLC populations needs further verification.
By developing this simple prognostic index, we suggest that the PInscl, which is calculated from age, TNM stage, lung lobectomy, chemotherapy, and pretreatment hemoglobin level, might significantly predict OS in NSCLC patients.

DATA AVAILABILITY STATEMENT
The datasets analyzed in this article are not publicly available.
Requests to access the datasets should be directed to Yuquan Lu (lll3923@gmail.com).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Medical Ethics Committee of Henan University, Huaihe Hospital. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
Y-HZ made substantial contributions to data collection and was a major contributor in writing the manuscript. YL analyzed and interpreted the data, contributed to manuscript preparation and revision, and gave final approval for the version to be published. HL was responsible for the acquisition of data and institutional review board application, conducted data interpretation, and gave final approval for the version to be published. Y-MZ agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.