Development and validation of nomograms to predict early death for elderly lung cancer patients

Background: Owing to population aging, the average age of lung cancer (LC) patients has increased in recent years. The purpose of this study was to determine the risk factors for early death (death within three months of diagnosis) in elderly (≥ 75 years old) LC patients and to develop nomograms that predict its probability.
Methods: Data on elderly LC patients were obtained from the SEER database using the SEER*Stat software. All patients were randomly divided into a training cohort and a validation cohort at a ratio of 7:3. Risk factors for all-cause and cancer-specific early death were identified by univariate logistic regression followed by backward stepwise multivariable logistic regression in the training cohort, and these risk factors were used to construct nomograms. The performance of the nomograms was evaluated with receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) in both the training and validation cohorts.
Results: A total of 15,057 elderly LC patients in the SEER database were included in this research and randomly divided into a training cohort (n = 10,541) and a validation cohort (n = 4,516). The multivariable logistic regression models identified 12 independent risk factors for all-cause early death and 11 independent risk factors for cancer-specific early death in elderly LC patients, which were then integrated into the nomograms. The ROC curves indicated that the nomograms had high discriminative ability in predicting all-cause early death (AUC in training cohort = 0.817, AUC in validation cohort = 0.821) and cancer-specific early death (AUC in training cohort = 0.824, AUC in validation cohort = 0.827). The calibration plots of the nomograms were close to the diagonal line, indicating good concordance between the predicted and observed probabilities of early death in the training and validation cohorts. Moreover, DCA indicated that the nomograms had good clinical utility in predicting early death probability.
Conclusion: Nomograms based on the SEER database were constructed and validated to predict the probability of early death in elderly LC patients. The nomograms are expected to have high predictive ability and good clinical utility, which may help oncologists develop better treatment strategies.


Introduction
Lung cancer (LC) is one of the most common malignant tumors and a leading cause of cancer-related death, with an estimated 2.2 million newly diagnosed patients and 1.8 million deaths worldwide in 2020 (1). With the rapid development of diagnostic techniques and increased awareness of health examinations, the early detection rate of LC has improved markedly. The prognosis of LC is related to the American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) staging system (2). However, although the AJCC staging system incorporates tumor size, lymph node involvement, and metastasis, it neglects patient heterogeneity (3). Therefore, a more comprehensive and accurate prognostic model is needed.
The average age of cancer patients has gradually increased owing to the rapid aging of the global population (4). A single-institution study reported that the rate of systemic therapy was significantly lower in elderly LC patients than in younger patients (5), and a recent study found that elderly LC patients have a higher early mortality rate (6). In addition, mortality (8.2 vs. 2.2%) and postoperative complications (26.0 vs. 13.3%) were significantly higher in elderly LC patients than in younger patients (7). Moreover, about 70% of LC cases and more than 70% of LC deaths occur in elderly patients (8). Early death is defined as death occurring within three months after diagnosis (6). A deeper understanding of the risk factors for early death in elderly LC patients may help physicians improve these patients' survival and quality of life and provide individualized treatment strategies.
The Surveillance, Epidemiology, and End Results (SEER) database is a large-scale database that covers approximately 35% of the United States population. A nomogram is a tool for predicting cancer patient outcomes that can provide personalized risk estimates (9). In this research, we identified the risk factors for early death in elderly LC patients and constructed and validated nomograms for predicting the probability of early death in these patients. The nomograms may play an important role in identifying high-risk patients, selecting treatment strategies, and managing follow-up.

Patients and methods
Clinicopathological information on elderly (≥ 75 years old) LC patients diagnosed from 2010 to 2015 was extracted from the dataset "Incidence-SEER Research Plus Data, 18 Registries, Nov 2020 Sub (2010-2018)" in the SEER database. The SEER*Stat software (version 8.4.0; http://seer.cancer.gov/seerstat/) was used to obtain the clinicopathological information. The endpoints of this research were all-cause early death and cancer-specific early death. All-cause early death was defined as death from any cause within three months of diagnosis, and cancer-specific early death was defined as death due to LC within three months of diagnosis.
The inclusion criteria were as follows: (1) age ≥ 75 years; (2) Site recode ICD-O-3/WHO 2008: lung and bronchus; (3) only one primary tumor; (4) diagnosis confirmed by positive histology; (5) complete and clear data on all indicators, including sex, race, marital status, laterality, primary site, Grade, AJCC stage, T, N, M, surgery, radiation, chemotherapy, bone metastasis, brain metastasis, pathological type, liver metastasis, survival time, and survival status. All patients included in this study were randomly divided into a training cohort and a validation cohort at a ratio of 7:3.
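As a minimal sketch of a 7:3 random split (not the authors' R code), the following Python shuffles patient identifiers with a fixed seed and cuts the shuffled list at 70%. The function name `split_cohort`, the seed, and the use of integer IDs in place of SEER records are illustrative assumptions. The exact cohort sizes depend on the rounding or stratification used by the splitting routine, so they need not match the reported 10,541 and 4,516.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2023):
    """Randomly split patient IDs into training and validation lists.

    Hypothetical helper for illustration; the seed is arbitrary.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # local RNG so the split is reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# Integer IDs stand in for the 15,057 patient records.
train_ids, valid_ids = split_cohort(range(15057))
print(len(train_ids), len(valid_ids))
```

Fixing the seed makes the split reproducible, which matters because every later step (variable selection, nomogram fitting, validation) depends on which patients fall into each cohort.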

Nomograms development
A total of 17 candidate factors were entered into the univariate logistic regression models for all-cause and cancer-specific early death. Variables with p < 0.05 in the univariate models were then assessed by backward stepwise multivariable logistic regression in the training cohort. Based on the independent prognostic factors identified by multivariable logistic regression, predictive nomograms were constructed in the training cohort.
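For reference, the multivariable logistic model that underlies such a nomogram can be written as follows. This is the standard textbook form, not an equation taken from the paper; the symbols $p$, $x_j$, $\beta_j$, and $k$ are introduced here for illustration.

```latex
\[
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \sum_{j=1}^{k} \beta_j x_j,
\qquad
p = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \sum_{j=1}^{k} \beta_j x_j)\bigr)}
\]
```

Here $p$ is the probability of early death, the $x_j$ are indicator variables for the levels of the retained predictors, and $\exp(\beta_j)$ is the odds ratio of level $j$ relative to its reference category. Backward stepwise selection starts from the model containing all candidate predictors and removes them one at a time until a stopping criterion (commonly the Akaike information criterion) no longer improves; the paper does not state which criterion was used.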

Nomograms validation
Both the training and validation cohorts were used to validate the nomograms. ROC curves were adopted as the main index of discriminative power: the area under the ROC curve (AUC) is 0.5 for a model with no discriminative ability and 1.0 for perfect discrimination, and the higher the AUC, the better the discrimination.
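To make the AUC concrete, the sketch below computes it directly from its probabilistic interpretation: the chance that a randomly chosen patient who died early receives a higher predicted risk than a randomly chosen patient who did not, with ties counted as one half. The function name `auc` and the toy data are assumptions made for illustration; this is not the routine used in the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for early death, 0 otherwise; scores: predicted risks.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each outcome group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # concordant pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y = [0, 0, 1, 1]
risk = [0.10, 0.40, 0.35, 0.80]
print(auc(y, risk))  # 0.75
```

The pairwise loop is quadratic in cohort size; for cohorts of thousands of patients a rank-based formulation would be preferable, but the result is the same.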

Statistical analysis
Categorical variables were compared using the Chi-squared test and described as number and percentage (n, %), while quantitative data were compared using the t-test and described as mean ± standard deviation (SD). All analyses were performed with R software (version 4.0.2), and a two-tailed p-value < 0.05 was considered statistically significant.
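As an illustration of the Chi-squared comparison of a categorical variable between two cohorts, the sketch below computes the Pearson statistic for a table of counts. The function name `chi_squared_test` and the counts are invented for illustration and are not taken from the paper's tables. No continuity correction is applied, and the p-value is only computed for a 2 x 2 table, where it has a simple closed form.

```python
import math


def chi_squared_test(table):
    """Pearson chi-squared test for an r x c table of counts.

    Returns (statistic, degrees_of_freedom, p_value); the p-value is only
    computed for 1 degree of freedom (a 2 x 2 table) and is None otherwise.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    # For 1 degree of freedom the chi-squared survival function has the
    # closed form erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2)) if dof == 1 else None
    return stat, dof, p_value


# Invented counts (rows: male/female, columns: training/validation cohort).
counts = [[5300, 2270], [5241, 2246]]
stat, dof, p = chi_squared_test(counts)
print(f"chi2 = {stat:.3f}, df = {dof}, p = {p:.3f}")
```

A large p-value in such a comparison indicates that the variable is distributed similarly in the two cohorts, which is what a random split is expected to produce.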
As shown in Table 2, a total of 15,057 patients were randomly divided into the training cohort (n = 10,541) and the validation cohort (n = 4,516). There were no statistically significant differences between the training cohort and the validation cohort in sex (p = 0.995), race (p = 0.674), marital status

Univariate and multivariate logistic analysis in the training cohort
To screen the prognostic factors in elderly LC patients, univariate and multivariate logistic analyses were conducted in the training cohort. According to the univariate regression analysis (Table 3), sex, race, marital status, primary site, Grade, AJCC stage, T, N, M, surgery, radiation, chemotherapy, bone metastasis, brain metastasis, liver metastasis, and pathological type were significantly associated with all-cause and cancer-specific early death (all p < 0.05). Moreover, the backward stepwise multivariate logistic analysis revealed that sex, primary site, Grade, AJCC stage, T, N, surgery, radiation, chemotherapy, bone metastasis, brain metastasis, and liver metastasis were independent prognostic factors for all-cause early death, while sex, Grade, AJCC stage, T, N, surgery, radiation, chemotherapy, bone metastasis, brain metastasis, and liver metastasis were independent prognostic factors for cancer-specific early death in elderly LC patients (all p < 0.05; Table 4).

Nomogram construction and validation
Independent prognostic factors identified by the univariate and multivariate logistic analysis were then used to construct the nomograms. Each variable is assigned a corresponding number of points, and by adding the points of all variables, the total points and the corresponding probability of early death can be obtained. For example, for a female patient with a tumor in the middle lobe, Grade IV, AJCC stage IV, T3, N0, who received chemotherapy only and had bone metastasis only, the probability of all-cause early death was approximately 35%. As shown in Figure 1A, chemotherapy contributed most to all-cause early death, followed by surgery, brain metastasis, radiation, T, Grade, AJCC stage, bone metastasis, primary site, N, liver metastasis, and sex. For cancer-specific early death (Figure 1B), chemotherapy also contributed most, followed by surgery, brain metastasis, AJCC stage, T, radiation, Grade, bone metastasis, N, liver metastasis, and sex.
FIGURE 1 Nomograms for predicting all-cause (A) and cancer-specific early death (B) in elderly LC patients.
The nomograms were validated in the training cohort and the validation cohort. The ROC analysis indicated that the nomograms had high discriminative ability in predicting all-cause early death (AUC in training cohort = 0.817, AUC in validation cohort = 0.821) and cancer-specific early death (AUC in training cohort = 0.824, AUC in validation cohort = 0.827) (Figure 2). The calibration plots of the nomograms were close to the diagonal line, indicating good concordance between the predicted and observed probabilities of early death in the training and validation cohorts (Figure 3). Moreover, DCA was used to evaluate the clinical value of the nomograms, and the results indicated that the nomograms had good clinical utility in predicting early death probability (Figure 4).
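The way a nomogram turns per-variable points into a probability can be sketched as follows, assuming a common scaling convention in which the predictor with the widest coefficient range spans 0 to 100 points. All coefficients, the intercept, and the names (`COEFFICIENTS`, `points`, `total_points`, `probability_from_points`) are hypothetical and chosen only for illustration; they are not the fitted values behind Figure 1.

```python
import math

# HYPOTHETICAL log-odds coefficients, invented for illustration; they are
# NOT the fitted values from the paper. The reference level of each
# predictor has coefficient 0.
INTERCEPT = -2.0
COEFFICIENTS = {
    "chemotherapy": {"yes": 0.0, "no": 1.6},
    "surgery": {"yes": 0.0, "no": 1.2},
    "brain_metastasis": {"no": 0.0, "yes": 0.8},
    "sex": {"female": 0.0, "male": 0.2},
}

# The predictor with the widest coefficient range spans 0-100 points.
MAX_RANGE = max(max(b.values()) - min(b.values()) for b in COEFFICIENTS.values())
# Sum of each predictor's lowest coefficient (the zero-point baseline).
BASELINE = sum(min(b.values()) for b in COEFFICIENTS.values())


def points(variable, level):
    """Points assigned to one level of one predictor."""
    betas = COEFFICIENTS[variable]
    return 100.0 * (betas[level] - min(betas.values())) / MAX_RANGE


def total_points(patient):
    """Sum of points over all predictors for one patient."""
    return sum(points(var, level) for var, level in patient.items())


def probability_from_points(total):
    """Map total points back to the linear predictor, then to a probability."""
    lp = INTERCEPT + BASELINE + total * MAX_RANGE / 100.0
    return 1.0 / (1.0 + math.exp(-lp))


patient = {"chemotherapy": "no", "surgery": "no",
           "brain_metastasis": "no", "sex": "female"}
tp = total_points(patient)
print(f"total points = {tp:.1f}, predicted risk = {probability_from_points(tp):.3f}")
```

Because the points are a linear rescaling of the regression coefficients, reading a probability off the total-points axis is equivalent to evaluating the logistic model directly; the nomogram only makes that calculation possible by hand.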

Discussion
The baseline demographic and clinical characteristics of the patients included in this research were analyzed. Then, risk factors for the early death of elderly LC patients were identified by univariate and multivariate logistic analysis. In addition, prognostic nomograms for all-cause early death and cancer-specific early death of elderly LC patients were constructed and validated. According to the ROC analysis, calibration plots, and DCA, the nomograms exhibited high discriminative ability and good clinical utility in predicting early death probability. Population aging is a major factor in the increase in LC (10), and the prognosis of elderly LC patients remains a challenge worldwide. Elderly patients have more medical complications, a higher risk of toxic side effects and death, reduced tolerance to social, emotional, and physical stress, and reduced physiological reserve (11). Advanced age has been regarded as an independent risk factor for LC (12). Gen Yu et al. found that advanced age was not only significantly associated with poor overall survival but also negatively affected cancer-specific survival (13). Therefore, this research focuses on elderly LC patients and identifies those who are likely to die early.
FIGURE 2 ROC curves for the nomograms in predicting all-cause and cancer-specific early death in the training cohort (A,B) and the validation cohort (C,D).
Li et al. 10.3389/fsurg.2023.1113863
Consistent with our study, epidemiological studies have revealed that male LC patients have a poorer prognosis than female patients (14), and that female LC patients have higher response rates to chemotherapy and longer survival than male patients (15). Research in the United States demonstrated that the risk of death in male LC patients was significantly higher than in female LC patients (16). The difference between males and females may be attributable to smoking, because smoking rates are higher in males than in females and susceptibility to tobacco carcinogens varies by sex (17).
Previous studies have investigated a range of models that predict survival in LC patients, but the selected variables were relatively few (19). In this research, the logistic analysis demonstrated the contribution of 16 factors, and the results revealed that higher Grade, AJCC stage, T-stage, and N-stage, the absence of surgery, radiation, or chemotherapy, and the presence of bone, brain, or liver metastasis were associated with a high risk of all-cause and cancer-specific early death in elderly LC patients. Moreover, our results showed that chemotherapy was the strongest prognostic factor for early death, suggesting that chemotherapy can provide survival benefits and should be given to patients who can tolerate its side effects.
FIGURE 3 Calibration curves for the nomograms in predicting all-cause and cancer-specific early death in the training cohort (A,B) and the validation cohort (C,D).
The AJCC staging system, which comprises T-stage, N-stage, and M-stage, is a classical model for predicting the prognosis of cancer patients (2). Our research demonstrated that AJCC stage, T-stage, and N-stage were independent risk factors for all-cause and cancer-specific early death in elderly LC patients. T-stage reflects the tumor size, and tumor size is significantly related to prognosis in a variety of cancers (20-22). However, whether N-stage is a prognostic factor for LC is still controversial. Studies based on the SEER database revealed that the positive number of lymph nodes was correlated with improved overall survival (23,24). An Asian cohort study found that the number of lymph nodes was statistically associated with poor prognosis in LC patients (25). Moreover, Gen Yu et al. found that N-stage had no significant effect on the prognosis of LC patients (13). In this research, we found that N-stage was an independent risk factor for the early death of LC patients. However, all of these studies were retrospective analyses. Therefore, more prospective studies and larger cohort studies are needed to investigate the relationship between N-stage and prognosis in LC. In addition, the TNM stage does not incorporate other factors, such as treatment and histology, that may be associated with prognosis (26). Therefore, it is of great importance to construct a more comprehensive prognostic model. A nomogram is a prognostic model that can incorporate numerous variables, which may help oncologists identify high-risk patients and develop better treatment strategies.
FIGURE 4 Decision curve analysis (DCA) for the nomograms in predicting all-cause and cancer-specific early death in the training cohort (A,B) and the validation cohort (C,D).
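For readers unfamiliar with decision curves such as those in Figure 4, the quantity they plot is the net benefit at each threshold probability. The sketch below uses the standard definition (true positives minus false positives weighted by the odds of the threshold, both divided by cohort size); the function names and the toy data are assumptions made for illustration and do not reproduce the paper's analysis.

```python
def net_benefit(labels, risks, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for y, r in zip(labels, risks) if r >= threshold and y == 1)
    false_pos = sum(1 for y, r in zip(labels, risks) if r >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # relative harm of a false positive
    return true_pos / n - weight * false_pos / n


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    return net_benefit(labels, [1.0] * len(labels), threshold)


# Toy outcomes and predicted risks, invented for illustration only.
y = [1, 1, 0, 0, 0, 1, 0, 0]
risk = [0.9, 0.6, 0.4, 0.2, 0.1, 0.3, 0.7, 0.05]
for pt in (0.1, 0.3, 0.5):
    print(pt, round(net_benefit(y, risk, pt), 3),
          round(net_benefit_treat_all(y, pt), 3))
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.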

Frontiers in Surgery
Since 2010, the SEER database has included distant metastasis data, including bone metastasis, brain metastasis, lung metastasis, and liver metastasis. Distant metastases have been found to be the main cause of mortality in LC patients. Bone is one of the most common sites of malignant tumor metastasis, and its distinctive microenvironment favors the development and proliferation of tumor cells (27). Previous studies found that bone and liver metastases were independent prognostic factors for LC patients (28) and that brain metastasis was associated with shorter survival (29), which is consistent with our research. Brain metastases in LC patients can lead to neurological dysfunction, seriously endangering the patient's survival (30). Research has found that the overall early mortality of lung cancer was 27.5%, whereas the early mortality of patients with brain metastasis was as high as 44.4%, which indicates the poor prognosis of LC patients with brain metastasis.
However, this research has some limitations. First of all, the SEER database does not contain data on molecular pathologic markers. These factors may be effective supplements to the predictive models, and these will be the main part of further studies. Besides that, this is a retrospective study, and the conclusions need to be verified by prospective and larger studies.

Conclusion
It is of great importance to identify prognostic factors associated with the early death of elderly LC patients, and predictive nomograms can help clinicians identify high-risk patients, design individualized treatments, and develop better treatment strategies.

Data availability statement
Publicly available datasets were analyzed in this study. This data can be found here: http://seer.cancer.gov/.