Risk factors, survival analysis, and nomograms for distant metastasis in patients with primary pulmonary large cell neuroendocrine carcinoma: A population-based study

Introduction Pulmonary large cell neuroendocrine carcinoma (LCNEC) is a rapidly progressive, readily metastatic high-grade lung cancer with a poor prognosis once distant metastasis (DM) occurs. The aim of our study was to explore risk factors associated with DM in patients with LCNEC, to perform survival analysis, and to develop novel nomogram-based predictive models for screening at-risk populations in clinical practice. Methods The study cohort was derived from the Surveillance, Epidemiology, and End Results database, from which we selected patients diagnosed with LCNEC between 2004 and 2015 and formed a diagnostic cohort (n = 959) and a prognostic cohort (n = 272). The risk and prognostic factors of DM were screened by univariate and multivariate analyses using logistic and Cox regressions, respectively. Then, we established diagnostic and prognostic nomograms using the data in the training group and validated the accuracy of the nomograms in the validation group. The diagnostic nomogram was evaluated using receiver operating characteristic curves, decision curve analysis curves, and the GiViTI calibration belt. The prognostic nomogram was evaluated using receiver operating characteristic curves, the concordance index, the calibration curve, and decision curve analysis curves. In addition, high- and low-risk groups were classified according to the prognostic nomogram formula, and Kaplan–Meier survival analysis was performed. Results In the diagnostic cohort, a primary site close to the bronchus, larger tumor size, and higher N stage indicated a higher likelihood of DM. In the prognostic cohort (patients with LCNEC and DM), male sex, higher N stage, no surgery, and no chemotherapy were associated with poorer overall survival. Patients in the high-risk group had significantly lower median overall survival than those in the low-risk group. Conclusion The two newly established nomograms performed well in predicting DM in patients with LCNEC and in evaluating their prognosis. These nomograms could be used in clinical practice for screening of risk populations and treatment planning.


Introduction
Pulmonary large cell neuroendocrine carcinoma (LCNEC) was first reported in 1991. (1) In 2015, the World Health Organization removed LCNEC from the pathological classification of large cell carcinomas and placed it under pulmonary neuroendocrine neoplasms (NENs), a revision carried over to the latest 2021 edition. (2,3) LCNEC is an uncommon pathologic type that accounts for 3% of all lung malignancies. (4) Recent reports indicated that its incidence increased year by year, from 0.01/100,000 people in 1990 to 1.8/100,000 people in 2010, with its annual mortality doubling between 2004 and 2015. The survival time and rates for patients with stage I-III LCNEC were close to those of patients with non-small cell lung cancer (NSCLC), whereas those of patients with stage IV disease were more like those of patients with small cell lung cancer (SCLC). (4,5) Among NENs, LCNEC is similar to SCLC and is a high-grade, rapidly progressing, readily metastatic malignancy. (6) The incidence of brain metastases in patients with LCNEC was significantly higher than in patients with SCLC or NSCLC. (4, 7) Sex, age, primary tumor site, TNM stage, surgery status, and chemotherapy have been shown to be independent prognostic factors for LCNEC in previous studies. (8)(9)(10)(11)(12) However, the clinical management and treatment of LCNEC, such as the use of radiotherapy, remain controversial, and there are no standardized treatment approaches, especially for patients with LCNEC and distant metastasis (DM). As a result, a novel clinical predictive model is needed to assess the risk variables for the incidence and prognosis of LCNEC with DM so that early intervention may be provided to this high-risk population.
Nomograms have been widely utilized in the prognostic analysis of cancer because of their capacity to graphically and intuitively show risk factors related to prognosis. (13, 14) Moreover, they have been used to assess metastasis in patients with osteosarcoma, with good results. (15) Therefore, we used public data from the Surveillance, Epidemiology, and End Results (SEER) database to evaluate the risk variables related to DM in patients with de novo (primary) LCNEC and to conduct further prognostic analysis. We present here two nomograms: a diagnostic nomogram for the risk of DM and a prognostic nomogram for the overall survival of patients with DM.

Study population
We obtained data using SEER*Stat software v8.3. The exclusion criteria were (1) multiple primary tumors, (2) age < 18 years, and (3) pathology grade I or II (low-grade), as LCNEC is a high-grade neuroendocrine carcinoma. In addition, relevant clinicopathological characteristics, including age, sex, race, primary tumor site, laterality, pathology grade, TNM stage, and tumor size, were required to be available. We enrolled 959 patients in the diagnostic cohort (Supplementary Figure 1A). For the prognostic cohort, we further excluded (1) patients with no DM, (2) patients with a survival time < 1 month, and (3) patients for whom data on surgery, chemotherapy, or radiotherapy status were not available. The remaining 272 patients were enrolled in the prognostic cohort (Supplementary Figure 1B).
In the diagnostic cohort, the study population was randomly split into training and validation groups at a 7:3 ratio. The training and validation groups of the prognostic cohort were derived from those of the diagnostic cohort without regrouping. To investigate the factors associated with the incidence and the prognosis of DM in LCNEC, we created one nomogram model in each cohort and performed a survival analysis. Both models were constructed using the training groups and validated using the validation groups.
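The grouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patient IDs, the seed, the rounding rule, and the function names `split_train_validation` and `inherit_groups` are hypothetical and do not come from the study, which does not report how the random split was implemented.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=2024):
    """Randomly assign patient IDs to training and validation groups."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility only
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return set(ids[:n_train]), set(ids[n_train:])


def inherit_groups(subset_ids, train_ids):
    """Split a sub-cohort by reusing the parent cohort's assignment (no regrouping)."""
    subset = set(subset_ids)
    return subset & train_ids, subset - train_ids


# Placeholder IDs standing in for the 959 patients of the diagnostic cohort.
diagnostic_ids = range(959)
train_ids, valid_ids = split_train_validation(diagnostic_ids)

# Hypothetical subset standing in for patients with DM (the prognostic cohort).
prognostic_ids = range(0, 959, 4)
prog_train, prog_valid = inherit_groups(prognostic_ids, train_ids)
```

Reusing the parent assignment means that no patient can appear in the training group of one cohort and the validation group of the other.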

Variables collected
In the diagnostic cohort, the following variables were assessed: sex, age, race, laterality, T stage, N stage, primary site, pathological grade, and tumor size. In the prognostic cohort, variables included those in the diagnostic cohort as well as surgery, chemotherapy, and radiotherapy status. Further subgroup analysis was performed in the diagnostic and prognostic cohorts to prepare for nomogram establishment. Survival analysis was conducted in the prognostic cohort, with overall survival (OS) as the primary endpoint; OS was defined as the period from the initial diagnosis of LCNEC to death from any cause.

Statistical analysis
The study cohort was randomly grouped to form a training and a validation group. All variables were reclassified as categorical variables, and the clinicopathological characteristics of patients with LCNEC were compared between the training and validation groups using the Chi-squared test or, for some variables, Fisher's exact test.
In the diagnostic cohort, we utilized logistic regression analysis to analyze risk variables of DM in patients with LCNEC. First, the univariate analysis was performed to identify risk factors, with a two-sided P < 0.05 regarded as statistically significant. Then, significant variables were incorporated into the multivariable risk model, and odds ratios (ORs) and 95% confidence intervals (CIs) were computed. The independent risk factors selected by the model were then incorporated into a nomogram for visualization and clinical predictive analysis. Finally, we compared the novel nomogram with each individual risk variable using the receiver operating characteristic (ROC) curves of the training and validation groups and computed the area under the curve (AUC) to assess the validity of the novel nomogram. Decision curve analysis (DCA) and the GiViTI calibration belt were used to assess the reliability of the nomogram.
In the prognostic cohort, the risk variables for OS in patients with LCNEC with DM were assessed using Cox proportional hazards regression analysis. The variables with statistical significance (two-sided P < 0.05) in the univariate analysis were entered into the multivariable analysis to screen for independent risk factors related to prognosis. Hazard ratios (HRs) and 95% CIs were also calculated. According to the results of the univariate and multivariable analyses, a prognostic nomogram was established. The validity of the nomogram was assessed using the concordance index (C-index), as well as time-dependent ROC curves at 1, 2, and 3 years, based on the nomogram and individual prognostic risk factors. The reliability of the nomogram was evaluated using DCA curves and the calibration curves at 1, 2, and 3 years. All validations were carried out in the training group and the validation group. In addition, the nomogram algorithm was used to calculate individual risk scores from the risk variables. Based on the median risk score, the prognostic cohort was separated into high- and low-risk groups to prepare for the survival analysis. The Kaplan-Meier method was applied to assess the OS of the two risk groups, and the log-rank test was used to obtain P-values in the training and validation groups.
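For a 2x2 comparison between the training and validation groups, Fisher's exact test can be computed directly from the hypergeometric distribution. The sketch below is a minimal stdlib-only illustration; the function name and the example counts are hypothetical, and the text does not state which covariates were tested this way (a common rule, assumed here rather than taken from the study, is to prefer the exact test when expected cell counts are small).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p
        for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)  # small tolerance for floating-point ties
    )


# Hypothetical counts: rows = training/validation, columns = category present/absent.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

A large p-value in such a comparison is consistent with the two groups being balanced on that covariate.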
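To make the OR and 95% CI concrete, the sketch below computes them for a single binary risk factor from a 2x2 table. For one binary predictor, univariable logistic regression gives the same OR and Wald interval; the multivariable model used in the study requires a statistical package and is not reproduced here. The function name and all counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_wald_ci(exposed_dm, exposed_no_dm, unexposed_dm, unexposed_no_dm, z=1.96):
    """Odds ratio and Wald 95% CI for DM given a binary risk factor."""
    log_or = log((exposed_dm * unexposed_no_dm) / (exposed_no_dm * unexposed_dm))
    # Standard error of the log odds ratio (Woolf's method).
    se = sqrt(
        1 / exposed_dm + 1 / exposed_no_dm + 1 / unexposed_dm + 1 / unexposed_no_dm
    )
    return exp(log_or), exp(log_or - z * se), exp(log_or + z * se)


# Hypothetical counts: 40 of 100 exposed and 20 of 100 unexposed patients have DM.
odds_ratio, ci_low, ci_high = odds_ratio_wald_ci(40, 60, 20, 80)
```

An interval that excludes 1 corresponds to a two-sided P < 0.05 for that factor under the Wald test.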
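Two of the steps above, the C-index and the median split into risk groups, can be sketched as follows. This is an illustrative implementation of Harrell's C-index on toy data, not the study's code; the function names, the toy values, and the choice to place scores equal to the median in the low-risk group are assumptions.

```python
from statistics import median


def harrell_c_index(times, events, risk_scores):
    """Harrell's C: share of usable pairs in which higher risk means shorter survival."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i died before patient j's last follow-up.
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    return concordant / usable


def split_by_median(risk_scores):
    """Label each patient high- or low-risk relative to the median score."""
    cutoff = median(risk_scores)
    # Assumption: scores equal to the median are assigned to the low-risk group.
    return ["high" if s > cutoff else "low" for s in risk_scores]


# Toy data: survival in months, event flag (1 = death, 0 = censored), risk score.
times = [2, 4, 6, 8, 10]
events = [1, 1, 0, 1, 1]
scores = [0.9, 0.7, 0.8, 0.3, 0.1]

c_index = harrell_c_index(times, events, scores)
groups = split_by_median(scores)
```

A C-index of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking of survival times.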

Baseline characteristics of the diagnostic and prognostic cohorts
Our study included two major cohorts: the diagnostic cohort of patients with LCNEC and the prognostic cohort of patients with LCNEC with DM. Baseline characteristics are presented in Tables 1 and 2. Among the 959 patients with LCNEC, most were elderly and male. The most common primary tumor site was in the lung, and the most common tumor size was ≤3 cm. Among the tumor stages, T2 and N0 were the most common, and 308 patients had DM (M1). In the diagnostic cohort (Table 1), all patients with LCNEC were randomized into a training and a validation group. The Chi-squared test (or Fisher's exact test for some variables) revealed no significant differences in any of the covariates between the two groups, consistent with random allocation. The training and validation groups had mean ages of 64.93 years (range, 18-92 years; interquartile range, 58-72 years) and 65.62 years (range, 45-90 years; interquartile range, 59-73 years), respectively. In the prognostic cohort (Table 2), the grouping was entirely consistent with that of the diagnostic cohort. There were 272 patients with LCNEC with DM, most of whom, as in the diagnostic cohort, were elderly men. The most common tumor stages were T4 and N2. The most common primary tumor site was again in the lung. Regarding treatment, most patients received chemotherapy and radiotherapy, but few underwent surgery.

Diagnostic predictive model of DM in patients with LCNEC
In the diagnostic cohort, the results of logistic regression analysis are shown in Table 3. First, a univariate analysis found five variables that may be associated with DM in LCNEC: tumor size, primary tumor site, T stage, N stage, and sex. However, we excluded the T stage as it could be contradictory to clinical practice. The remaining variables were then incorporated into a multivariable analysis, which ultimately revealed three independent risk factors associated with DM: a primary site closer to the bronchus, larger tumor size, and higher N stage. To assess the risk of DM, these three independent risk variables were combined into a novel diagnostic predictive model, and a nomogram was generated in the training group (Figure 1A). Then, the ROC curves were drawn, with AUCs of 0.761 and 0.773 for the training and validation groups, respectively (Figures 1B, E). DCA in both the training and validation groups (Figures 1C, F) demonstrated the reliability of the nomogram. Moreover, we plotted the GiViTI calibration belts, which showed that the 95% CI did not cross the diagonal bisector at 45 degrees, and the P-values for the training and validation groups were 0.101 and 0.065, respectively (Figures 1D, G), indicating that the nomogram was reliable for predicting DM. (18) In addition, ROC curves were created for each individual risk factor, and the diagnostic nomogram outperformed any single factor in the training and validation groups (Figures 2A, B).
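The two evaluation quantities reported here can be illustrated on toy data. The sketch below computes the AUC as the probability that a patient with DM receives a higher predicted risk than a patient without DM, and the net benefit used in decision curve analysis at a single threshold probability (true positives per patient minus false positives per patient weighted by the odds of the threshold). The function names, labels, scores, and the convention of treating scores at or above the threshold as positive are all assumptions for illustration, not the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC: probability that a random positive scores higher than a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit at one threshold probability (0 < threshold < 1)."""
    n = len(labels)
    # Assumed convention: predicted risk >= threshold is treated as a positive call.
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1 - threshold))


# Toy data: 1 = DM, 0 = no DM; scores are hypothetical predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

auc = roc_auc(labels, scores)
nb_at_half = net_benefit(labels, scores, 0.5)
```

A decision curve repeats the net-benefit calculation over a range of thresholds and compares the model with the treat-all and treat-none strategies.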

Prognostic predictive model of patients with LCNEC with DM
In the prognostic cohort, univariate and multivariable Cox proportional hazards regression analyses were performed to identify factors associated with OS in patients with LCNEC with DM (Table 4). Four variables were selected: sex, N stage, surgery, and chemotherapy status. Specifically, male sex, no surgery, no chemotherapy, and a higher N stage were independent risk factors strongly associated with worse OS. Then, in the training group, we created a prognostic nomogram based on these four risk variables (Figure 3) and validated it in the validation group. First, in patients with LCNEC with DM, the nomogram could be utilized to predict [...] (Figure 6B). Together with a concordance index of 0.723, these results confirmed the validity of the prognostic nomogram and suggested that it was better at predicting long-term survival. Additionally, the ROC curves of the prognostic nomogram were compared with those of all individual risk variables, and the prognostic nomogram outperformed any single factor at 1, 2, and 3 years in the training (Figures 7A-C) and validation groups (Figures 7D-F).

Outcomes of survival analysis
According to the prognostic nomogram, we then utilized the Kaplan-Meier method to evaluate the OS of both the high- and low-risk groups. Median survival time in the high-risk and low-risk groups was 4 and 11 months, respectively, in the training group (Figure 8A), and 4 and 10 months, respectively, in the validation group (Figure 8B). Compared with the low-risk group, the high-risk group had significantly lower OS (training group, P < 0.0001; validation group, P = 0.00057).
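For readers unfamiliar with how a median survival time is read from a Kaplan-Meier curve, the following stdlib-only sketch estimates the curve and returns the first time at which estimated survival falls to 50% or below. The function names and the toy follow-up data are hypothetical, and the log-rank comparison between risk groups is not reproduced here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy data: follow-up in months and event flag (1 = death, 0 = censored).
months = [1, 2, 3, 4, 5, 6]
died = [1, 1, 0, 1, 1, 0]

km_curve = kaplan_meier(months, died)
km_median = median_survival(km_curve)
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from a simple proportion of deaths.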

Discussion
Pulmonary LCNEC shows a high prevalence of lymph node metastases (60-80%) and DM (40%) at the time of diagnosis, and the median survival time of individuals with pulmonary LCNEC who develop DM is about five months. (19,20) Therefore, effective measures are needed to diagnose DM of LCNEC as early as possible so that treatment can be provided in a timely manner. In the present study, to screen for high-risk groups, we developed two nomograms for the diagnostic and prognostic analysis of patients with LCNEC with DM and categorized patients according to the risk score produced by the model. First, the larger the primary tumor and the closer the tumor to the bronchus, the more likely it was to metastasize. Second, the prognosis of patients with LCNEC who had DM was improved by surgery and chemotherapy, but it was worse in male patients than in female patients. Third, regional lymph node metastasis was a significant risk factor, related both to the occurrence of DM and to the prognosis of patients with LCNEC with DM.

Diagnostic cohort
Recently, studies focusing on the clinical characteristics and prognosis of LCNEC have been published. Lowczak et al. showed that LCNEC, as with SCLC, was frequently associated with male sex, heavy smoking, and advanced age (median age of 65 years). [...] individuals with pulmonary LCNEC underwent surgery, chemotherapy, or radiation therapy, aggressive and effective treatment could increase survival time dramatically. (22) This corresponds with our study, which showed that patients with LCNEC were more commonly older men. However, because the SEER database lacks smoking information, we did not evaluate the relationship between smoking and DM. Interestingly, although the majority of patients with LCNEC with DM in the prognostic cohort were elderly (≥60 years old, nearly 70%), age was not a risk factor for DM; this requires further [...]

FIGURE 2
The areas under the receiver operating characteristic curves (AUCs) were compared for the diagnostic nomogram in the training group (A) and validation group (B) with all independent variables, including N stage, primary site, and tumor size.

FIGURE 4
The decision curve analysis (DCA) curves at 1 (A), 2 (B), and 3 years (C) and the calibration curves at 1 (D), 2 (E), and 3 years (F) in the training group were used to evaluate the reliability of the prognostic nomogram.

FIGURE 5
The decision curve analysis (DCA) curves at 1 (A), 2 (B), and 3 years (C) and the calibration curves at 1 (D), 2 (E), and 3 years (F) in the validation group were used to evaluate the reliability of the prognostic nomogram.

FIGURE 6
The time-dependent receiver operating characteristic (ROC) curves at 1, 2, and 3 years in the training group (A) and in the validation group (B) were used to evaluate the validity of the prognostic nomogram.

FIGURE 7
The areas under the receiver operating characteristic curves (AUCs) were compared for the prognostic nomogram in the training and validation groups with all independent variables, including sex, N stage, surgery, and chemotherapy, at 1 (A, D), 2 (B, E), and 3 years (C, F).

FIGURE 8
Survival outcomes in the training group (A) and validation group (B) for the high-risk and low-risk groups (according to the prognostic nomogram formula).

Prognostic cohort
In our study, sex was not associated with the development of DM in patients with LCNEC, but it did affect the prognosis of those with DM: the prognosis of men was worse than that of women. Recent studies found that lifestyle, tobacco use, secondhand smoke exposure, several occupational exposures, treatment type received, duration of anticancer treatment after diagnosis, endogenous circulating levels of sex hormones, and expression and mutation rates of several related genes (including EGFR, KRAS, and P53) differed between men and women, and that sex differences have important implications for lung cancer development, prognosis, and treatment preferences. (27, 28) In addition, one immunohistochemistry marker, the Ki-67 proliferation index (PI), may have an effect on the prognosis of LCNEC, and recent studies have shown that a Ki-67 PI ≥ 55% was strongly associated with poor survival. (29,30) Hermans et al. showed that patients with stage IV LCNEC with a solitary brain metastasis and N0/N1 disease more commonly had a Ki-67 PI ≤ 40%, and these patients had a better prognosis than those with a Ki-67 PI > 40%. (31) However, Walts et al. suggested that a blanket use of 20%, 40%, or any other Ki-67 cut-off to diagnose LCNEC or analyze prognosis was inaccurate. (32) Unfortunately, the lack of Ki-67 data in the SEER database prevented further exploration in the present study, and we hope that large multicenter studies will become available to assess this.
Regarding treatment modalities, although previous studies have explored the treatment of LCNEC, the results were limited and contradictory, and studies of patients with LCNEC with DM were rare. In our analyses, we used multivariate Cox regression analysis and survival analysis to investigate the positive effects of surgery and chemotherapy on patients with LCNEC with DM, but, because of the limited information in the SEER database, we were unable to conduct further analysis. The main findings of previous studies are as follows. First, primary surgical treatment significantly improved survival in patients with LCNEC, even in those with stage IV disease. (10) However, LCNEC had a high postoperative recurrence rate, with more than half of patients relapsing within one year, although an R0 resection margin and N0 status (no lymph node metastasis) improved the time to recurrence. (33) As a result, even for patients with earlier-stage LCNEC, surgery alone was insufficient. (34) Second, chemotherapy alone could be more beneficial than other treatments, even for patients with stage IV disease. [...] (38) Therefore, the relationship between gene expression and treatment regimens requires further study. Third, radiotherapy remains controversial. On the one hand, radiotherapy could prolong the survival of patients with LCNEC, including those with stage IV disease, especially those who have received chemotherapy or have not undergone surgery. (39) On the other hand, radiotherapy may shorten the survival time of individuals undergoing surgery. (40)(41)(42) It is also interesting to note that the metastatic pattern of LCNEC is similar to that of NSCLC, whereas its prognosis is similar to that of SCLC. (43) The brain was the most common metastatic site, so prophylactic cranial irradiation is an effective treatment and might improve survival time. (44) In patients with LCNEC with brain metastases, stereotactic radiosurgery is superior to whole brain radiation treatment. (45) Moreover, Girelli et al. reported that patients with LCNEC with lymph node metastasis had a poor prognosis, and more active multidisciplinary approaches were needed. (46) Overall, surgery combined with chemotherapy may be an appropriate treatment for LCNEC with DM, especially in patients with regional lymph node metastasis.

Advantages and shortcomings
Previous studies on patients with LCNEC with DM were limited, and most of them were single-center studies that lacked validation. The advantages of the present study are that the data came from the SEER database, the sample size was large, and the follow-up period was long. We created entirely new nomograms that visualize the independent risk factors for the occurrence and prognosis of DM in individuals with LCNEC and that could be used for screening high-risk patients and guiding personalized treatment in clinical practice.
Nevertheless, there are some shortcomings in the present study. First, the number of patients with LCNEC with DM was only 272, and the retrospective design may have introduced bias. Second, although our nomograms have been internally validated in both the training and validation groups, external validation with additional data is needed to determine the wider applicability of the models. Third, the SEER database lacks key information that may be relevant to survival, for example, smoking history, performance status, tumor biomarker status, genetic testing results, and specific treatment modality; these data could help further refine our model. In particular, the recent increase in the use of immunotherapy and targeted therapy in lung cancer may offer new hope for patients with LCNEC with DM. Kim et al. showed that the PD-1/PD-L1 pathway was activated in the LCNEC microenvironment and associated with a high mutation burden. (47) Vrontis et al. suggested that treatment and management of patients with advanced LCNEC could be achieved with SCLC approaches, such as platinum-etoposide-atezolizumab chemotherapy, which can improve prognosis. (48) Additional prospective randomized controlled studies are needed.

Conclusions
In the present cohort study, independent risk and prognostic factors for DM in patients with LCNEC were identified using two regression analysis approaches, and the related variables were used to establish new predictive models and to perform further survival analysis. Two novel nomograms were developed, a diagnostic nomogram and a prognostic nomogram, and these could be reliable tools for clinical screening of risk populations and for optimizing treatment.

Ethics statement
This study was exempt from the approval processes of the Institutional Review Boards because the SEER database patient information is de-identified. Therefore, a patient consent form was not applicable.