A nomogram model to individually predict prognosis for esophageal cancer with synchronous pulmonary metastasis

Background Esophageal cancer (EC) is a life-threatening disease worldwide. The prognosis of EC patients with synchronous pulmonary metastasis (PM) is unfavorable, but few tools are available to predict the clinical outcomes of these patients. This study aimed to construct a nomogram model for the prognosis of EC patients with synchronous PM. Methods From the Surveillance, Epidemiology, and End Results (SEER) database, we selected 431 EC patients diagnosed with synchronous PM. These cases were randomized into a training cohort (303 patients) and a validation cohort (128 patients). Univariate and multivariate Cox regression analyses, along with the Kaplan-Meier method, were used to estimate prognosis and cancer-specific survival (CSS) in the two cohorts. Prognostic factors identified in the training cohort were used to develop a nomogram model, which was verified in both cohorts by plotting receiver operating characteristic (ROC) curves and calibration curves. A risk classification assessment was performed to compare the CSS of different risk groups using the Kaplan-Meier method. Results The nomogram model contained four risk factors: T stage, bone metastasis, liver metastasis, and chemotherapy. In the training cohort, the 6-, 12-, and 18-month CSS rates were 55.1%, 26.7%, and 5.9%, and the corresponding areas under the ROC curve (AUC) were 0.818, 0.781, and 0.762. The AUC values in the validation cohort were 0.731, 0.764, and 0.746. The calibration curves showed excellent agreement in both the training and validation cohorts. CSS differed substantially between the high-risk and low-risk groups (P<0.01). Conclusion The nomogram model serves as a predictive tool for EC patients with synchronous PM and could be used to estimate individualized CSS and guide therapeutic decisions.


Introduction
Esophageal cancer (EC) is a commonly diagnosed malignant tumor, ranking seventh and sixth respectively in terms of incidence and mortality (1). Indeed, over half of the EC patients are diagnosed with metastatic or unresectable disease at their first visit (2,3). Distant lymph nodes, lung, and liver are the most common sites for EC metastases (4,5). A recent study reported that 50% of EC patients could develop pulmonary metastases (PM) (6). Therefore, a prognosis evaluation is required for therapy and follow-up.
The prognosis of EC patients with synchronous PM is notoriously unfavorable, but few reports describe the cancer-specific survival (CSS) of these patients. Although some prior clinical studies reported that surgical resection and stereotactic body radiotherapy for PM from EC could be an option for personalized treatment (7)(8)(9), there is still little evidence to support a standard palliative regimen for EC patients with synchronous PM (10). As a result, most current therapies are based on clinical experience and the literature, and no accessible method is available to predict the prognosis of these patients.
The nomogram, designed for an individual patient to predict mortality risks in a variety of diseases, is a widely used graphical prediction tool (11). It contains both pathological factors and clinical risk factors, such as metastatic sites, T stage, and treatments (12). This tool could provide clinicians with the survivability of patients and therefore assist them in developing better treatment strategies, such as clinical trials and hospice care. Hence, this study, through the clinical and pathological information from the SEER database, sought to establish a nomogram model for a personalized assessment of EC patients with synchronous PM.

Data extraction
Patient data were extracted from the Surveillance, Epidemiology, and End Results (SEER) database, a population-based cancer registry system that summarizes data from fourteen states throughout the United States and covers almost 35% of the American population. Patients were enrolled if they were (i) diagnosed with EC between 2010 and 2015, (ii) confirmed to have pulmonary metastasis at initial diagnosis, and (iii) aged 18-100 years. Patients with (i) multiple primary cancers or (ii) missing or incomplete data (such as metastatic sites, T stage, N stage, grade, primary tumor size, radiation, surgery, and race) were excluded. EC patients without other distant organ metastases, such as lung, liver, and brain metastases, were also excluded from this study. The primary tumor was confirmed histopathologically. The primary site was mainly determined by surgical resection. The pathological grade and type of esophageal cancer were obtained by further analysis of pathological specimens. Metastasis of the primary tumor was determined by pathological or imaging diagnosis. All clinical, pathological, and demographic data were analysed retrospectively. Informed consent was not required because the SEER data are anonymized.

Study population and follow-up
CSS was defined as the time elapsed between diagnosis and EC-related death or the end of follow-up. The following variables were collected to evaluate their prognostic influence on CSS: demographic factors (age, race, and sex), tumor characteristics (primary site, grade, tumor size, pathological type, AJCC T stage, AJCC N stage, and primary tumor resection), extrapulmonary metastasis (bone, brain, and liver metastasis), and treatments (radiotherapy and chemotherapy). For each of these factors, CSS was estimated using the Kaplan-Meier method, and subgroups were compared with log-rank tests.
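As an illustration of the Kaplan-Meier (product-limit) estimate referred to above, the following is a minimal Python sketch. It is not the authors' code (the study reports using R and SPSS), and the function name `kaplan_meier` and the toy follow-up data are assumptions made purely for demonstration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate).

    times  -- follow-up time in months for each patient
    events -- 1 if the patient died of EC at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)      # deaths plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohort, not SEER data.
toy_times = [2, 3, 3, 5, 8, 12]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
```

Patients censored at a given time are still counted as at risk at that time, which is the usual convention; the log-rank comparison between subgroups is not shown here.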

Statistical analysis
To validate the reliability of the nomogram model, we randomly divided all cases into training and validation cohorts in a 7:3 ratio. The training cohort was used to construct the nomogram model to predict the CSS of patients. The nomogram model was then validated with data from both the training and validation cohorts, and the risk classification assessment was performed in each cohort separately.
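A random 7:3 partition of this kind can be sketched as follows. This is a simplified Python illustration, not the R procedure used in the study; the helper `split_cohort`, the seed, and the integer patient identifiers are hypothetical.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2024):
    """Shuffle patient IDs reproducibly and cut them into two cohorts."""
    rng = random.Random(seed)          # fixed seed (arbitrary) for repeatability
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 431 hypothetical patient identifiers, matching the cohort size in the text.
train_ids, valid_ids = split_cohort(range(431))
```

With 431 patients, this plain shuffle yields 302 and 129 patients rather than the 303 and 128 reported in the study, so the authors' partitioning procedure evidently rounds or allocates differently from this sketch.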
To minimize selection bias, potential prognostic factors with P<0.01 in the Kaplan-Meier analysis were carried forward for further study. These factors (excluding radiotherapy) were then subjected to multivariate analysis using the Cox regression model. Based on the factors retained after this selection, the nomogram was constructed to evaluate CSS in the training cohort. Receiver operating characteristic (ROC) curves were used to evaluate the predictive performance of the nomogram for 6-, 12-, and 18-month CSS. Based on the Cox model, calibration curves were drawn to evaluate the reliability of the nomogram. For calibration, the patients were divided into 3 subgroups of equal size, and bootstrap-corrected CSS rates, based on 1000 bootstrap samples, were calculated by averaging the Kaplan-Meier estimates. Additionally, the risk classification assessment was performed using the survival package in R for risk scoring. We calculated a comprehensive risk score for each patient based on the individual factors from the multivariate Cox regression analysis. According to the median value of the score, the patients were divided into a high-risk group and a low-risk group, and the CSS of the two groups was analyzed.
For independent model validation, the total points for each patient in the validation cohort were calculated from the established nomogram. Then, with the total points as a single factor, Cox regression was performed in the validation cohort, and the ROC and calibration curves were constructed from this regression. All analyses were carried out using R (version 3.6.3) and SPSS (version 26.0). A P value of less than 0.05 was considered statistically significant.
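To make the ROC-based evaluation at a fixed time point more concrete, the sketch below computes a rank-based AUC at a chosen horizon. It is a deliberately simplified Python illustration: the study does not state which time-dependent ROC estimator was used, and here patients censored before the horizon are simply skipped, which is an assumption of this sketch. The function name and toy data are hypothetical.

```python
def auc_at_horizon(scores, times, events, horizon):
    """Rank-based AUC for death by `horizon` months, given a risk score.

    Cases    : patients who died of EC at or before the horizon.
    Controls : patients still under follow-up after the horizon.
    Patients censored before the horizon have unknown status and are skipped
    (a simplification; dedicated estimators weight them instead).
    """
    cases, controls = [], []
    for s, t, e in zip(scores, times, events):
        if t <= horizon and e == 1:
            cases.append(s)
        elif t > horizon:
            controls.append(s)
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0       # higher score for the patient who died
            elif c == k:
                wins += 0.5       # ties count as half
    return wins / (len(cases) * len(controls))


# Hypothetical total points and follow-up (months); not SEER data.
toy_scores = [112.5, 30.0, 50.0, 50.0, 20.0, 0.0]
toy_times = [2, 4, 5, 9, 14, 20]
toy_events = [1, 1, 0, 1, 1, 0]
toy_auc = auc_at_horizon(toy_scores, toy_times, toy_events, horizon=6)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect ranking of patients who died before the horizon above those who survived past it.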

Patient characteristics
Through preliminary data extraction, 1692 patients with a single primary EC and synchronous pulmonary metastases were found in the SEER database. After further screening, 431 Stage IV EC patients with synchronous PM were finally included in the study cohort (Figure 1). The patients were then randomly partitioned into a training cohort (70%) and a validation cohort (30%) using the R package "caret"; the clinicopathological features of both cohorts are listed in Table 1. The training cohort included 303 patients, and the validation cohort comprised 128 patients. A Chi-square test on each clinicopathological factor showed no statistically significant difference between the two cohorts except for race and lymph node stage, suggesting that the cohorts were broadly comparable. As shown in Table 1, the most common characteristics in the training cohort were age≥65 (50.2%), white race (81.5%), male sex (83.5%), an abdominal or lower primary site (58.7%), high tumor grade (58.1%), tumor size≤0.5<1 cm (51.2%), adenocarcinoma (55.4%), T1 stage (32.7%) or T4 stage (36.3%), and N1 stage (60.4%). Some EC patients with synchronous PM had concurrent bone metastases (23.4%), brain metastases (6.9%), and liver metastases (39.6%). As a result, only a few patients received surgery for the primary tumor (3.5%). Radiotherapy was given to 45.9% of patients and chemotherapy to 62.0%.

Identification of predictive factors by univariate and multivariate analyses
The Cox proportional-hazards model was used to analyze the association of each variable with CSS in the training cohort. Univariate analyses showed that primary site, tumor grade, T stage, N stage, bone metastasis, brain metastasis, liver metastasis, radiation, and chemotherapy were related to the prognosis of patients. Among these factors, T stage (C-index=0.587) and chemotherapy (C-index=0.687) showed greater discriminative ability than the other factors (Table 2A) and may therefore be significant predictors. To limit confounding, we selected four factors with a P-value<0.01 in univariate analyses for further multivariate analysis. Consequently, T stage, bone metastasis, liver metastasis, and chemotherapy were included in the predictive model and considered independent predictors of CSS (Table 2B).
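The C-index quoted for individual factors measures how often the patient with the higher predicted risk is also the one who dies earlier. A minimal Python sketch of Harrell's concordance index is given below; it is an illustration rather than the authors' R computation, and the function name and toy values are assumptions.

```python
def concordance_index(risk, times, events):
    """Harrell's C: share of comparable pairs ranked correctly by `risk`.

    A pair (i, j) is comparable when patient i had the event and did so
    strictly before patient j's last follow-up time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue                      # censored patients cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0     # earlier death had the higher risk
                elif risk[i] == risk[j]:
                    concordant += 0.5     # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical risk scores and follow-up (months); not SEER data.
toy_risk = [3.0, 1.0, 2.5, 2.0]
toy_times = [2, 5, 4, 10]
toy_events = [1, 1, 0, 1]
toy_c = concordance_index(toy_risk, toy_times, toy_events)
```

A value of 0.5 indicates ranking no better than chance, which is why single-factor values such as 0.587 represent modest discrimination on their own.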

CSS analysis
At the time of the analysis, a total of 267 (88.12%) patients in the training cohort had died of esophageal cancer, with a median CSS of about 4 months. The 6-, 12-, and 18-month CSS rates were 55.1%, 26.7%, and 5.9% in this cohort. Kaplan-Meier analysis was performed for each potential prognostic variable using the "survival" package in R.

Establishment and verification of the nomogram model
The predictive model was visualized as a nomogram (Figure 3) and verified in the training and validation cohorts. In the training cohort, the C-index of the nomogram was 0.747, which indicates good discriminative ability. Likewise, the ROC curves for the nomogram to predict the 6-, 12-, and 18-month CSS rates were presented in Figures
FIGURE 1 Analytical cohort and exclusion criteria for esophageal cancer patients with synchronous pulmonary metastasis.

Risk classification assessment
To further evaluate the model, we applied a risk classification assessment to patients with different CSS. Using each patient's total risk score, calculated from the final prognostic factors, this system split the patients into a high-risk group (risk score>median) and a low-risk group (risk score<median). Kaplan-Meier analysis was then performed in both the training cohort (Figure 6A) and the validation cohort (Figure 6B), which indicated that CSS differed between the high- and low-risk groups.
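The median split described above can be sketched in Python as follows. The coefficients are hypothetical placeholders (the fitted values belong to Table 2B and are not reproduced here), T stage is reduced to a single indicator for brevity, and assigning scores equal to the median to the low-risk group is an assumption, since the text does not say how ties were handled.

```python
import statistics

# Hypothetical log-hazard-ratio coefficients, for illustration only.
HYPOTHETICAL_COEFS = {
    "t4": 0.5,
    "bone_met": 0.3,
    "liver_met": 0.45,
    "no_chemo": 1.0,
}


def risk_score(patient):
    """Linear predictor: sum of coefficient x indicator (0 or 1)."""
    return sum(coef * patient.get(name, 0)
               for name, coef in HYPOTHETICAL_COEFS.items())


def classify(patients):
    """Label each patient 'high' (score > median) or 'low' (otherwise)."""
    scores = [risk_score(p) for p in patients]
    cutoff = statistics.median(scores)
    # Assumption: a score exactly at the median falls in the low-risk group.
    return ["high" if s > cutoff else "low" for s in scores]


# Four hypothetical patients described by 0/1 indicators.
toy_patients = [
    {"bone_met": 1, "liver_met": 1, "no_chemo": 1},
    {"liver_met": 1},
    {"t4": 1, "no_chemo": 1},
    {},
]
toy_groups = classify(toy_patients)
```

The two resulting groups would then each be passed to a Kaplan-Meier estimate and compared with a log-rank test, as in Figure 6.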

Discussion
Attention to EC patients with synchronous PM has increased in recent years, although these patients account for only a few percent of EC cases (13). Due to the limited response to local radiotherapy and chemotherapy, the prognosis of EC patients with synchronous PM is extremely unfavorable. Thus, an effective prognosis predictor is of utmost importance for the optimal management of these patients. However, the prognosis for this patient group cannot yet be properly determined by any assessment technique. Hence, the nomogram, an intuitive statistical forecasting tool, is used here to evaluate the prognosis and CSS rate of advanced EC patients and to present the results visually.
FIGURE 2 Kaplan-Meier CSS curves for several potential variables with P<0.05. (A-G) Kaplan-Meier curves of CSS based on grade, T stage, bone metastasis, brain metastasis, liver metastasis, radiation, and chemotherapy, respectively. The P-values are from a log-rank test for the comparison of the Kaplan-Meier curves.
FIGURE 3 Nomogram for predicting the 6-, 12-, and 18-month CSS of EC patients with synchronous PM. The nomogram is used by totaling the points identified on the top scale for each of the four independent variables (T stage, bone metastasis, liver metastasis, and chemotherapy). The summed score is then located on the total point scale to read off the 6-, 12-, and 18-month CSS.
Our study selected 431 cases of EC with synchronous PM from the SEER database. Based on the results of multivariate Cox regression analyses, four variables (T stage, liver metastasis, bone metastasis, and chemotherapy) were identified as independent prognostic factors. These four factors were then used to construct a nomogram that could guide subsequent treatment according to predictions of CSS. Additionally, the nomogram model showed excellent consistency in predicting the 6-, 12-, and 18-month CSS of EC patients with synchronous PM, as verified by the ROC and calibration curves. Moreover, the risk classification assessment showed that the high-risk group had worse CSS than the low-risk group.
FIGURE 4 Verification of the nomogram in the training cohort. The ROC curve (A-C) and calibration curve (D-F) of the nomogram for the 6-, 12-, and 18-month CSS.
FIGURE 5 Verification of the nomogram in the validation cohort. The ROC curve (A-C) and calibration curve (D-F) of the nomogram for the 6-, 12-, and 18-month CSS.
In this study, EC patients with synchronous PM had a poor overall prognosis, consistent with a previous report (14). EC patients with bone metastases had a worse CSS than those without, and the same was true of EC patients with liver metastasis. In particular, liver metastasis carried more points in the nomogram than bone metastasis. Multiple prior studies have shown that chemotherapy improves the prognosis of patients with advanced EC (15)(16)(17). Patients who did not receive chemotherapy had the highest score in the nomogram model, spanning the whole axis, which implies that chemotherapy could increase patients' probability of survival.
According to the AJCC Cancer Staging Manual, 7th edition, T1 staging denotes a limited invasion of esophageal cancer that is confined to the mucosa and submucosa. Such limited-stage esophageal cancer is easily overlooked by patients. Tumors that expand more slowly locally tend to have a longer time to progression and thus have more opportunity to reach blood vessels. Furthermore, the submucosa is rich in blood vessels, and distant metastasis of esophageal cancer occurs by hematogenous spread. As a result, among patients whose esophageal cancer is metastatic at primary diagnosis, T1 stage may be more common than other stages. Additionally, lymphatic metastasis and hematogenous metastasis are two distinct modes of metastasis in esophageal cancer. In reviewing the relevant literature, we found that there seems to be a subtle association between lymph node metastasis and hematogenous metastasis. For example, older melanoma patients have lower rates of sentinel lymph node metastases yet paradoxically have inferior survival. In vivo, reconstitution of HAPLN1 in aged mice increased the number of LN metastases but reduced visceral metastases (18). Similarly, androgen receptor increases hematogenous metastasis yet decreases lymphatic metastasis of renal cell carcinoma (19). In addition, patients with larger tumors were more likely to have lymph node metastases (20). Since many of the patients in this study were T1 stage and had tiny initial tumors, there were few lymph node metastases. Based on these studies, we suggest that esophageal cancer with distant metastasis at initial diagnosis may tend to present with fewer lymph node metastases (N0 and N1).
Notably, advanced EC patients with stage T1 have a noticeably worse CSS than patients with stage T2 or T3, and stage T4 has the worst prognosis because the cancer invades the peri-esophageal tissues. A probable explanation is that the symptoms of stage T1 EC are not obvious, so the disease has already entered a more serious stage by the time pulmonary metastases occur. Therefore, in metastatic EC, patients with stage T1 have a significantly worse CSS than those with stage T2 or T3. For instance, in our study a patient with bone metastasis (20 points), liver metastasis (30 points), T1 stage (62.5 points), and chemotherapy (0 points) has a total of 112.5 points and an estimated 6-, 12-, and 18-month CSS of 40%, 10%, and 0%. With a total of fifty points, the same patient with stage T2 has an estimated 6-, 12-, and 18-month CSS of around 75%, 50%, and 30%. In short, the patient with stage T1 has a remarkably poor outcome, which could result in an increased focus on early cancer metastases. Due to the invasion of para-esophageal tissues, stage T4 EC is associated with the poorest CSS of all stages.
FIGURE 6 Kaplan-Meier curves of CSS for patients in the low- and high-risk groups. The patients were separated into risk subgroups according to the Cox regression model.
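The point totals in the worked example above can be reproduced with a short Python sketch. Only the point values stated in the text are encoded; the zero points for stage T2 are inferred from the fifty-point total, and the dictionary layout and function name are hypothetical. Levels whose points are not given in the text (for example T3, T4, or no chemotherapy) are deliberately left out, so looking them up raises a `KeyError` rather than returning a guessed value.

```python
# Points per factor level, as stated in the worked example.
NOMOGRAM_POINTS = {
    "t_stage": {"T1": 62.5, "T2": 0.0},   # T2 = 0 is implied by the 50-point total
    "bone_metastasis": {True: 20.0},
    "liver_metastasis": {True: 30.0},
    "chemotherapy": {True: 0.0},          # chemotherapy received
}


def total_points(patient):
    """Sum the nomogram points over the four factors for one patient."""
    return sum(NOMOGRAM_POINTS[factor][patient[factor]]
               for factor in NOMOGRAM_POINTS)


example_t1 = {
    "t_stage": "T1",
    "bone_metastasis": True,
    "liver_metastasis": True,
    "chemotherapy": True,
}
example_t2 = dict(example_t1, t_stage="T2")

points_t1 = total_points(example_t1)   # 62.5 + 20 + 30 + 0
points_t2 = total_points(example_t2)   # 0 + 20 + 30 + 0
```

Mapping a total to a survival probability still requires reading the total-point scale of the nomogram itself, which this sketch does not attempt.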
In addition, univariate Cox analysis indicated that tumor grade (P<0.05) was associated with the CSS of patients. Although tumor grade has been reported as an independent prognostic factor in metastatic EC (21,22), the grade classifications in those studies were either imprecise or based on small samples. Furthermore, the site from which pathological tissue is sampled in advanced EC patients may lead to differences in tumor grade classification because of intra-tumor heterogeneity (23, 24). Given that the completeness and applicability of the pathological samples could not be assessed, we did not include tumor grade in our predictive model. The univariate analysis also showed that radiotherapy had a significant effect on patient survival. Palliative radiotherapy can relieve the local symptoms of advanced EC patients, but it has no statistically significant effect on overall survival (25). Interestingly, a study of radiotherapy for overall survival and CSS in metastatic esophageal cancer suggested that patients with esophageal squamous cell carcinoma could obtain a survival benefit from radiotherapy (P<0.01), whereas the opposite was found for esophageal adenocarcinoma (26). Adenocarcinoma accounted for more than fifty percent of the cases in our study. Since the frequency, dosage, duration, and other details of radiotherapy cannot be obtained from the SEER database, we excluded this factor from the model. Similarly, brain metastasis (P<0.05) appeared to affect patients' prognosis in the univariate Cox regression analysis. However, fewer than 7% of EC patients in our training cohort were diagnosed with brain metastasis, which is a major reason why we did not include this factor in our final model.
The findings of this study showed that chemotherapy had a remarkable influence on the CSS of EC patients with synchronous PM. However, the SEER database contains no comprehensive information on chemotherapy. In addition, the lack of population data from different countries may hinder the widespread application of this predictive model.

Conclusions
A nomogram model was established to accurately assess the prognosis and CSS of EC patients with synchronous PM. The model contained three clinical factors and one treatment factor and performed well in both the training and validation cohorts. It could be used to estimate individualized CSS and guide therapeutic decisions. Palliative chemotherapy, as represented in the model, could improve the CSS of EC patients with synchronous PM.

Data availability statement
Publicly available datasets were analyzed in this study. This data can be found here: SEER database.

Author contributions
C-LZ conceived and designed the study with X-YZ and Q-YL. X-YZ and Q-YL collected the data, drafted the manuscript, analyzed the data and formatted the images and the article. C-LZ reviewed the data. All authors read and approved the final manuscript. All authors contributed to the article and approved the submitted version.

Funding
This study was funded by Wenzhou Science and Technology Bureau (grant number Y20180220).