Frequency and Prognosis of Pulmonary Metastases in Newly Diagnosed Gastric Cancer

Purpose: To analyze the frequency and prognosis of pulmonary metastases in newly diagnosed gastric cancer using population-based data from SEER. Methods: Patients with gastric cancer and pulmonary metastases (GCPM) at the time of diagnosis were identified in the Surveillance, Epidemiology and End Results (SEER) database of the National Cancer Institute from 2010 to 2014. Multivariable logistic regression was performed to identify predictors of the presence of GCPM at diagnosis. Receiver operating characteristic (ROC) analysis was performed for the significant predictors from the multivariable logistic regression and was assessed with Delong's test. Multivariable Cox regression was used to identify factors associated with all-cause mortality and gastric cancer-specific mortality. Survival curves were obtained according to the Kaplan-Meier method and compared using the log-rank test. Results: We identified 1,104 patients with gastric cancer and pulmonary metastases at the time of diagnosis, representing 6.02% of the entire cohort and 15.19% of the subset with metastatic disease to any distant site. Among the entire cohort, multivariable logistic regression identified six factors (younger age, tumor in the upper third of the stomach, intestinal type, T4 staging, N1 staging, and presence of more extrapulmonary metastases to liver, bone, and brain) as positive predictors of the presence of pulmonary metastases at diagnosis. The AUC of the multivariable logistic regression model was 0.775. Median survival among the entire cohort with GCPM was 3.0 months (interquartile range: 1.0–9.0 mo). The multivariable Cox model in the SEER cohort confirmed five factors (diagnosis in an earlier period, black race, adverse pathology grade, absence of chemotherapy, and presence of more extrapulmonary metastases to liver, bone, and brain) as negative predictors of overall survival.
Conclusions: This study provides population-based estimates of the frequency and prognosis of GCPM at the time of diagnosis. The multivariable logistic regression model had acceptable performance in predicting the presence of PM. These findings may inform guidelines for the screening and treatment of PM in GC patients. Patients with high-risk factors should receive closer attention before and after diagnosis.


INTRODUCTION
Gastric cancer (GC) is the fourth most common malignant tumor in the world and the fifteenth in the United States (1,2). Although the reported incidence and mortality rates have steadily decreased over the last decade, there were still an estimated 26,240 new GC patients and 10,800 deaths in the United States in 2018 (2). Furthermore, about 40% of patients present with evidence of distant metastases (3)(4)(5). The most common site of distant metastases is the peritoneum, followed by the liver, lung, and bone (5). Pulmonary metastases (PM) are a rare finding, reported in only 0.5-16% of GC patients with distant metastases in clinical practice (6)(7)(8), but in 22-52% of patients at postmortem examination (9,10). However, the patients in these reports were unselected and included both synchronous and asynchronous metastatic patients. PM is associated with poor survival in patients with advanced gastric cancer. The 5-year survival of gastric cancer with pulmonary metastases (GCPM) is only 2-4% (6,7,11), and the median survival time is 4 months for both newly diagnosed PM and asynchronous PM (6,7,12).
Early detection of pulmonary metastases is necessary because it can alter patient management and save costs and medical resources by reducing unnecessary surgery or other treatments. Chest CT is not recommended as a routine assessment in current gastric cancer screening guidelines. However, multiple studies have shown that CT is superior to plain chest radiography and conventional linear tomography (CLT) in identifying some metastatic nodules (13)(14)(15). In clinical practice, the conventional chest radiograph is usually used for the initial screening examination, which may lead to missed diagnoses. Thus, a population-based study with a large sample is particularly important to determine which patients need further examination.
Only limited data are currently available regarding pulmonary metastases from gastric cancer, and the majority of subjects included in these studies were asynchronous metastatic patients (6-8, 11, 16, 17). Studies of newly diagnosed gastric cancer with pulmonary metastases are lacking, so the proportion, predictive factors, prognostic factors, and optimal treatment strategy for these patients are unknown. Therefore, a population-level study describing the epidemiologic characteristics and prognosis of GCPM is urgently needed.
The purpose of this study was to use data from the Surveillance, Epidemiology and End Results (SEER) database between 2010 and 2014 to assess, at a population-based level, the incidence proportion and predictive factors of pulmonary metastases at the time of cancer diagnosis among patients with gastric cancer. We also aimed to characterize the prognostic factors for survival among patients with pulmonary metastases at diagnosis of gastric cancer.

Study Population
Data were obtained from the SEER database, the largest publicly available cancer dataset, which collects cancer data from 18 population-based cancer registries covering about 28 percent of the United States population (18). This database includes information about cancer incidence as well as demographic and clinical information: age, gender, race, year at diagnosis, tumor staging, tumor size, treatment, marital status, insurance, education, family income, and so on. We used the SEER*Stat software version 8.3.4, published by SEER and available from the official website (https://seer.cancer.gov/), to identify eligible patients. SEER*Stat provided patient information up to 2014 based on the November 2016 submission, and it began to release information on pulmonary metastases in 2010. Thus, information about GCPM was available from SEER*Stat for the period between 1 January 2010 and 31 December 2014. In the SEER database, pulmonary metastases include only the lung, not the pleura or pleural fluid.
Within the SEER database, we identified 36,982 patients with gastric cancer from 2010 to 2014. Patients with other cancers, those younger than 18 or older than 85 years, and those with other pathological types were excluded from the analysis, leaving 18,331 patients in the final cohort for frequency analysis. Of these, 7,268 patients were diagnosed with metastases to any distant site and 1,104 patients were diagnosed with GCPM. We subsequently removed patients with unknown follow-up, leaving 1,098 patients eligible for survival analyses. The proportion of patients with distant metastases to any site was 39.65%, and the proportion with pulmonary metastases was 6.02%. The data extraction flowchart is shown in Figure 1. The inclusion criteria were as follows: age more than 18 and less than 85 years at the time of diagnosis; gastric cancer as the only primary cancer; identified pulmonary metastases; diagnosis confirmed by pathology of a specimen rather than by radiography or laboratory testing; and active follow-up. We excluded patients who met any of the following criteria: age less than 18 or more than 85 years at the time of diagnosis; more than one primary cancer; unknown pulmonary metastasis status; cancer diagnosed by radiography or laboratory testing; pathological type confirmed as gastric neuroendocrine tumor (NET), sarcoma, gastrointestinal stromal tumor (GIST), or lymphoma; or no active follow-up. The cut-off date for this study was 31 December 2014. More details are available in SEER*Stat software version 8.3.4 and the SEER manual 2016. The end point of this study was overall survival (OS). OS was measured from the date of diagnosis to the date of all-cause death or cancer-specific death, and patients alive at the last follow-up were censored. At the last follow-up, there were 925 deaths and 173 censored observations among patients with GCPM.
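The proportions reported above follow directly from the cohort counts. The short check below is a sketch that uses only the counts stated in this section; the variable and function names are ours and are not part of SEER*Stat.

```python
# Cohort counts as reported in the Study Population section.
total_gc = 18_331         # final cohort for frequency analysis
any_metastasis = 7_268    # metastases to any distant site
gcpm = 1_104              # pulmonary metastases at diagnosis
survival_cohort = 1_098   # GCPM patients with known follow-up
deaths, censored = 925, 173


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


pm_in_all = pct(gcpm, total_gc)                # PM among the entire cohort
any_met_in_all = pct(any_metastasis, total_gc) # any distant metastasis
pm_in_metastatic = pct(gcpm, any_metastasis)   # PM among metastatic patients
unknown_follow_up = gcpm - survival_cohort     # removed before survival analysis

print(pm_in_all, any_met_in_all, pm_in_metastatic, unknown_follow_up)
```

The deaths and censored observations also sum to the 1,098 patients in the survival cohort.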

Statistical Analysis
Descriptive statistics were used to calculate the absolute number and frequency of patients with PM at the time of cancer diagnosis. Frequency was defined as the percentage of gastric cancer patients diagnosed with PM among the entire study cohort and among the patients with metastatic disease to any distant site. All data were stratified by year at diagnosis, age, gender, race, origin, primary site, pathology grade, Lauren classification, T staging, N staging, tumor size, treatment, number of extrapulmonary metastatic sites, and other sociodemographic information, such as marital status, residence type, insurance situation, bachelor education, median household income, and smoking status. Residence type, education level, median household income, and smoking status were defined by the county attributes from the US Census 2010-2014 American Community Survey 5-year data files, which are available in the SEER*Stat software.
The Chi-square or Fisher's exact test was used to compare the clinical characteristics of GCPM patients, excluding those with unknown information. Multivariable logistic regression was used to determine predictors of the presence of pulmonary metastases at diagnosis. Only variables that were significant on both the Chi-square test and the univariate logistic regression were entered into the multivariable logistic regression. Because this was a population-based study, we focused more on the entire cohort (GC) than on the subcohort (GC with metastatic disease to any distant site). Survival estimates were obtained according to the Kaplan-Meier method and compared using the log-rank test. Variables that reached significance with P < 0.05 were entered into the multivariable analyses using the Cox regression model to identify covariates associated with increased all-cause mortality. In addition, we used Fine and Gray's competing risk regression to assess gastric cancer-specific mortality (19). Binary-dependent receiver operating characteristic (ROC) analysis was performed to assess how well different variables predicted the presence of PM, and Delong's test was conducted to further assess the performance of the multivariable logistic regression model.
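As an illustration of the Kaplan-Meier method named above, the following minimal sketch computes the product-limit estimate and the median survival time. It is not the SPSS procedure used in this study, and the follow-up times and event indicators are hypothetical toy values, not SEER records.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times: follow-up in months; events: 1 = death, 0 = censored.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # everyone leaving the risk set at time t
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]  # deaths and censored patients leave after t
    return curve


def median_survival(curve):
    """Smallest time at which S(t) <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy data (months, event flag); not taken from SEER.
toy_times = [1, 1, 2, 3, 3, 4, 6, 9, 12, 15]
toy_events = [1, 1, 1, 1, 0, 1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
print(toy_curve, toy_median)
```

Censored patients contribute to the risk set up to their last follow-up but never produce a step in the curve, which is why the estimate differs from a simple proportion of survivors.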
All statistical analyses were performed using SPSS statistical software (version 18.0). The competing risks analysis was performed using the cmprsk package (version 2.2-7), and the ROC analysis was performed using the pROC package (version 3.2-5) in R (version 3.4.4; R Foundation). Delong's test was performed using MedCalc software. Statistical significance was set at a two-sided P < 0.05.

Frequency Analysis
A total of 18,331 patients in the U.S. were diagnosed with gastric cancer between 2010 and 2014, including 1,104 patients diagnosed with GCPM, whose median age was 66 years and who comprised 773 men (70.02%) and 331 women (29.98%). Their demographic and clinical characteristics are shown in Table 1. On the Chi-square or Fisher's test, significant differences were found in age, gender, race, primary site, Lauren classification, T staging, N staging, tumor size, number of extrapulmonary metastatic sites, radiotherapy, surgery, insurance situation, and median household income. The rate of chemotherapy showed no significant difference between the PM group and the no-PM group. More detailed information can be found in Table S1.
On univariable logistic regression (Table S2) among the entire cohort, nine factors were significant (P < 0.05): age, gender, primary site, Lauren classification, T staging, N staging, tumor size, number of extrapulmonary metastatic sites to liver, bone, and brain, and insurance situation. These factors were entered into the multivariable logistic regression. Among the entire cohort, age, primary site, Lauren classification, T staging, N staging, and number of extrapulmonary metastatic sites to liver, bone, and brain were significant. Among the subset with metastatic disease to any distant site, primary site, Lauren classification, N staging, tumor size, and number of extrapulmonary metastatic sites to liver, bone, and brain were significant.
The results of the multivariable logistic regression are shown in Table 2. To further assess the performance of the multivariable logistic regression model, binary-dependent ROC analysis was performed for the model and for the individual variables. The model combined the six variables that were significant on multivariable logistic regression (age at diagnosis, Lauren classification, primary site, T staging, N staging, and extent of extrapulmonary metastatic disease). Delong's test was used to verify the performance. The AUC of the model was 0.775 (Table S4). The ROC curves for the entire cohort and the subcohort are shown in Figures S1 and S2.
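To make the AUC concrete, the sketch below computes it as the probability that a randomly chosen patient with PM receives a higher predicted score than a randomly chosen patient without PM, with ties counted as one half. This is the rank-based definition of AUC, not the pROC implementation used in this study, and it does not perform Delong's test. The scores are hypothetical and the function name is ours.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC: P(score_pos > score_neg) + 0.5 * P(tie)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities of PM; not model output from this study.
pm_scores = [0.30, 0.22, 0.15, 0.08]
no_pm_scores = [0.20, 0.10, 0.08, 0.05, 0.03]
toy_auc = auc(pm_scores, no_pm_scores)
print(toy_auc)
```

An AUC of 0.5 corresponds to a predictor no better than chance, and 1.0 to perfect separation of the two groups.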

Survival Analysis
Among the subset with pulmonary metastases, five factors were significantly associated with overall survival in the multivariable Cox regression model. Table S3 shows the univariate analysis for all-cause mortality and gastric cancer-specific mortality among patients with GCPM. On multivariable Cox regression (Table 3), a later year of diagnosis (vs. 2010; HR, 0.74; 95% CI, 0.59-0.92; P = 0.01) was significantly associated with decreased all-cause mortality, and absence of surgery (vs. surgery; HR, 1.62; 95% CI, 1.13-2.33; P = 0.01) was significantly associated with increased gastric cancer-specific mortality only. Gastric cancer-specific mortality among patients with GCPM at diagnosis is also presented in Table 3. Survival estimates overall (Figure 2A) and stratified by year at diagnosis (Figure 2B), race (Figure 2C), pathology grade (Figure 2D), extent of extrapulmonary metastatic disease (Figure 2E), and chemotherapy (Figure 2F) are displayed in Figure 2.
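The two hazard ratios quoted above can be checked for internal consistency. The sketch below assumes that each 95% CI is a Wald interval on the log scale, exp(log(HR) ± 1.96 × SE); this is our assumption about how the intervals were computed, not something stated in the source. Under that assumption it recovers the standard error, the z statistic, and a two-sided P value from the reported HR and CI.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover log-scale SE, z, and two-sided P from an HR and its 95% CI.

    Assumes a Wald interval: exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return se, z, p


# Values reported in the text: year of diagnosis (all-cause mortality)
# and absence of surgery (gastric cancer-specific mortality).
se_year, z_year, p_year = wald_from_ci(0.74, 0.59, 0.92)
se_surg, z_surg, p_surg = wald_from_ci(1.62, 1.13, 2.33)
print(round(p_year, 2), round(p_surg, 2))
```

Under the stated assumption, both recovered P values round to 0.01, in line with the values reported in the text.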

DISCUSSION
This study analyzed the frequency and survival of gastric cancer patients with pulmonary metastases at their initial diagnosis using data from the SEER database. We also characterized the predictive and prognostic factors in an attempt to better understand the clinical impact of pulmonary metastases. To the best of our knowledge, this is the largest study to date, including 1,104 patients with GCPM. Previously published data evaluated the incidence proportion and prognosis of GCPM only roughly, and the reported frequency of pulmonary metastases from gastric cancer varied, ranging from 0.5 to 16% in clinical practice (6,7). However, the frequency of pulmonary metastases was found to be 22-52% at postmortem examination (9, 10). Most of these studies used small samples from a single institution, which limits their generalizability (6)(7)(8)(9)(10). Therefore, a population-level study describing the frequency and prognosis of patients who present with de novo pulmonary metastases was urgently needed. In this large retrospective study, we found that 6.02% of patients with gastric cancer had pulmonary metastases at diagnosis, and 15.19% of those with any metastases at diagnosis had pulmonary metastases. This result differs somewhat from those of previously published studies (6-10), but agrees with that of a previous study (12) using the SEER database, which reported PM in 5.92% of all patients and in 14.45% of patients with metastatic disease. Some asymptomatic patients with lung metastases may not be identified at initial diagnosis because of a lack of accurate evaluation. In addition, most patients in previous studies developed pulmonary metastases during their disease course after a diagnosis of early-stage gastric cancer, so those studies contained both synchronous and asynchronous metastatic patients. Our work focused only on patients with metastatic gastric cancer at initial diagnosis, so the frequency of PM may be underestimated.
Risk factors for the presence of PM at GC diagnosis were determined using multivariate logistic regression. We found that patients had significantly greater odds of having pulmonary metastases at diagnosis when they had the following six factors: younger age, tumor in the upper third of the stomach, intestinal type, T4 staging, N1 staging, and presence of more extrapulmonary metastases to liver, bone, and brain. We speculate that younger patients tend to have more aggressive tumors, which may explain the more frequent pulmonary metastases. A US study by Smith found that 81% of young patients developed distant metastases, compared with 50% of older patients, over 15 years of follow-up, and concluded that earlier diagnosis and effective treatments were urgently needed to decrease the extreme lethality in these young patients (20). The intestinal type appeared to be associated with pulmonary metastases in this study. Huachuan et al. suggested that this might be attributable to high expression of extracellular matrix metalloproteinase inducer (EMMPRIN), which promotes tumor growth and metastasis (21). The significantly higher percentage of pulmonary metastases among primary tumors located in the upper third of the stomach could be attributed to the "seed-and-soil" hypothesis, which implies organ-specific tropism of circulating tumor cells (22). Patients with T4 staging and N1 staging were also more likely to be diagnosed with pulmonary metastases. We speculate that this finding was seen only in N1 staging because of the small number of patients with N2 staging (N = 37) and N3 staging (N = 45). In addition, most N staging in this study was based on clinical staging, which may not be accurate enough (23)(24)(25). Moreover, only T4 staging had a higher proportion of lung metastases compared with T1 staging, and we believe the same reasons apply to T staging. TNM staging is known to be clearly associated with survival in GC. Thus, we inferred that later T staging and N staging may be associated with poor prognosis in GCPM. However, these results should be carefully confirmed in further studies. In addition, patients presenting with more extrapulmonary metastatic sites had a higher proportion of lung metastases. A similar result has been reported in breast cancer (26). At a minimum, our study indicates that GC patients with the high-risk factors above need further examination at first diagnosis, such as chest CT or PET-CT. However, it remains unclear whether early detection contributes significantly to more favorable survival.
The multivariate logistic regression model including six significant variables had the best predictive value, with an AUC of 0.775. The AUC of single predictors ranged from 0.529 to 0.745: the extent of extrapulmonary metastases had the highest AUC (0.745), and age had the lowest (0.529). Predictors with an AUC below 0.6 warrant further evaluation. Nevertheless, the model containing six significant variables had acceptable performance in predicting the presence of PM in our study, which has not been reported previously.
Prognostic factors of PM at GC diagnosis were analyzed using the multivariate Cox model. We found that patients had a significantly higher risk of mortality when they had the following five factors: diagnosis in an earlier period, black race, adverse pathology grade, absence of chemotherapy, and presence of more extrapulmonary metastases to liver, bone, and brain. The prognosis was better for patients diagnosed in a later period, which may be because these patients received more effective treatment as medical care improved in recent years (2). It is worth noting that black patients had worse overall median survival, which may be related to genetic and economic factors that have not been well explained in the previous literature. GCPM patients with adverse pathological grade and more metastatic sites had significantly poorer survival in this study. To the best of our knowledge, this result has not been well reported in published studies. The median OS was 3.0 months from initial diagnosis of GCPM in SEER, which was similar to the previous study (12). Chemotherapy is currently considered the basic treatment for advanced gastric cancer. The median OS of patients with and without chemotherapy was 6 and 1 months, respectively, in this study. There was a significant increase in the hazard ratio for all-cause mortality (2.87-3.84; P < 0.001) and gastric cancer-specific mortality (2.16-2.91; P < 0.001) for absence of chemotherapy vs. presence of chemotherapy. However, the role of surgery in GCPM has not yet been clearly established. Only a few studies and case reports (8,10,11,16,17) proposed that radical surgery may improve quality of life and survival in highly selected cases with isolated pulmonary metastases, while others disagreed (27,28). Our study found that surgery showed a significant benefit in the gastric cancer-specific mortality analysis only. The hazard ratio for absence of surgery vs. surgery (HR, 1.62; 95% CI, 1.13-2.33; P = 0.01) was significantly increased in the competing risk model, but was not significant in the all-cause mortality analysis. Furthermore, the median OS did not increase significantly from the no-surgery group (3 months) to the surgery group (4 months), for which there may be four reasons. Firstly, most patients in published studies were found to have pulmonary metastases after a diagnosis of early-stage gastric cancer and received metastasectomy later (6)(7)(8)(9)(10). Secondly, the patients in published studies were highly selected and had excellent surgical conditions. Thirdly, samples in previous reports were very small, with 12 patients as the largest sample (8). Finally, only 51 GCPM patients underwent surgical resection in this study; among them, 44 patients received gastrectomy and only 7 patients received radical surgery, with a median survival of 6.0 months (IQR: 1-27 mo). This finding needs further investigation with more patients and more convincing research methods. A prospective randomized controlled trial (RCT) is not easy to conduct in patients with GCPM because of their complex characteristics, so progress may be slow and difficult. In addition, radiotherapy was not significantly associated with overall survival in the multivariate Cox model in this study. In summary, chemotherapy may be the basic treatment for GCPM at present, while surgery may be considered, with caution, for highly selected patients. We do not recommend routine surgery or radiotherapy at present.
Although our study was population-based and included a large number of cases, its limitations should not be ignored.
Firstly, this was a retrospective study. We could identify patients with metastatic disease to bone, liver, lung, and brain, but the SEER database did not provide information about other metastatic sites, such as peritoneal metastases. Moreover, we only had information on synchronous metastasis to the lung; these patients are likely a minority relative to those who may develop asynchronous metastases. Secondly, information on comorbidities and performance status was not available in the SEER database. Thirdly, residence type, education level, and median household income were defined at the county level, not the patient level, which may have affected the results of the logistic and Cox regressions. Fourthly, more detailed information about radiotherapy, surgery, and chemotherapy was not reported in the SEER database. Finally, SEER did not record the details of pulmonary metastases.
To the best of our knowledge, this study is the first population-based analysis of patients with pulmonary metastases at initial diagnosis of gastric cancer. It provides a basis for clinicians to design studies that evaluate the utility of screening among patients at higher risk of pulmonary metastases. The prognostic factors for GCPM were also analyzed in this study. In addition, we described the significance of different treatments for GCPM, which might be of some help in clinical practice.

CONCLUSIONS
In summary, this population-level study provides estimates of the frequency of GCPM at the time of diagnosis. Patients presenting with younger age, tumor in the upper third of the stomach, intestinal type, T4 staging, N1 staging, and more extrapulmonary metastases to liver, bone, and brain had significantly greater odds of having pulmonary metastases at diagnosis. These risk factors for PM in GC patients may indicate a need for routine screening in such patients. Furthermore, a list of prognostic factors for GCPM patients was identified from the survival estimates. GCPM patients with black race, diagnosis in an earlier period, adverse pathology grade, more extrapulmonary metastases to liver, bone, and brain, and absence of chemotherapy had a significantly higher risk of mortality. These findings point to the need for individualized treatment for these patients. Chemotherapy may be the basic treatment for GCPM at present, while surgery may be considered, with caution, for highly selected patients. We do not recommend routine surgery or radiotherapy at present.

DATA AVAILABILITY
Publicly available datasets were analyzed in this study. This data can be found here: https://seer.cancer.gov/data/.

ETHICS STATEMENT
SEER data are public-use data, so informed consent was waived. Our study was deemed exempt from institutional review board approval by NanFang Hospital, Southern Medical University.