Risk Factors and Genetic Biomarkers of Multiple Primary Cancers in Esophageal Cancer Patients

Esophageal cancer (EC) is a deadly cancer that is frequently accompanied by multiple primary cancers (MPCs). However, the risk biomarkers of MPC in EC have hardly been investigated. We retrospectively enrolled 920 subjects with primary EC and analyzed the possible risk factors as well as MPC-associated single-nucleotide polymorphisms (SNPs) from blood DNA. A total of 184 subjects (20.0%) were confirmed to have MPC, 59 (32.8%) had synchronous MPC, and 128 (69.6%) had head and neck cancer (HNC). Elderly EC patients had an increased risk of having gastrointestinal cancer (odds ratio, OR[95% CI]=6.70 [1.49–30.19], p=0.013) and a reduced risk of developing HNC (OR[95% CI]=0.44 [0.24–0.81], p=0.008). MPC risk was also associated with betel nut chewing (OR[95% CI]=1.63 [1.14–2.32], p=0.008), the A allele of ALDH2:rs671 (p=0.074 and 0.030 for GA and AA, respectively), the CC genotype in CISH:rs2239751 (OR[95% CI]=1.99 [1.2–3.32], p=0.008), and the G allele of ERCC5:rs17655 (p=0.001 and 0.090 for GC and CC, respectively). ADH1B:rs1229984 showed a trend toward association with MPC risk that did not reach statistical significance (p=0.117). Patients carrying four risk SNPs had a 40-fold risk of MPC (OR[95% CI]=40.25 [6.77–239.50], p<0.001) and a 12.57-fold risk of developing a second primary cancer after EC (OR[95% CI]=12.57 [1.14–138.8], p=0.039) compared to those without any risk SNPs. In conclusion, hereditary variations in ALDH2, CISH, ERCC5, and ADH1B have great potential in predicting the incidence of MPC in EC patients. An extensive cancer screening program during clinical follow-up would be beneficial for patients with high MPC susceptibility.


INTRODUCTION
Esophageal cancer (EC) is a deadly disease. Primary EC most often presents either as esophageal squamous cell carcinoma or adenocarcinoma (1,2). Accounting for over 90% of the disease worldwide, esophageal squamous cell carcinoma is the major cell type of primary EC and is highly correlated with environmental factors (3,4). It is also highly correlated with unfavorable habits, such as tobacco smoking, alcohol drinking, and betel nut chewing (3,5). Studies indicate that, compared with individuals without EC, EC patients have a more than 12-fold risk of developing mouth/pharynx cancer and about a 4-fold risk of stomach cancer (6).
Multiple primary cancers (MPCs) are defined as more than one (synchronous or metachronous) primary cancer in the same individual (7). The frequency of multiple primaries among cancer patients is reported to be in the range of 2-17% (7). A cancer patient may develop multiple primary tumors due to several factors, such as genetics, family history, hormonal factors, prior cancer treatment, lifestyle factors, and environmental influences (7). There is a 5-10% risk of MPC with an inherited genetic mutation (8). Systematic biomarker studies for MPCs are lacking, especially in patients with EC. Here, we investigated the association between candidate single-nucleotide polymorphisms (SNPs) and MPC in EC patients. These candidates include SNPs in aldehyde dehydrogenase 2 family member (ALDH2), alcohol dehydrogenase 1B (ADH1B), cytochrome P450 family 1 subfamily A member 1 (CYP1A1), glutathione S-transferase pi 1 (GSTP1), cyclooxygenase 2 (COX2, encoded by the PTGS2 gene), and ERCC excision repair 5 (ERCC5).
Alcohol dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH) are key NAD-dependent enzymes involved in alcohol metabolism (9). ADH1B and ALDH2 are the major enzymes that convert alcohol to acetate in humans (10). The SNPs ADH1B:rs1229984 and ALDH2:rs671 (G>A, Glu487Lys) are reportedly correlated with the risk of alcohol-related cancers, including hypopharyngeal cancer and EC (11)(12)(13)(14)(15)(16). Additionally, COX2:rs20417 and CYP1A1:rs1048943 are known to correlate with the risk of both oral cancer and EC (17)(18)(19). SNPs involved in nucleotide excision repair, such as ERCC5:rs17655, have also been found to be associated with the incidence of laryngeal cancer and EC (20,21). Finally, GSTP1:rs1695 has also been associated with the risk of EC and other cancers, such as breast cancer (22,23).
In the current study, we investigated the risk factors and potential genetic biomarkers for MPC in patients with EC.

Study Population and Data Collection
This study was performed retrospectively and was approved by the ethics committee of the National Taiwan University Hospital (NTUH, 201803015RIND). A total of 920 EC patients with or without MPC were retrospectively enrolled at NTUH during the study period (Jan. 2000 to Dec. 2018). The inclusion criterion was a diagnosis of primary EC. Pregnant patients, pediatric patients, and those unable to give informed consent or blood samples were excluded. Figure 1 shows the study flowchart. In total, 2825 patients with primary EC were admitted to NTUH between 2000 and 2018. Among them, 996 patients donated blood for research, and 932 DNA samples were successfully extracted from the blood samples for genotyping. Twelve patients were excluded owing to insufficient clinical data for analysis. The 920 eligible study subjects included 736 patients without MPC during their follow-up and 184 patients who were diagnosed with MPC.
Data concerning unfavorable habits, including cigarette smoking, alcohol drinking, and betel nut chewing, were collected from each patient during their clinic visit and confirmed by nursing documentation; the information was sometimes also verified by the patient or a close family member via telephone. Any ambiguous or vague information regarding the habits of the patients was considered as "missing data." Any patient who had ever used one of these substances was considered to have a history of the corresponding unfavorable habit. Data concerning smoking, drinking, and betel nut chewing were missing in 56, 54, and 76 subjects, respectively. A total of 844 patients were thus included in the multivariate analysis, which adjusted for age, gender, tumor site, history, and these unfavorable habits (Figure 1).
Basic demographics, unfavorable habits, and time of primary cancer onset were obtained from the Tumor Registry of NTUH and/or medical chart review. EC treatment included surgery, chemotherapy, and radiotherapy. Neoadjuvant concurrent chemoradiation therapy was administered to patients with advanced TNM stages (T3N0 or T1-3N+) (24) diagnosed using endoscopic ultrasound or computed tomography before surgery (25,26).
The Warren and Gates criteria for second primary cancer (SPC) were used to define MPC (27). MPCs were classified as synchronous if the different primary cancers were diagnosed simultaneously or within 6 months of primary EC diagnosis. If the interval between the date of EC diagnosis and that of another cancer was >6 months, the MPC was considered metachronous. Patients who had both synchronous and metachronous cancer were classified as having combined synchronous and metachronous cancer (28).
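The timing rule above can be illustrated with a minimal Python sketch. It is not the study's actual procedure: the function names are hypothetical, "6 months" is approximated as 183 days (an assumption, since the text does not say how months were counted), and the interval is taken as an absolute value on the assumption that another cancer diagnosed before EC is handled the same way as one diagnosed after it.

```python
from datetime import date

# Assumption: "6 months" is approximated as 183 days; the paper does not
# specify how the interval was counted.
SIX_MONTHS_DAYS = 183


def classify_interval(ec_date, other_date):
    """Classify one additional primary cancer relative to the EC diagnosis."""
    gap_days = abs((other_date - ec_date).days)
    return "synchronous" if gap_days <= SIX_MONTHS_DAYS else "metachronous"


def classify_patient(ec_date, other_dates):
    """Classify a patient from the diagnosis dates of all other primaries."""
    kinds = {classify_interval(ec_date, d) for d in other_dates}
    if not kinds:
        return "no MPC"
    if len(kinds) == 2:
        # Both a synchronous and a metachronous cancer were found.
        return "combined"
    return next(iter(kinds))


if __name__ == "__main__":
    ec = date(2010, 1, 1)
    print(classify_patient(ec, [date(2010, 3, 1)]))
    print(classify_patient(ec, [date(2010, 3, 1), date(2012, 6, 1)]))
```

A patient-level label is derived from the set of per-cancer labels, so a patient with several additional primaries is "combined" only when both timing classes occur.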
The organ sites of the MPCs in EC patients are listed in Table 1. The MPC type was further classified according to anatomical location. Head and neck cancer (HNC) includes oral (oral cavity and tongue), hypopharyngeal, nasopharyngeal, oropharyngeal, laryngeal, pharyngeal, tonsil, and thyroid cancers. Gastrointestinal cancer includes gastric, liver, colon, rectal, pancreatic, cecal, and gallbladder cancers. Thoracic cancer includes lung/bronchus and tracheal cancer. Other types of SPC include bladder, breast, prostate, renal, bone, skin, and eye cancers, and lymphoma.
stored in a -80°C freezer. Genomic DNA was extracted from 200 µL of the buffy coat containing peripheral blood mononuclear cells using the QIAamp DNA mini kit (Qiagen, Hamburg, Germany) according to the manufacturer's instructions. The DNA was dissolved in double-distilled water and stored in a -20°C freezer.

Statistical Analysis
All statistical analyses were conducted using SPSS 17.0 (SPSS Inc., Chicago, IL, USA). Patient characteristics and genotype distribution among the subgroups with or without MPC were compared using Pearson's chi-square test or Fisher's exact test. The odds ratios (ORs) for MPC of patients carrying risk genotypes or factors, adjusted for potential covariates, were analyzed using binary logistic regression. A p-value of less than 0.05 was considered statistically significant. Receiver operating characteristic (ROC) analysis was used to evaluate the diagnostic performance of the risk genotypes for MPC. The area under the curve (AUC) of the ROC curve was applied to evaluate the discriminatory capability of the risk genotypes for patients with SPC after EC. Generally, an AUC of 0.7-0.8 is considered acceptable, and 0.8-0.9 is considered excellent (30).
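To make the OR[95% CI] notation used throughout concrete, the following Python sketch computes a crude (unadjusted) odds ratio with a Wald confidence interval from a 2x2 table. This is a simplification offered under stated assumptions: the study itself used covariate-adjusted binary logistic regression in SPSS, the function name `odds_ratio_wald` is hypothetical, and the counts in the usage example are invented for illustration and are not study data.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald confidence interval.

    a: exposed cases       b: exposed non-cases
    c: unexposed cases     d: unexposed non-cases
    z: normal quantile (1.96 for an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Invented counts, for illustration only.
    or_, lo, hi = odds_ratio_wald(20, 80, 10, 90)
    print(f"OR[95% CI]={or_:.2f} [{lo:.2f}-{hi:.2f}]")
```

An interval that spans 1 corresponds to a result that is not significant at the 0.05 level under this approximation. Adjusted ORs from a logistic model will generally differ from the crude value.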

RESULTS
Of the 920 EC patients, 20.0% (N=184) were confirmed to have MPC (Figure 1). Regarding the site distribution of MPC in EC patients, the site most frequently affected was the head and neck (N=128, 69.6%). In addition, 21.1% (N=39) and 10.9% (N=20) of the MPC patients were diagnosed with cancer of the digestive system and thoracic cavity, respectively. A total of 28 patients had more than one additional primary cancer, with cancers developing in two to five organs. MPC incidence was significantly higher among males (20.8% vs. 10.3%, male vs. female, p=0.038, Table 2). As expected, the unfavorable habits of cigarette smoking, alcohol drinking, and betel nut chewing were all correlated with increased MPC incidence (p=0.030, 0.006, and 0.001, respectively, Table 2). Notably, patients with a history of betel chewing developed synchronous MPC more frequently than those without (39.5% vs. 24.7%, p=0.028, Table 2). We further analyzed the risk factors of individual MPC types using logistic regression adjusted for other variables. Among these unfavorable habits, betel chewing was found to be a significant risk factor for the incidence of MPC (Table 3).
We preliminarily analyzed the association between MPC in esophageal cancer and 58 SNPs that are related to carcinogenesis. The SNPs were determined using our pre-existing database of a smaller cohort (N=500, data not shown). These SNPs included 16 SNPs involved in growth factors and receptors, 10 SNPs related to microRNA functions, eight SNPs associated with inflammation, four SNPs of the genes of the nucleotide excision repair (NER) pathway, and 20 SNPs of the genes of the suppressor of cytokine signaling (SOCS) family. The preliminary results revealed that three SNPs of the SOCS family of genes, CISH:rs2239751, SOCS1:rs33932899, and SOCS1:rs243324, showed a tendency to correlate with increased MPC risk in EC. In addition to these three SNPs, we added six candidate SNPs for MPC analysis: ADH1B:rs1229984 and ALDH2:rs671 (11)(12)(13)(14)(15)(16), COX2:rs20417 and CYP1A1:rs1048943 (17)(18)(19), ERCC5:rs17655 (20,21), and GSTP1:rs1695 (22,23). Furthermore, we analyzed the feasibility of using these nine candidate SNPs as biomarkers in predicting the incidence of EC and MPC.
We first compared the genotype distributions of the nine candidate SNPs among the normal populations of Taiwan and East Asia, as well as the EC subjects enrolled in this study. The genotype distributions of the Taiwan and East Asia populations were not significantly different for any of the SNPs (Table 5). On the other hand, there were highly significant differences between the EC patients and the normal Taiwanese population in the distributions of both alcohol-related SNPs (ADH1B:rs1229984 and ALDH2:rs671, p<0.001). For ADH1B:rs1229984, the homozygous variant CC was significantly more prevalent in the EC patients than in the normal Taiwanese population; by contrast, TT was significantly less common (23.5% vs. 7.4% for CC; 39.1% vs. 52.2% for TT, p<0.001, Table 5). Individuals carrying the CC genotype had a 4.30-fold increased risk for EC (CC/TT, crude OR[95% CI]=4.30 [3.32-5.58], p<0.001, Table 6).
We then analyzed the correlation between the candidate SNPs and the risk of MPC in EC patients. An increased risk of MPC was found in patients carrying the GA and AA genotypes of ALDH2:rs671 (p=0.021). Up to 22.0% and 31.6% of GA and AA carriers, respectively, had other malignancies, compared with only 15% of GG carriers (Table 7). This correlation was more evident in patients who developed HNC, as more than 16% of GA carriers also had HNC (Table 7). The heterozygous genotype GC of ERCC5:rs17655 also correlated with an increased risk of MPC (p=0.005), HNC (p=0.038), and gastrointestinal cancer (p=0.048). Moreover, CISH:rs2239751 was found to be significantly correlated with MPC in EC patients (p=0.033), with more than 29% of CC carriers also developing MPC. Notably, 7.3% of CC carriers had thoracic cancers, mostly lung cancer, which was a much higher incidence than in patients with the AA or CA genotypes (1.9% and 1.0%, respectively, p=0.002, Table 8). The GG genotype was also significantly associated with the risk of developing HNC (OR[95% CI]=1.91 [1.05-3.49], p=0.035). Moreover, ERCC5:rs17655_GC was also significantly correlated with the risk of metachronous MPC before or after EC (p=0.011 and p=0.022, respectively, Table 9). Finally, ADH1B:rs1229984_CC showed a trend toward association with MPC and HNC that did not reach statistical significance (p=0.117 and p=0.160, respectively).
The ROC curve further revealed that the number of risk genotypes had an excellent discriminatory capability for SPC in female patients (AUC=0.875, Figure 2B) but poor capability in all patients and in male patients (AUC=0.616 and 0.596, Figures 2A and 2C, respectively). For the development of HNC after EC (SPC_HNC), the cumulative number of risk genotypes had a better discriminatory capability in non-chewers (AUC=0.724, Figure 2E) than in betel nut chewers (AUC=0.643, Figure 2D). Notably, the risk genotypes had an excellent capability for SPC_HNC in patients with no chewing or drinking habits (AUC=0.810, Figure 2F).
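For readers less familiar with the AUC, the sketch below shows one standard way to compute it for a discrete score such as the number of risk genotypes: as the probability that a randomly chosen patient with SPC has a higher score than a randomly chosen patient without SPC, with ties counted as one half (the Mann-Whitney formulation). This is an illustrative assumption about the calculation, not a description of the SPSS procedure used in the study; the function name is hypothetical and the example data are invented.

```python
def auc_from_scores(scores, labels):
    """AUC via pairwise comparison of positives and negatives.

    scores: numeric score per patient (e.g. number of risk genotypes)
    labels: 1 for patients with the outcome, 0 for patients without it
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Invented example: risk-genotype counts and outcome labels.
    counts = [0, 1, 1, 2]
    outcome = [0, 0, 1, 1]
    print(auc_from_scores(counts, outcome))
```

An AUC of 0.5 corresponds to no discrimination, which is why the tie-handling rule matters for a score that takes only a few distinct values.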

DISCUSSION
Our study is the first to systematically investigate risk factors and predictive biomarkers for MPC in EC. Based on our results, age was a risk factor for MPC. Head and neck cancers, such as oral cancer, are most prevalent between 40 and 60 years of age, whereas cancers of the digestive system, such as gastric cancer and colon cancer, are most prevalent among elderly people aged >70 years. Our results reveal that younger patients with EC have an increased risk of having HNC and that older patients have a higher probability of developing gastrointestinal cancers (Table 3). Moreover, our results showed that betel chewing is the predominant unfavorable habit correlated with the incidence of MPC in EC, especially in patients with HNC (Table 3). Both cigarette smoking and alcohol drinking were correlated with an increased incidence of MPC according to chi-square analysis (Table 2), but no significant effect was observed in the multivariate regression model adjusted for other variables (Table 3). Over 80% of these EC patients used tobacco (N=733) and alcohol (N=703), and about 35% of patients (N=299) chewed betel nut. Since all of these betel nut chewers also had at least one of the two other unfavorable habits, we suggest that the cumulative effect of these dangerous habits is crucial in the incidence of MPC in EC.
ALDH2:rs671 and ADH1B:rs1229984 have been frequently demonstrated to strongly correlate with the risk of EC (11)(12)(13)(14)(15)(16). The genotype distribution of ADH1B:rs1229984 showed no significant difference between the normal Taiwanese and whole East Asian populations (p=0.905); by contrast, there was a significant difference between the EC subjects and the normal Taiwanese population (p<0.001, Table 5).
The prevalence of ALDH2 deficiency in Taiwan is reported to be the highest in the world, with around 48% of Taiwanese people carrying the variant allele; however, according to our analysis, the genotype distribution did not differ significantly from that of the whole East Asian population (p=0.750). Moreover, the genotype distribution of rs671 was significantly different between our EC subjects and the normal Taiwanese population (p<0.001). Up to approximately 70% of EC subjects carry the GA variant. We further demonstrated that GA carriers had an increased risk of developing HNC (Table 8) and synchronous MPC (Table 9). EC patients carrying the null variant AA also had a significant risk for MPC, especially for synchronous MPC (Table 9). Although alcohol is generally considered to be metabolized in the liver, some studies provide evidence to support the hypothesis that exposure to alcohol-derived acetaldehyde may occur in the oral cavity, since high salivary acetaldehyde was found in ALDH2-deficient subjects after drinking alcohol (32,33). The protective role of ALDH2 against DNA damage induced by acetaldehyde in the esophageal squamous epithelium has also been reported (34). Whether the genetic effect of ALDH2:rs671 on the development of EC and HNC is mediated by regulating the local carcinogenic action of acetaldehyde needs to be clarified by further research.
ERCC5, a single-stranded structure-specific DNA endonuclease, plays an essential role in the nucleotide excision repair machinery. rs17655 is a non-synonymous SNP in the coding region of ERCC5 and causes an Asp-to-His substitution at amino acid 1104 (Asp1104His). In our results, rs17655 was not associated with the risk of EC (Table 5). However, heterozygous GC carriers had a significantly increased risk of developing HNC and metachronous MPC (Tables 8 and 9). A previous study revealed that rs17655 heterozygote carriers exhibited an increased risk of laryngeal cancer among heavy smokers (35). Thus, the role of rs17655 in MPC of EC patients is possibly due to impaired repair function in response to environmental toxins, which leads to the development of HNC.
We found the novel biomarker CISH:rs2239751 to be significantly associated with MPC in EC patients, especially with thoracic cancers, particularly lung cancer (Tables 7 and 8). CISH belongs to the family of SOCS proteins, one of the key mechanisms regulating signaling derived from cytokines and growth factors, and plays important anti-inflammatory and tumor-suppressive roles (36). Degradation of receptors or associated proteins is one of the mechanisms by which SOCS proteins negatively regulate cytokine or growth factor signaling. CISH is known to negatively regulate pathways induced by GH, IL-2, IL-3, IL-5, GM-CSF, EPO, and PRL (36). CISH:rs2239751 is a 5'UTR variant in transcript variant-1, which is reportedly correlated with persistent HBV infection (37). The minor allele C has also been found to be associated with susceptibility to tuberculosis in the Chinese Han population (38). The minor allele frequency of rs2239751 in the global population is only about 0.0914 (https://doi.org/10.1101/531210). This frequency is much higher in the East Asian population, at about 0.3356, which is close to the minor allele frequency of 0.3330 in our population of EC patients.
We also found that patients carrying CC had more than 5-fold odds of also having lung cancer (Table 9). Whether CISH:rs2239751 is also correlated with the incidence of lung cancer is worthy of future investigation.
We analyzed the cumulative effect of these MPC risk genotypes and found that patients carrying all 4 risk genotypes had over 40-fold and 12-fold increased risks of having MPC and SPC, respectively (Table 10). Although only 1.4% (13 out of 920) of the EC subjects carried 4 risk genotypes, this represents a considerable number among cases of esophageal cancer globally (over 500,000 new cases per year). Furthermore, the ROC curve analysis revealed that the risk genotypes had an excellent capability for SPC in the low-risk population, including female patients (AUC=0.875) and those without drinking and chewing habits (AUC=0.810, Figure 2). It is reasonable that the genetic effects were more evident in patients without exposure to unfavorable lifestyle factors, since these habit-related human carcinogens greatly impact cancer development and, therefore, probably masked the genetic effects for SPC.
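The cumulative score discussed above amounts to counting, per patient, how many of the four SNPs show a risk genotype. A minimal Python sketch follows. The genotype sets in `RISK_GENOTYPES` are an assumption inferred from the genotypes named in the text (GA/AA for ALDH2:rs671, CC for CISH:rs2239751, GC for ERCC5:rs17655, CC for ADH1B:rs1229984); the definitions actually used for Table 10 may differ, and the function name is hypothetical.

```python
# Assumed risk genotypes per SNP, inferred from the text; the exact
# definitions behind Table 10 are not given here and may differ.
RISK_GENOTYPES = {
    "ALDH2:rs671": {"GA", "AA"},
    "CISH:rs2239751": {"CC"},
    "ERCC5:rs17655": {"GC"},
    "ADH1B:rs1229984": {"CC"},
}


def count_risk_genotypes(genotypes):
    """Count how many of the four SNPs carry an assumed risk genotype.

    genotypes: dict mapping SNP name to the patient's genotype string.
    SNPs missing from the dict are not counted.
    """
    return sum(
        1
        for snp, risk_set in RISK_GENOTYPES.items()
        if genotypes.get(snp) in risk_set
    )


if __name__ == "__main__":
    # Invented patient, for illustration only.
    patient = {
        "ALDH2:rs671": "GA",
        "CISH:rs2239751": "CA",
        "ERCC5:rs17655": "GC",
        "ADH1B:rs1229984": "TT",
    }
    print(count_risk_genotypes(patient))
```

The resulting count (0 to 4) is the kind of ordinal score that can then be entered into a logistic model or an ROC analysis.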
Taken together, this study demonstrated for the first time that a set of risk SNPs, ALDH2:rs671, CISH:rs2239751, ERCC5:rs17655, and ADH1B:rs1229984, has great potential in predicting the incidence of MPC in EC. Genetic testing for these SNP variants would be beneficial for the early diagnosis of SPC. The limitations of the study were as follows: 1) there was no validation cohort, and 2) clear information was lacking to accurately separate ever users from current users of tobacco, alcohol, and betel nut.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: European Nucleotide Archive, PRJEB41367.