Identification of the Germline Mutation Profile in Esophageal Squamous Cell Carcinoma by Whole Exome Sequencing

Background: Esophageal squamous cell carcinoma (ESCC) is associated with poor prognosis and occurs with high frequency in China. The germline mutation profile in ESCC remains unclear, and therefore, the discovery of oncogenic alterations in ESCC is urgently needed. This study investigates the germline mutation profile and reveals associations among genotype-environment interactions in ESCC. Methods: Whole exome sequencing and follow-up analysis were performed in 77 matched tumor-normal ESCC specimens to examine the germline profiles. Additionally, associations among genotype-environment interactions were investigated. Results: We identified 84 pathogenic/likely pathogenic mutations and 51 rare variants of uncertain significance (VUS). Twenty VUS carrying the InterVar evidence codes PM2 (moderate pathogenicity) or PM2+PP1 (PP, supporting pathogenicity) were considered likely to have pathogenic significance. CYP21A2 was the most frequently mutated gene, and the p.Gln319* variant was identified in 6.5% (5/77) of patients. The TP53 p.V197E mutation, located within the DNA binding domain, was found in 1.3% (1/77) of patients. In total, the 11.7% (9/77) of individuals with homologous recombination (HR) VUS were more likely to have well-differentiated tumors than those without (P = 0.003). The degree of lymph node metastasis was correlated with homologous recombination deficiency (HRD) and VUS group (P < 0.05). Moreover, the 10.4% (8/77) of individuals with mismatch repair (MMR) VUS had a higher tumor mutational burden (TMB), although the correlation was not significant. Conclusions: Our study identified the germline mutation profiles in ESCC, providing novel insights into the molecular pathogenesis of this disease. Our results may also serve as a useful resource for the exploration of the underlying mechanism of ESCC and may provide information for the prevention, diagnosis and risk management of ESCC.


INTRODUCTION
Esophageal cancer (EC) has a poor prognosis and is the sixth leading cause of cancer-related death and the eighth most common cancer worldwide (Domper Arnal et al., 2015). Esophageal squamous cell carcinoma (ESCC) is the major histological type of esophageal cancer, and more than half of all esophageal cancer cases worldwide occur in China (Chen et al., 2016). Despite tremendous efforts and progress in surgical techniques and other treatment approaches for ESCC, the recurrence rate after esophagectomy or definitive chemoradiotherapy remains high (Enzinger and Mayer, 2003). Understanding the molecular mechanisms that underlie tumorigenesis is therefore necessary to identify additional diagnostic markers for ESCC, and the discovery of drugs that target specific oncogenic alterations for the treatment of ESCC is urgent.
Genomic studies have investigated the landscape of ESCC based on whole genome sequencing (WGS) and whole exome sequencing (WES) (Lin et al., 2014; Song et al., 2014). Genomic alterations, especially somatic mutations, were more distinct based on the sequencing data. In our previous study, we compared genomic alterations between Asian and Caucasian patient populations (Deng et al., 2017). We identified ESCC-associated genes (TP53, NOTCH1, and PIK3CA) widely reported in previous studies. In Asian patients, EP300 and NFE2L2 were found to have higher mutation frequencies, while CSMD3 was associated with better prognosis. The findings revealed the molecular basis underlying the striking racial disparities in ESCC. However, to our knowledge, the germline susceptibility gene mutation profile in ESCC remains unclear. In this study, we performed WES on 77 ESCC specimens and adjacent normal tissues (ANT) to identify the germline mutation signature in Chinese ESCC subjects. This study not only reports the germline profiles in ESCC but also reveals associations among genotype-environment interactions.

Sample Acquisition
Human primary ESCC tissues and corresponding adjacent nontumor tissues (located 5 cm from the tumors) were collected from patients diagnosed with ESCC who underwent surgery as a primary treatment at Fudan University Cancer Center (Shanghai, China) between September 2007 and June 2011. Tumor tissues were snap frozen in liquid nitrogen immediately after surgical resection and stored at −80 °C until further analysis. The clinicopathological features of the patients, including age, sex, smoking history, alcohol use history, tumor site, family history, differentiation, and tumor/node/metastasis (TNM) stage, were collected from inpatient medical records. The pathological features were evaluated independently by pathologists according to the TNM staging system of the American Joint Committee on Cancer (AJCC 7th edition). The study protocol was approved by the Ethics Committee of Fudan University Cancer Center. Samples were collected from a tissue bank, which obtained written informed consent from the participants.
Frontiers in Genetics | www.frontiersin.org

Whole Exome Sequencing
Genomic DNA was extracted from tissue specimens using a QIAamp DNA kit (Qiagen), and libraries were then prepared using protocols recommended by Illumina. Briefly, 1 µg of DNA was sheared into short fragments (200-300 bp) using a Covaris S220 ultrasonicator. The DNA fragments were then end repaired to generate adenylated 3′ ends. Adaptors with barcode sequences were then ligated to both ends of the fragments, and E-Gels were used to select DNA fragments of the targeted size. Next, 10 PCR cycles were performed, and the resulting product was purified. Whole exome capture was performed using a TruSeq Exome Enrichment kit (Illumina) according to the manufacturer's protocol with slight modifications. After the Illumina sequencing libraries were amplified with 10 PCR cycles, capture probes were added, and the reaction mixtures were incubated at 65 °C for 24 h. The hybridized mixtures were then amplified with an additional 10 PCR cycles, and the validated DNA libraries were sequenced on an Illumina sequencing system (Illumina HiSeq 2500).

Data Processing and Mutation Calling
Read pairs (FASTQ data) generated from the sequencing system were first trimmed and filtered using Trimmomatic (Bolger et al., 2014). The resulting reads were aligned to the hg19 reference genome using Burrows-Wheeler Aligner (BWA) (Li and Durbin, 2009). Variants were called using a variant calling pipeline based on SAMtools mpileup output. Germline mutations were called in both normal and cancer samples. Variant annotation was performed using ANNOVAR (Wang et al., 2010). SIFT was applied to annotate and predict the functions of the mutation variants (Ng and Henikoff, 2001). Genes with q values (false discovery rate, FDR) of ≤0.05 were considered significantly mutated. We considered non-synonymous SNPs for the calculation of the tumor mutational burden (TMB), which was defined as the number of mutations/Mb of sequenced DNA.
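The TMB definition above (non-synonymous mutations per megabase of sequenced DNA) can be illustrated with a minimal Python sketch. The function name, the mutation count and the 30 Mb covered region are hypothetical values chosen for illustration; they are not taken from the study.

```python
def tumor_mutational_burden(n_nonsynonymous: int, covered_bases: int) -> float:
    """Return the number of non-synonymous mutations per megabase (Mb)."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    covered_mb = covered_bases / 1_000_000  # convert bases to megabases
    return n_nonsynonymous / covered_mb


# Hypothetical sample: 150 non-synonymous SNPs over 30 Mb of sequenced exome.
example_tmb = tumor_mutational_burden(150, 30_000_000)
print(f"TMB = {example_tmb:.2f} mutations/Mb")
```

In practice the denominator would be the size of the region actually covered by the capture kit at adequate depth, which this sketch simply takes as an input.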

Clinical Pathological and Genetic Data
The homologous recombination (HR) repair pathway and the DNA mismatch repair (MMR) pathway are common signaling pathways explored in germline mutation analysis. The list of analyzed genes, including common and important genes related to germline mutations in various cancers, was shared by the AstraZeneca medical department. The HR and MMR pathways play vital roles in the development of a variety of cancers. We integrated all variants of uncertain significance (VUS) from the 77 ESCC cases and determined alterations in the homologous recombination arm of the DNA repair pathway (GO_RECOMBINATIONAL_REPAIR) and the DNA mismatch repair pathway (GO_MISMATCH_REPAIR).
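As a rough illustration of how VUS calls might be grouped into HR and MMR carrier sets, the following Python sketch maps (patient, gene) pairs onto pathway gene lists. The gene lists are small illustrative subsets of well-known HR and MMR genes, not the full GO_RECOMBINATIONAL_REPAIR and GO_MISMATCH_REPAIR sets used in the study, and the patient identifiers and calls are hypothetical.

```python
from collections import defaultdict

# Illustrative subsets only (assumption); the study used the full GO gene sets.
PATHWAY_GENES = {
    "HR": {"BRCA1", "BRCA2", "RAD51"},
    "MMR": {"MLH1", "MSH2", "MSH3", "MSH6", "PMS2"},
}


def carriers_by_pathway(vus_calls, pathway_genes=PATHWAY_GENES):
    """Map each pathway to the set of patients carrying at least one VUS in it."""
    carriers = defaultdict(set)
    for patient_id, gene in vus_calls:
        for pathway, genes in pathway_genes.items():
            if gene in genes:
                carriers[pathway].add(patient_id)
    return dict(carriers)


# Hypothetical (patient, gene) VUS calls; a patient is counted once per pathway.
calls = [
    ("P01", "MSH3"),
    ("P02", "PMS2"),
    ("P02", "PMS2"),
    ("P03", "BRCA2"),
    ("P04", "TSC2"),  # not in either illustrative pathway list
]
pathway_carriers = carriers_by_pathway(calls)
for pathway, patients in sorted(pathway_carriers.items()):
    print(pathway, sorted(patients))
```

Using sets means that a patient with several VUS in the same pathway is still counted as a single carrier, which matches the per-patient carrier rates reported in the text.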
To assess the correlation between clinicopathological features and germline alterations, we conducted univariate analysis, and correlations were determined using the Fisher, Wilcoxon and Pearson tests according to the different variable types. Then, significantly different factors were included in a multivariate analysis to strengthen the findings. Germline alterations in the patients included VUS in HR pathway genes, VUS in MMR pathway genes, rare VUS in SLX4, rare VUS in TSC2, pathogenic mutations in CYP21A2, the rare VUS carrier group and the pathogenic mutation carrier group. The workflow of our study is illustrated in Figure 1.
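For a binary clinical feature and a binary carrier status, the Fisher test mentioned above operates on a 2x2 contingency table. The following pure-Python sketch of a two-sided Fisher exact test is given for illustration only; the cell counts are hypothetical (only the totals of 77 patients and 9 carriers echo the cohort) and the code is not the authors' analysis pipeline.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical split: rows = carriers / non-carriers,
# columns = well-differentiated / other differentiation grades.
p = fisher_exact_two_sided(3, 6, 1, 67)
print(f"P = {p:.4f}")
```

The small tolerance in the comparison guards against floating-point ties between equally likely tables; statistical packages apply a similar rule for the two-sided P value.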

Population Characteristics
The clinical characteristics of the patients are summarized in Table 1 [updated from our previous study (Deng et al., 2017)]. In total, 65 male patients and 12 female patients were included in our study. Of these patients, 46 (59.7%, 46/77) had a history of smoking, 36 (46.8%, 36/77) had a history of alcohol abuse, and 12 (15.6%, 12/77) had a family history of ESCC. The median follow-up time was 35.5 months. All 77 patients were diagnosed with stage T2 disease; 4 had well-differentiated tumors, 45 had moderately differentiated tumors, and 28 had poorly differentiated tumors. Sixty-two patients were classified as TNM stage II, and 15 were classified as TNM stage III.
Rare VUS in SLX4 were the most prevalent, accounting for 7.8% (6/77) of missense mutation carriers. TSC2 was the second most commonly mutated gene, with a 6.5% (5/77) carrier rate including 3 missense mutations and 2 splice site variants. Two patients in the study cohort were carriers of the MSH3 p.A57_A58delinsPP mutation, which was correlated with hereditary susceptibility to ESCC. Table 3 shows the 20 VUS carrying the InterVar evidence codes PM2 (moderate pathogenicity) or PM2+PP1 (PP, supporting pathogenicity), which were likely to have pathogenic significance. Among these mutations, the TP53 missense mutation p.V197E, which is located in the DNA binding domain, was deleterious and demonstrated a probable contribution to ESCC progression (Table 4).

Correlation of Clinical Features and Germline Mutations
To characterize the clinical features upon cancer diagnosis in the germline mutation carriers, we compared the clinical characteristics with the different mutation statuses. Among the 77 patients with ESCC, the patient characteristics were compared with the germline mutation status (Figure 4). In total, 11.7% (9/77) of the individuals were carriers of HRD VUS, and those patients were more likely to have well-differentiated tumors than moderately or poorly differentiated tumors (P = 0.003, Table 5 and Figure 5). A correlation existed between pathogenic/likely pathogenic mutations and cigarette consumption (Figure 6). Other cancer characteristics did not show statistically significant associations with germline mutation status. Moreover, the 10.4% (8/77) of individuals who were carriers of MMR VUS had a higher tumor mutational burden (TMB) than those who were not (Supplementary Table 2 and Supplementary Figure 2), although the difference was not significant, possibly because of the limited number of samples. Among those patients, the two who were carriers of MSH3 p.A57_A58delinsPP had a much higher TMB (5.82 and 4.47), and one patient who was a carrier of PMS2 p.S587N and p.V302F had a TMB of 6.47. Multivariate analyses showed that the degree of lymph node metastasis was correlated with HRD and VUS group (P = 0.03 and 0.04, respectively). A greater degree of lymph node metastasis occurred in patients with HRD and in those with VUS.

DISCUSSION
The current study describes an in-depth exome analysis of the germline mutational landscape of ESCC. We identified 84 pathogenic/likely pathogenic mutations and 51 rare VUS in our study cohort. We also found 20 VUS with InterVar evidence of a score of PM2/PM2+PP1, which were likely to have pathogenic significance. Our study provided germline mutation profiles in ESCC, which may serve as a useful resource for the exploration of the mechanisms underlying ESCC. TP53 is a tumor suppressor that plays a key role in the cellular stress response (Vogelstein et al., 2000). TP53 has been detected in cancer biopsies by virtue of its high protein expression level, which is considered indicative of mutation (Petitjean et al., 2007; Vijayakumaran et al., 2015). In our study, the TP53 p.V197E mutation is located within the DNA binding domain and may contribute to ESCC progression, an observation that accounts for the improper DNA binding and disruption of transcriptional activity that was observed. The TP53 V197E mutant was also found to bind to PBF and promote colorectal tumorigenesis (Read et al., 2016).
FIGURE 4 | Correlation between clinical risk factors and germline mutation status. The patient characteristics of gender, age, age group, smoking history, drinking history, family history of ESCC, lesion number, T stage, N stage, differentiation, TNM stage, perineural invasion, LN positivity, and growth of new vessels were compared with the germline mutation status, which comprised the HRD VUS status, MMR VUS status, SLX4 rare VUS status, TSC2 rare VUS status, CYP21A2 pathogenic mutation status, VUS group, and P group.
In the current analysis, a correlation between cigarette use and germline mutations existed. Individuals carrying germline mutations who smoke cigarettes may be a high-risk population for esophageal cancer (Supplementary Figure 3). Cigarette smoking has been identified as an important pathogenic factor for ESCC (Domper Arnal et al., 2015). Polycyclic aromatic hydrocarbons and aromatic amines are major classes of carcinogens present in tobacco (Bartsch et al., 2000). These molecules are converted into DNA-reactive metabolites by cytochrome P450 (CYP)-related enzymes (Wu et al., 2002). CYP21A2 encodes a member of the cytochrome P450 enzyme superfamily. CYP21A2 p.Gln319* was the most prevalent mutation in our study. Steroid 21-hydroxylase deficiency due to CYP21A2 gene mutations, including p.Gln319*, is seen in more than 90% of all congenital adrenal hyperplasia cases (Prado et al., 2017). We hypothesized that CYP21A2 p.Gln319* may affect the metabolism of tobacco and participate in the development of ESCC (Supplementary Figure 3). It is interesting that the inhibition of CYP21A2 by 1 µM abiraterone was observed in a recent study (Malikova et al., 2017). Song et al. found 3 SNPs in CYP19A1, which also encodes a cytochrome P450 enzyme, that were associated with a family history (FH) of upper gastrointestinal (UGI) cancer in ESCC cases (P < 10⁻⁵) in a meta-analysis performed by the National Cancer Institute (NCI) and in the Henan GWAS data (Song et al., 2017). As described above, crosstalk between CYP21A2 p.Gln319* and smoking may promote ESCC progression. The function of CYP21A2 in esophageal cancer requires further investigation, including in vitro and in vivo experiments.
FIGURE 5 | Correlation between differentiation and HRD VUS status. Patients who carried HRD VUS were more likely to have highly differentiated tumors than patients who did not (P = 0.003).
We found that individuals who carried HRD VUS were more likely to have highly differentiated tumors than those who did not. Defects in the homologous recombination (HR) DNA repair pathway sensitize tumors to therapeutics that target this pathway. In breast cancer, patients with mutations in the HR pathway are more likely to achieve a pathologic complete response (pCR) than non-deficient patients (Telli et al., 2018). The literature shows that as many as one in four ovarian cancer cases exhibit germline mutations, the majority of which result in HRD (Frey and Pothuri, 2017). Our results were consistent with research in breast and ovarian cancer. The current study is the first report of HRD VUS and differentiation status in ESCC. Additional in-depth studies with larger numbers of cases are needed to investigate HRD VUS in ESCC.
A high TMB is an emerging biomarker of sensitivity to immune checkpoint inhibitors and has been shown to be significantly associated with a response to PD-1 and PD-L1 blockade immunotherapy (Le et al., 2015; Rizvi et al., 2015). Chalmers et al. found that recurrent promoter mutations in PMS2 are highly associated with increased TMB across a diverse cohort of 100,000 cancer cases with over 100 tumor types (Chalmers et al., 2017).
In our study, we found that individuals who carried MMR VUS had a higher TMB than those who did not. E09211 and E12230, who carried MSH3 p.A57_A58delinsPP, had TMB values of 5.82 and 4.47, respectively. E10308, who carried PMS2 p.S587N and p.V302F, had a TMB of 6.47. DNA mismatch repair (MMR) has been widely documented to play a vital role in the development of a variety of cancers. Studies have shown that both the loss of expression and overexpression of mismatch repair genes, including MLH1, MSH2, and PMS2, can be deleterious to genomic stability (Thibodeau et al., 1996; Qin et al., 1999; Gibson et al., 2006). Our results were consistent with those of previous studies, which demonstrated that loss-of-function mutations in mismatch repair pathway genes correlate with a high TMB (Duval and Hamelin, 2002; Peltomäki, 2003).
In conclusion, our results identified the profile of germline mutations that predispose individuals to ESCC, providing novel insights into the molecular pathogenesis of ESCC. These results may inform the prevention, diagnosis, and risk management of ESCC. A larger cohort with longer follow-up is required to validate our observations.