Genomic Profiling Reveals the Molecular Landscape of Gastrointestinal Tract Cancers in Chinese Patients

Gastrointestinal tract cancers have high incidence and mortality in China, but their molecular characteristics have not been fully investigated. We sequenced 432 tumor samples from the colorectum, stomach, pancreas, gallbladder, and biliary tract to investigate cancer-related mutations and detail the landscape of microsatellite instability (MSI), tumor mutation burden (TMB), and chromosomal instability (CIN). We observed the highest TMB in colorectal and gastric cancers and the lowest TMB in gastrointestinal stromal tumors (GISTs). Twenty-four hyper-mutated tumors were identified only in colorectal and gastric cancers, with a significant enrichment of mutations in the polymerase genes (POLE, POLD1, and POLH) and mismatch repair (MMR) genes. Additionally, CIN preferentially occurred in colorectal and gastric cancers, while pancreatic, gallbladder, and biliary duct cancers had much lower CIN. High CIN was correlated with a higher prevalence of malfunctions in chromosome segregation and cell cycle genes, including the copy number loss of WRN, NAT1, NF2, and BUB1B, and the copy number gain of MYC, ERBB2, EGFR, and CDK6. In addition, TP53 mutations were more abundant in high-CIN tumors, while PIK3CA mutations were more frequent in low-CIN tumors. In colorectal and gastric cancers, tumors with MSI demonstrated far fewer copy number changes than microsatellite stable (MSS) tumors. In colorectal and gastric cancers, the molecular characteristics of tumors revealed the mutational diversity between the different anatomical origins of tumors. This study provides novel insights into the molecular landscape of Chinese gastrointestinal cancers and the genetic differences between tumor locations, which could be useful for future clinical patient stratification and targeted interventions.


INTRODUCTION
Gastrointestinal (GI) tract cancer refers to a group of cancers affecting the GI tract and accessory digestive organs, such as the pancreas, liver, gallbladder, and biliary ducts. GI cancers account for almost 30% of all cancer incidence and 38% of global cancer-related mortality (Bray et al., 2018). GI cancers are difficult to diagnose at early stages due to the lack of symptoms, resulting in limited treatment options for such patients. In the past few years, extensive efforts have been made toward the molecular characterization of GI cancers for the development of novel diagnostic and treatment strategies (Cancer Genome Atlas Network, 2012; Cristescu et al., 2015; Guinney et al., 2015; Liu et al., 2018).
Driver mutations in TP53, APC, KRAS, BRAF, and PIK3CA are recurrent in GI cancers, but the mutation frequencies vary between different tumor types (Cancer Genome Atlas Network, 2012; Cancer Genome Atlas Research Network, 2014; Wardell et al., 2018). Genome-level evaluations revealed distinct genomic statuses in GI cancers, including genomic stability (GS), chromosomal instability (CIN), and microsatellite instability (MSI), that potentially inform clinical treatment options (Cancer Genome Atlas Research Network, 2014; Liu et al., 2018). CIN is defined as whole-chromosome mis-segregation that results in the loss or gain of large chromosomal fragments, and is positively correlated with tumor metastasis, poor prognosis, and treatment resistance (Lee et al., 2011; Pikor et al., 2013; Bakhoum and Cantley, 2018). Conversely, MSI, which is characterized by high numbers of mutations in microsatellite repeats, is associated with increased intratumor immune infiltration and better prognoses (Pages et al., 2005, 2008). Thus, newly identified tumor biomarkers are significantly changing the tumor staging systems and treatment landscapes for GI cancers. In China, stomach and colorectal cancers are leading causes of cancer death, ranking immediately after lung and liver cancers (Chen et al., 2016). Pancreatic, gallbladder, and biliary duct cancers are relatively rare, but their prognoses are much worse, and the availability of non-surgical treatments is limited. Compared with Western countries, a high smoking rate, the prevalence of Helicobacter pylori infections, heavy alcohol consumption, and poor nutrition contribute to the high incidence of digestive system cancers in China (Gu et al., 2018).
Here, we report the molecular profiles of 432 GI tumors using targeted gene sequencing, including those of gastric cancer (GAST), colorectal cancer (CORE), pancreatic cancer (PAAD), gallbladder and biliary tract cancer (GABI), and gastrointestinal stromal tumors (GISTs). Recurrent somatic mutations and copy number variations (CNVs) were identified and compared across different tumors or sub-locations of the same tumors. The tumor genomes were also characterized for MSI, tumor mutation burden (TMB), and chromosomal instability (CIN), which are closely related to treatment selection and prognostic predictions.

MATERIALS AND METHODS
Patient Recruitment and Tumor Sample Collection
The study cohort was identified from several key cancer hospitals in Jiangsu province, China. All patients were diagnosed with GAST, CORE, PAAD, GABI, or GIST between February 2015 and February 2017, and submitted tumor tissues for clinical tumor genetic testing to assist with clinical decision-making. Clinical information and the genetic testing data of these patients were retrospectively collected from their registration forms. TNM status of each patient was defined at the time of genetic sequencing, rather than at the initial diagnosis. Primary tumor sites were used to divide tumors into different sub-locations for cross-comparisons. For colorectal cancer, right-sided tumors included tumors in the proximal two-thirds of the transverse colon, ascending colon, and cecum, while left-sided tumors included tumors in the distal one-third of the transverse colon, descending colon, and rectum.
For each patient, formalin-fixed paraffin-embedded (FFPE) and matched whole blood samples were submitted for targeted next-generation sequencing using a customized panel of 416 cancer-related genes, as described in previous reports (Yang et al., 2018). The study methodologies conformed to the standards set by the Declaration of Helsinki and were approved by the Ethics Committee of Jiangsu Province Cancer Hospital. All patients provided informed written consent. All samples were tested in a certified genomic testing facility (Nanjing Geneseeq Technology Inc., Nanjing, China).

DNA Library Preparation and Next-Generation Sequencing
DNA extraction was performed using the same protocols as in our previous publications (Shu et al., 2017;Yang et al., 2018). In brief, FFPE DNA was purified using the QIAamp DNA FFPE Tissue Kit (Qiagen, Hilden, Germany) and genomic DNA from white blood cells was extracted using the DNeasy Blood & Tissue Kit (Qiagen), following the manufacturer's protocols. DNA samples were quantified using the dsDNA HS Assay Kit on a Qubit 3.0 Fluorometer (Life Technologies, Carlsbad, CA, United States).
The KAPA Hyper Prep Kit (KAPA Biosystems, Wilmington, MA, United States) was used to prepare sequencing libraries, as described previously (Yang et al., 2018). Libraries were then PCR amplified and purified before target enrichment. During enrichment, indexed DNA libraries were pooled to up to 2 µg of total input and then subjected to hybridization capture of the targeted gene regions using custom DNA probes (Integrated DNA Technologies, San Jose, CA, United States). After target enrichment, libraries were sequenced on the HiSeq4000 platform (Illumina, San Diego, CA, United States) with 2 × 150 bp paired-end reads.

Data Processing
Sequenced reads were processed with Trimmomatic (Bolger et al., 2014) to remove low-quality (quality < 15) or N bases and mapped to the human reference genome hg19 using the Burrows-Wheeler Aligner (BWA) (Li and Durbin, 2009). Picard was used to remove PCR duplicates, and tumor tissue samples were considered qualified at a mean coverage depth of >100× after removing PCR duplicates. The Genome Analysis Toolkit (GATK) was used to perform local realignment around insertions/deletions (indels), base quality score recalibration, and the discovery of germline variations (DePristo et al., 2011). Somatic single nucleotide variants (SNVs) and small indels were called by VarScan2 (Koboldt et al., 2012) and HaplotypeCaller/UnifiedGenotyper in GATK. Somatic variant calls with at least a 1.0% mutant allele frequency (MAF) and with at least three supporting reads in both directions were retained. Common variants were removed using dbSNP and the 1000 Genomes Project. Annotation was performed using ANNOVAR (Wang et al., 2010). Gene fusions were identified by FACTERA (Newman et al., 2014) and manually inspected using the Integrative Genomics Viewer (IGV). To identify somatic CNVs, CNVkit was used to analyze segmentations (Amarasinghe et al., 2013) and the results were fed into the GISTIC algorithm to identify recurrent focal and arm-level CNVs with a cutoff q-value of 0.25 (Mermel et al., 2011). TMB was determined based on the number of somatic base substitutions and indels in the targeted regions of the gene panel covering 1.2 Mbp of coding genome. In agreement with previous publications (Goodman et al., 2017), hyper-mutated tumors were defined as tumors with a TMB of >20 mutations/Mb (Supplementary Figure 2). Actionable mutations were defined by the Database of Evidence for Precision Oncology (DEPO) (Sun et al., 2018), including missense, in-frame and frameshift indels, splice site variations, and stop-gain mutations.
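The retention filter and the TMB definition above can be illustrated with a minimal Python sketch. The thresholds (1.0% allele frequency, three supporting reads, 1.2 Mb panel, >20 mutations/Mb) come from the text; the `VariantCall` record and its field names are hypothetical, and reading "three supporting reads in both directions" as at least three reads on each strand is our assumption. Removal of common variants (dbSNP, 1000 Genomes) is not modeled.

```python
from dataclasses import dataclass
from typing import Iterable

PANEL_SIZE_MB = 1.2         # coding region covered by the panel (from the text)
HYPERMUTATED_TMB = 20.0     # mutations/Mb cutoff for hyper-mutated tumors (from the text)
MIN_ALLELE_FRACTION = 0.01  # 1.0% mutant allele frequency (from the text)
MIN_READS_PER_STRAND = 3    # assumption: >= 3 supporting reads on each strand


@dataclass
class VariantCall:
    """Hypothetical, simplified record for one somatic SNV or small indel."""
    gene: str
    alt_reads_forward: int
    alt_reads_reverse: int
    total_depth: int


def passes_filter(call: VariantCall) -> bool:
    """Apply the allele-frequency and strand-support thresholds."""
    if call.total_depth <= 0:
        return False
    alt_reads = call.alt_reads_forward + call.alt_reads_reverse
    return (
        alt_reads / call.total_depth >= MIN_ALLELE_FRACTION
        and call.alt_reads_forward >= MIN_READS_PER_STRAND
        and call.alt_reads_reverse >= MIN_READS_PER_STRAND
    )


def tumor_mutation_burden(calls: Iterable[VariantCall],
                          panel_size_mb: float = PANEL_SIZE_MB) -> float:
    """Retained somatic mutations per megabase of targeted sequence."""
    retained = sum(1 for call in calls if passes_filter(call))
    return retained / panel_size_mb


def is_hypermutated(tmb: float) -> bool:
    """Hyper-mutated tumors are those with TMB strictly above the cutoff."""
    return tmb > HYPERMUTATED_TMB
```

With these definitions, a tumor with six retained mutations would have a TMB of 6 / 1.2 = 5 mutations/Mb, which matches the scale of the cohort median reported in the Results.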
The MSI of each sample was determined by evaluating 52 embedded mononucleotide repeats with a minimum of 15-bp repeats that were included in the sequencing panel. The baseline length distribution of each repeat was determined from a pool of microsatellite-stable samples. A sample was identified as MSI if more than 45% of the qualified sites displayed instability.
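The MSI decision rule can be sketched as follows. The 45% cutoff comes from the text; how each repeat site is judged unstable against the baseline length distribution is not specified here, so the sketch assumes that per-site calls are already available, with `None` marking a site that did not qualify (a hypothetical encoding).

```python
from typing import Optional, Sequence

MSI_FRACTION_CUTOFF = 0.45  # fraction of qualified sites (from the text)


def classify_msi(site_calls: Sequence[Optional[bool]]) -> str:
    """Classify a sample from per-site instability calls.

    Each entry is True (unstable), False (stable), or None (site not
    qualified, e.g. insufficient coverage; hypothetical encoding).
    """
    qualified = [call for call in site_calls if call is not None]
    if not qualified:
        raise ValueError("no qualified microsatellite sites")
    unstable_fraction = sum(qualified) / len(qualified)
    return "MSI" if unstable_fraction > MSI_FRACTION_CUTOFF else "MSS"
```

For example, with all 52 sites qualified, 24 unstable sites (46%) would yield an MSI call, whereas 23 (44%) would not.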
The genome segments inferred by CNVkit were used to calculate the CIN score, which was defined as the proportion of the genome with aberrant segmented copy numbers. DNA segments with a log2 ratio below −0.2 or above 0.2 were considered as exhibiting copy number variation (CNV). The proportion of such segments in all of the covered regions of the genome was calculated as the CIN score. High-CIN samples were defined as the upper 25% of all tumors, while low-CIN samples were defined as the lower 25% of all tumors.
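As a minimal sketch of the CIN score, assuming each segment is given as a hypothetical `(start, end, log2_ratio)` tuple and that the proportion is weighted by segment length (the text does not state the weighting explicitly):

```python
from statistics import quantiles
from typing import List, Sequence, Tuple

LOG2_CUTOFF = 0.2  # |log2 ratio| above this counts as aberrant (from the text)

# Hypothetical segment layout: (start, end, log2 copy ratio)
Segment = Tuple[int, int, float]


def cin_score(segments: Sequence[Segment]) -> float:
    """Length-weighted fraction of covered sequence in aberrant segments."""
    total_length = 0
    aberrant_length = 0
    for start, end, log2_ratio in segments:
        length = end - start
        total_length += length
        if log2_ratio < -LOG2_CUTOFF or log2_ratio > LOG2_CUTOFF:
            aberrant_length += length
    if total_length == 0:
        raise ValueError("no covered segments")
    return aberrant_length / total_length


def stratify_by_cin(scores: Sequence[float]) -> List[str]:
    """Label each tumor as high (upper quartile), low (lower quartile),
    or intermediate. The quartile interpolation method is an assumption."""
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    labels = []
    for score in scores:
        if score > q3:
            labels.append("high")
        elif score < q1:
            labels.append("low")
        else:
            labels.append("intermediate")
    return labels
```

Ties at the quartile boundaries are assigned to the intermediate group in this sketch; a rank-based split would be an equally reasonable reading of "upper 25%" and "lower 25%".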

RESULTS
Clinical Characteristics of the Enrolled Patients and Samples Submitted for Sequencing
All 432 patients had at least one qualified tumor sample submitted for targeted NGS, and 18 individuals were excluded from the final analysis due to the lack of mutations in any of their tumor samples (Supplementary Figure 1). Finally, 414 patients, including CORE (n = 207), GAST (n = 144), PAAD (n = 27), GABI (n = 14), and GIST (n = 22) had adequate tumor tissue samples sequenced and were analyzed for somatic missense mutations, small indels, CNVs, and chromosomal rearrangements. The clinical characteristics of these patients are summarized in Supplementary Tables 1-5.
Key cancer-associated genes covered by the sequencing panel were classified into nine canonical signaling pathways responsible for cellular proliferation, as previously described (Sanchez-Vega et al., 2018; Supplementary Table 6). We computed the fraction of samples with at least one gene altered in each pathway, and found that p53, RTK-RAS, and Wnt were the most frequently mutated pathways, while the MYC, cell cycle, and Hippo pathways had the lowest mutation ratios in nearly all cancers (Figure 1C). The Wnt pathway was mutated more frequently in CORE (75%) than in other cancers (p < 0.05), and the RTK pathway was altered extensively in both GIST (82%) and PAAD (89%). For the Wnt pathway, apart from APC (60% of CORE), other genes, including RNF43 (14%), AMER1 (9%), AXIN2 (6%), CTNNB1 (5%), CHD4 (4%), and LZTR1 (4%), were also mutated in CORE at different frequencies, which raises the possibility of therapeutic intervention through Wnt pathway inhibition. GIST had the highest mutation frequency of the RTK pathway, and the lowest mutation frequency of the p53 pathway (9%, p < 0.05) among all cancer types (Figure 1C). In the RTK pathway, KIT was the most frequently altered gene in GIST (73%), while KRAS alterations were predominantly found in CORE (47%), GAST (11%), GABI (21%), and PAAD (74%).
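The pathway-level summary, the fraction of samples with at least one altered gene in each pathway, can be sketched as below. The function and argument names are hypothetical, and the pathway memberships shown in the usage example are a small illustrative subset, not the assignments in Supplementary Table 6.

```python
from typing import Dict, Set


def pathway_alteration_fractions(
    altered_genes_by_sample: Dict[str, Set[str]],
    genes_by_pathway: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Fraction of samples with >= 1 altered gene in each pathway."""
    n_samples = len(altered_genes_by_sample)
    if n_samples == 0:
        raise ValueError("no samples provided")
    fractions = {}
    for pathway, members in genes_by_pathway.items():
        # A sample counts once per pathway, however many member genes it hits.
        hits = sum(
            1 for genes in altered_genes_by_sample.values() if genes & members
        )
        fractions[pathway] = hits / n_samples
    return fractions


if __name__ == "__main__":
    # Illustrative subset of pathway members (assumption, for demonstration).
    pathways = {
        "Wnt": {"APC", "RNF43", "CTNNB1"},
        "RTK-RAS": {"KRAS", "KIT", "BRAF"},
        "p53": {"TP53"},
    }
    # Hypothetical samples and alterations.
    samples = {
        "S1": {"APC", "TP53"},
        "S2": {"KRAS"},
        "S3": {"KIT"},
        "S4": {"RNF43", "KRAS", "APC"},
    }
    print(pathway_alteration_fractions(samples, pathways))
```

Counting each sample at most once per pathway is what distinguishes this summary from a per-gene mutation frequency: a tumor with both APC and RNF43 alterations contributes a single Wnt event.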
When stratifying all patients based on the existence of actionable mutations, 47% of all cases harbored at least one actionable mutation (Supplementary Table 7). The highest frequencies were found in PAAD (77%) and GIST (77%) due to the extensive mutations in KRAS and KIT (Supplementary Figure 4A). A total of 56% of CORE cases and 24% of GAST cases were defined as actionable, with hotspot mutations at KRAS p.G12/G13/Q61 and PIK3CA p.E542/E545/H1047R being the most dominant (Supplementary Figure 4B). In addition, BRAF p.V600E, p.G466V, and p.L597R, as well as BRCA2 mutations, [...]
[Figure 1 caption, fragment: pairwise comparisons were conducted between every two groups using Fisher's exact test. FDR was used for p-value corrections. For (D), a one-way ANOVA on ranks test was used to compare all groups, and Dunn's test was used for post hoc analyses. *p < 0.05; ***p < 0.001.]

Tumor TMB and CIN Indicate New Treatment Strategies
The TMB of all patients ranged between 1 and 185 (median: 5), with the highest median in CORE (median: 6) and the lowest median in GIST (median: 2.5, Figure 1D). However, both CORE (9%) and GAST (3%) had a small high-TMB population, named hyper-mutated tumors in this study (n = 26, Figure 1D and Supplementary Figure 2). Of these 24 hyper-mutated tumors, 23 (92%) had at least one somatic or germline mutation in the MMR genes, including MLH1, MSH2, MSH6, PMS1, and PMS2, or the polymerase (POL) genes, including POLE, POLD1, and POLH. In low-mutation tumors, only 30% of tumors (117 out of 388 tumors) had mutations in the MMR and POL genes (Figure 2A). Somatic mutations in MSH2, MSH6, MLH3, PMS1, PMS2, POLE, and POLD1 were significantly more frequent in hyper-mutated tumors than in low-mutation tumors (FDR < 0.01). POLE and POLD1 mutations (including somatic and germline) in hyper-mutated tumors were dispersed across all domains of the two POLs (Figure 2B), with only POLE p.P286R and POLD1 p.R689W being reported to cause functional disruption (Ahn et al., 2016; Mertz et al., 2017). However, we observed that a POLE p.A1778V mutation in patient #GA_59, who developed gastric cancer at age 76 with a TMB of 45 (71% were missense mutations), was the only mutation in the polymerase and MMR genes in this patient, suggesting that this mutation might impair POLE function. We also identified a novel mutation in POLD1 in three hyper-mutated patients (#CO_129, CO_26, and CO_273), a somatic splicing variant c.2954-1delG that disrupts the zinc finger domain of POLD1 and can be potentially harmful. Notably, in low-mutation tumors, somatic and germline mutations in the MMR and POL genes were almost mutually exclusive, while in the hyper-mutated group, the total number of mutations in the MMR and POL genes was significantly higher (p < 0.01, Figure 2C).
Another genome marker that we inspected was CIN. CIN is a critical hallmark of cancer and is closely related to tumor metastasis, treatment resistance, and poor prognosis (Pikor et al., 2013; Bakhoum and Cantley, 2018). The CIN score was used to measure the extent of copy number changes in large segments in an individual tumor. The CIN score ranged widely in each cancer (Figure 3A). The median CIN was relatively higher in GIST (0.40), CORE (0.31), and GAST (0.27) tumors, and lower in GABI (0.18) and PAAD (0.11) tumors (Figure 3A). The mechanisms causing CIN have not been fully elucidated. It has been suggested that chromosome segregation genes and cell cycle genes are widely related to CIN (Maleki and Rocken, 2017).
We compared the mutation frequencies between high-CIN and low-CIN tumors to identify the associated gene alterations (Figure 3B). TP53 mutations were significantly enriched in high-CIN tumors (FDR < 0.01), which was consistent with previous reports of mitotic stress caused by TP53 malfunctions (Malumbres, 2011).
[Figure 3 caption, fragment: gene mutations and copy number changes that were significantly different between the high-CIN and low-CIN groups. For (A), the one-way ANOVA on ranks test was used to compare all groups, and Dunn's test was used for post hoc analyses. *p < 0.05; ***p < 0.001. For (B), pairwise comparisons were conducted between every two groups using Fisher's exact test. FDR was used for p-value correction.]
In addition, we observed broad copy number loss of WRN, NAT1, NF2, and BUB1B, as well as copy number gain of MYC, ERBB2, EGFR, and CDK6 in high-CIN tumors (FDR < 0.01, Figure 3B). Copy number losses of WRN and NAT1 were nearly always concurrent, possibly because of their adjacent genomic locations. PIK3CA mutation was the only alteration that was significantly enriched in low-CIN tumors (FDR < 0.1, Figure 3B).
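The group comparisons reported here rely on Fisher's exact test with FDR correction. The following standard-library sketch shows one way such a comparison could be computed for a gene's alteration counts in high-CIN versus low-CIN tumors. The two-sided p-value sums all tables no more probable than the observed one, and the Benjamini-Hochberg procedure is assumed for the FDR step because the text does not name the method; the counts in the usage example are hypothetical.

```python
from math import comb
from typing import List, Sequence


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows could be high-CIN / low-CIN tumors; columns altered / unaltered.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(x: int) -> float:
        # Hypergeometric probability of x altered tumors in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)  # tolerance for floating-point ties
    )
    return min(1.0, p_value)


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: (altered, unaltered) in high-CIN then low-CIN tumors.
    tables = {"GENE_A": (90, 13, 60, 43), "GENE_B": (10, 93, 25, 78)}
    raw = [fisher_exact_two_sided(*counts) for counts in tables.values()]
    for gene, fdr in zip(tables, benjamini_hochberg(raw)):
        print(gene, fdr)
```

Exact integer binomials keep the hypergeometric probabilities accurate for cohort sizes of a few hundred tumors, at the cost of speed that is irrelevant at this scale.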

Colorectal and Gastric Cancers Showed Location-Specific Gene Alterations
To investigate interethnic differences, we compared the prevalence of somatic mutations between our colorectal cancer cohort (n = 207) and the Memorial Sloan Kettering Cancer Center (MSKCC) metastatic colorectal cohort (n = 985) (Yaeger et al., 2018). The two groups had comparable clinical features with respect to patients' ages, gender, and the primary tumor locations. However, our cohort had more stage IV disease patients, at 80.3% vs. 61.7% in the MSKCC cohort (Supplementary Table 1). TP53 was the most frequently mutated gene in both cohorts, but in our cohort, the mutation ratio was significantly higher (81% vs. 73%, FDR = 0.02), which is consistent with its presence in more advanced diseases (Yaeger et al., 2018). Conversely, FBXW7, whose mutations were suggested to be enriched in early-stage tumors (Yaeger et al., 2018), was mutated less in our cohort (Figure 4A). We also found that APC alterations were less frequent in our cohort (60% vs. MSKCC 75%, FDR = 0.0007), while another Wnt pathway driver, RNF43, was more frequently mutated (14% vs. MSKCC 8%, FDR = 0.07). Other genes that were more frequently mutated in our cohort were GNAS (11%), POLE (9%), NF1 (9%), and ERCC2 (5%), while SMAD2 (1%) was mutated significantly less frequently (FDR < 0.1).
We classified our cohort into highly microsatellite-instable (MSI-H, n = 11) and microsatellite-stable (MSS) tumors given their MSI status identified by the embedded microsatellite sequences in the targeted sequencing panel (Figure 4B). The MSS group was further divided into right-sided (n = 48) and left-sided (n = 149) tumors for comparison. The incidence of the left-sided MSS tumors (72%, n = 149) was much higher than that of the right-sided MSS tumors (22%, n = 48), while 10 cases were without tumor location information. As expected, DNA mismatch repair (MMR) genes were more frequently mutated in MSI-H tumors than in MSS tumors. In the MSI-H group, we observed an enrichment of somatic mutations in a number of genes, and the top affected genes were ARID1A (mutation frequency: 91%), RNF43 (82%), GNAS (73%), KMT2B (73%), PIK3CA (64%), POLE (64%), AXIN2 (64%), and SMARCA4 (64%), because of the existence of short tandem repeats in gene sequences that can be easily affected by MMR gene defects. However, CNVs were scarcely observed in MSI-H tumors (Figure 4B).
In MSS tumors, gene alterations were imbalanced between the right side and left side. The left-sided tumors were characterized by higher levels of TP53 mutations (87%), while the right-sided tumors exhibited higher levels of KRAS (43%) and CTNNB1 (2%) mutations (FDR < 0.1). Meanwhile, copy number loss of CDKN2A/CDKN2B was significantly more prevalent in right-sided tumors (Figure 4B, 13% vs. 2% in left-sided tumors, FDR < 0.1). The degree of CIN in MSI-H tumors (median of CIN score: 0.11) was significantly lower than in MSS tumors (median 0.32, p = 0.0036), and the scores were similar between the right-sided and left-sided tumors (Figure 4C).
The prevalence of gene alterations in gastric cancer was also compared to the MSKCC cohort (n = 81), and revealed a higher mutation frequency in TP53 and lower mutation frequencies in PBRM1 and ERBB3 (Figure 5A). Similar to colorectal cancer, gastric cancer was first classified into MSI-H (n = 5) and MSS (n = 119) cases, and then MSS cases were further grouped based on the locations of primary tumors, including cardia (n = 25), fundus and body (n = 65), and pylorus and duodenum (n = 29). As expected, MSI-H cases showed markedly increased frequencies of somatic mutations and decreased CNVs compared to MSS cases (Figure 5B). RNF43 and KRAS mutations were primarily observed in the pylorus and duodenum regions, while CCNE1 amplifications were prevalent in the upper portion of the stomach (cardia, fundus, and body). KRAS and ERBB2 mutations were not present in any of the MSI-H patients (Figure 5B). Consistent with a previous report (Cancer Genome Atlas Research Network, 2014), the cardia section of the stomach had relatively high CIN scores, and the median CIN score gradually decreased from the upper stomach to the lower stomach (Figure 5C).

DISCUSSION
Herein, we performed a comprehensive genetic analysis of different GI cancers. This is the first large-scale study of GI cancers in Chinese patients that implemented a uniform genetic testing and data analysis pipeline. The results highlighted the similarities and differences in the genetic landscapes of GI cancers and also informed on the status of several biomarkers for cancer treatment, including MSI status, TMB, and CIN between different cancer types. The high mutation frequency of TP53, APC, and KRAS in CORE and that of KRAS in PAAD were reported in multiple other studies (Vogelstein et al., 1988;Kinzler and Vogelstein, 1996;Bardeesy and DePinho, 2002). However, the frequency of TP53 alterations was higher in our CORE cohort than in other reported populations (Olivier et al., 2002;Abubaker et al., 2008). Such a finding might be due to the presence of more advanced disease stages at the time of diagnosis in this study. Mutations in the KIT gene were recognized as a relatively early event in GIST tumorigenesis, while TP53 mutations were related to the malignant transformation of GIST (Ryu et al., 2004).
Although we lacked pathological stage information of the GIST patients in our cohort, the high KIT mutation frequency and low TP53 mutation frequency that we observed suggested that GIST patients were at an early disease stage. This assumption was also supported by the fact that we did not observe high frequencies of RB1 mutations in the GIST population, which was an event that might be restricted to malignant GISTs (Merten et al., 2016).
Significantly aberrant Wnt signaling was observed in CORE compared to other cancer types, and this pathway has been closely linked to carcinogenesis (Mirabelli et al., 2019). Many inhibitors targeting the Wnt signaling pathway are being examined in different clinical trials, including porcupine (PORCN) inhibitors, WNT ligand antagonists, and FZD antagonists/monoclonal antibodies (Jung and Park, 2020). In Chinese CORE and GAST patients, both APC and RNF43 were predominantly mutated in the Wnt pathway. A significant subset of patients had nonsense or frameshift alterations in RNF43, with particularly high frequencies in the MSI-H group. Such mutations were also mutually exclusive with APC alterations. As a tumor suppressor, RNF43 has shown its capacity to negatively regulate Wnt signaling (Koo et al., 2012; Loregger et al., 2015). Recent studies found that depletion of RNF43 enhanced tumor growth in GI cancers and conferred resistance to DNA-damage-inducing chemotherapies and γ-radiation in gastric cancer cells (Neumeyer et al., 2019, 2020). Additionally, preclinical cancer models have shown the responsiveness of RNF43 mutations to Wnt inhibitors, several of which are in clinical trials (Janku et al., 2015, 2020; Yu et al., 2020). Therefore, screening for RNF43 mutational status could direct therapy selections for GI cancer treatments.
Among all cancer types, CORE and GAST demonstrated significantly higher TMB than others, while GIST demonstrated relatively higher CIN scores. In both CORE and GAST, a small group of patients were characterized by MSI-H, and their CIN levels were correspondingly lower than those of the MSS groups, thus suggesting that tumors obtain a survival advantage through either high mutational loads or high levels of somatic copy number alterations (SCNA). Both TMB and MSI are emerging biomarkers for immune checkpoint inhibitors, and CIN has the potential to drive tumor evolution and treatment resistance (Jin et al., 2020). CIN, which is characterized by increased mis-segregation of chromosomes, can be induced by defects in the mitotic spindle assembly checkpoints, cell cycle regulation, multipolar spindles, or DNA damage responses (Bakhoum et al., 2012). The acquisition of CIN is an essential feature in cancer pathogenesis and is considered as compensation for a lack of driver mutations (Turajlic et al., 2018). CIN is also considered a drug-resistance mechanism during cancer treatment and negatively correlates with the progression-free survival and overall survival of cancer patients (Turajlic et al., 2018; Jin et al., 2020). In our cohort, we observed a high level of TP53 mutations in high-CIN patients, while PIK3CA alterations were significantly enriched in low-CIN patients, with a tendency for mutual exclusivity with TP53 mutations. These findings are consistent with previous reports that TP53 inactivation results in CIN tolerance in cells (Matano et al., 2015).
Although PIK3CA acts independently of TP53 inactivation to support CIN tolerance, it generally precedes the genome doubling event (Carter et al., 2012;Zack et al., 2013;Berenjeno et al., 2017). The GIST population has the highest median level of CIN, despite its low mutation frequency in TP53 and the cell cycle pathway compared to that observed in other cancer types. A further look at the high-CIN and median-CIN groups of GIST identified a much higher level of NF2 copy number deletion in the high-CIN group than the low-CIN group. NF2 inactivation has been linked to increased CIN in meningiomas (Goutagny et al., 2010;Dewan et al., 2017), but for the first time, we report that its copy number deletion is potentially associated with high CIN level in GIST. However, this finding must be validated in a much larger cohort of GIST samples.
Although CIN potentially drives tumor evolution and drug resistance via the production of oncogenic SCNA, excessive levels of CIN were proven to be detrimental to tumor growth (Roylance et al., 2011;Janssen and Medema, 2013), thus creating the opportunity for developing therapies aimed at increasing the CIN level of tumors. Currently, a few agents targeting Mps1/TTK kinase to induce CIN have been evaluated in phase I clinical trials, including BAY1217389 (NCT02366949), BAY1161909 (NCT02138812), and BOS172722 (NCT03328494; clinicaltrials.gov). However, the success of this strategy relies on patient stratification based on their CIN levels, and the coexistence of gene alterations (e.g., TP53 or PIK3CA) that can reduce the toxicity of elevated CIN.
Malfunctioning DNA repair mechanisms caused by somatic mutations in MMR genes are common in cancer and contribute to MMR deficiency, as well as to high TMB and MSI phenotypes (Bodor et al., 2018). Indeed, the majority of the hyper-mutated tumors in our cohort were observed to have somatic mutations in MMR genes. MSI-H tumors were also found to have a higher ratio of MMR gene mutations compared to MSS tumors. Recent studies have suggested that mutations in DNA polymerase (POL) genes are other factors that are associated with a hyper-mutated tumor phenotype, especially in colon and rectal cancers (Cancer Genome Atlas Network, 2012). Interestingly, in hyper-mutated tumors, we observed concomitant somatic or germline mutations in the MMR and POL genes, with a median of three MMR/POL mutations per tumor, which was significantly higher than that of low-mutation tumors (a median of one MMR/POL mutation per tumor). However, currently only a few non-synonymous exonuclease domain mutations (EDMs) of POLE (residues 268-471) and POLD1 (residues 304-517) have been considered pathogenic (Briggs and Tomlinson, 2013), while most others are classified as variants of unknown significance. Increasing evidence suggests that patients with POLE EDMs are prone to higher TMBs and an upregulation of immune checkpoint genes, and could potentially benefit from immune checkpoint inhibitors (Snyder et al., 2014; Rayner et al., 2016; Mo et al., 2020).
The limitations of this study included the lack of clinical treatment and prognostic information, which is typical of retrospective studies. Therefore, we are unable to determine the treatment outcomes that were potentially linked to the genomic findings of different cancer types. However, our analysis of a large cohort of advanced GI cancers revealed the landscape of genetic alterations, highlighted the genomic differences between tumor locations, such as between right- and left-sided CRC, and identified the unique molecular features in Asian GI cancer patients.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of Jiangsu Province Cancer Hospital. Written informed consent to participate in this study was provided by the participants or their legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
JF, PL, CZ, LZ, and YG: conceptualization and critical thinking. XT, GW, HB, XM, RY, and XW: experiments and data analysis. CZ, LZ, YG, GW, WZ, WS, and DZ: clinical sample collection and pathological analyses. CZ, LZ, YG, XT, and RY: manuscript drafting. JF, PL, CZ, LZ, YG, XT, RY, and XW: manuscript review and editing. All authors contributed to the article and approved the submitted version.