Germline Profiling and Molecular Characterization of Early Onset Metastatic Colorectal Cancer

Background: Early onset colorectal cancer (EO CRC) is a heterogeneous colorectal cancer subtype with obvious hereditary tendencies and increasing incidence. We sought to determine the susceptibility genes and molecular characteristics of EO CRC.
Methods: A total of 330 EO metastatic CRC (mCRC) patients (≤55 years) and 110 average-onset (AO) mCRC patients (>55 years) were enrolled. Capture-based targeted sequencing was performed on tumor tissue and paired white blood cells using a sequencing panel of 520 genes. The association between molecular alterations and overall survival (OS) was analyzed.
Results: Of the 330 EO mCRC patients, 31 carried pathogenic or likely pathogenic germline mutations, with 16 of them diagnosed with Lynch syndrome. Fifteen patients had germline mutations in non-mismatch repair genes, including four in MUTYH, three in RAD50, one in TP53, and eight in other genes. Twenty-nine genes were recurrently mutated in EO mCRC, including TP53, APC, KRAS, SMAD4, and BRCA2. The majority of genomic alterations were comparable between EO and AO mCRC. EO mCRC patients were more likely to have a high tumor mutation burden (p < 0.05). RNF43, RBM10, TSC, and BRAF V600E mutations were more commonly observed in EO mCRC, while APC, ASXL1, DNMT3B, and MET genes were more commonly altered in AO patients. At the pathway level, the WNT pathway was the only differentially mutated pathway between EO and AO mCRC (p < 0.0001). The wild-type WNT pathway (p = 0.0017) and mutated TGF-β pathway (p = 0.023) were associated with unfavorable OS in EO mCRC.
Conclusions: Approximately one in 10 EO mCRC cases was associated with hereditary tumors. The spectrum of somatic alterations was largely comparable between EO and AO mCRC, with several notable differences.


INTRODUCTION
Colorectal cancer (CRC) is the fourth most common malignancy and the second leading cause of tumor-related death worldwide (1). CRC is traditionally considered a malignancy that mostly affects individuals over 50 years old, although approximately 10% of CRC cases are diagnosed before the age of 50 years (2,3). The incidence and mortality of colorectal cancer have declined globally due to the implementation of routine screening and the advancement of precise treatments (4). However, the age-specific incidence rate among the population younger than 55 years has been increasing annually since the mid-1990s (5). Moreover, mortality rates of colorectal cancer patients aged 20 to 54 years increased by 1.0% annually from 2004 to 2014 (6). Previous studies have shown that early onset colorectal cancer (EO CRC) is often associated with a later stage at presentation, signet ring histology, distal primary tumors, unfavorable prognosis, and strong inherited predisposition compared with average onset colorectal cancer (AO CRC), suggesting that EO CRC might be a distinctive subtype of colorectal cancer (7,8).
Hereditary cancer syndromes caused by germline mutations in highly penetrant genes account for approximately 2-5% of colorectal cancers (9). With the advent of next generation sequencing (NGS), more tumor susceptibility genes have been identified in CRC. The prevalence of hereditary syndromes in EO CRC ranges from 10 to 35%, which is higher than that in the AO CRC population (9)(10)(11). Inherited colorectal tumors, including hereditary non-polyposis colorectal cancer (HNPCC), adenomatosis coli, and suspected HNPCC, accounted for 38.4% of CRC patients under the age of 40 years, 17.1% of patients aged 40-50 years, 10.2% of patients aged 50-55 years, and only 3.5% of individuals older than 55 years (12). Nevertheless, inherited predisposition alone cannot fully account for the increasing mortality rates in young CRC patients. Comprehensive molecular characterization is required to advance our understanding of the molecular profile of EO CRC, especially of metastatic disease, which is more challenging to treat. However, relevant studies are still limited, and the published literature contains conflicting results. In this study, considering the increasing incidence and mortality rates and the marked inherited tendency in patients younger than 55 years, we evaluated both the germline and somatic mutation spectrum of 330 early onset metastatic CRC patients using capture-based targeted sequencing, to provide clinically actionable therapeutic information and to investigate the genomic differences between Chinese EO CRC and Western EO CRC as well as AO CRC patients. We also aimed to explore the prognostic molecular features of EO mCRC.

Patients and Sample Collection
We conducted a retrospective study of individuals diagnosed with metastatic CRC at or before the age of 55 years who received clinical care at the Peking University Cancer Hospital between January 2010 and December 2018. We enrolled 330 patients who met the following criteria: (1) diagnosis of metastatic CRC at or before the age of 55 years; (2) sufficient archived FFPE tumor tissue and paired white blood cell samples for sequencing. In addition, we enrolled 110 patients with metastatic AO CRC (diagnosed after the age of 55 years) from the Burning Rock Dx database for comparative purposes. Tissue samples obtained via biopsy or resection and paired white blood cell samples were also collected from each AO CRC patient. Electronic medical records and telephone interviews were used to obtain information about demographics, family history, tumor location, histology, treatment history, and survival status. This study was approved by the Ethics Committee of the Peking University Cancer Hospital. Informed consent was obtained from each patient.

Sequencing Panel
The panel used in our study covered 520 cancer-related genes, including all targets of current standard-of-care targeted therapies, spanning 1.64 megabases (Mb) of the human genome (OncoScreen Plus, Burning Rock Biotech, Guangzhou, China). The panel included whole exons from 312 genes and critical exons, introns, and promoter regions of the remaining 208 genes, as indicated in Table S1. The 98 cancer susceptibility genes included in the germline mutation analysis are marked in bold in Table S1.

DNA Extraction
Genomic DNA was extracted from FFPE samples using the QIAamp DNA FFPE tissue kit (Qiagen, Carlsbad, CA, USA) according to the manufacturer's instructions. DNA concentrations were measured using the Qubit dsDNA assay (Life Technologies, Carlsbad, CA, USA).

Capture-Based Targeted Sequencing and Analysis
Library preparation was performed in the College of American Pathologists (CAP)-accredited/Clinical Laboratory Improvement Amendments (CLIA)-certified clinical laboratory of Burning Rock Biotech. A minimum of 50 ng of DNA was required for NGS library construction. The genes indicated in Table S1 were captured and sequenced on a NextSeq 500 sequencer (Illumina, Inc., USA) with paired-end reads.
Sequence data were analyzed using proprietary computational algorithms optimized for accurate identification of somatic and germline variants while discriminating sequencing artifacts from true positive mutations. Variants with a population frequency over 0.1% based on the ExAC, 1000 Genomes, dbSNP, and ESP6500SI-V2 databases were classified as SNPs and excluded from further analysis.
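The population-frequency filter described above can be illustrated with a minimal sketch. The analysis pipeline itself is proprietary, so this is not the authors' implementation: the record layout, the function names, and the choice to treat a variant as common when any single database exceeds the cutoff are assumptions made for illustration only.

```python
# Minimal sketch of a population-frequency filter (not the proprietary pipeline).
# Assumed record layout: each variant carries per-database allele frequencies
# expressed as fractions; databases with no entry are omitted or set to None.
POPULATION_AF_CUTOFF = 0.001  # 0.1%, the threshold stated in the text


def is_common_snp(population_afs, cutoff=POPULATION_AF_CUTOFF):
    """Return True if any database reports a frequency above the cutoff.

    Using "any database" rather than "all databases" is an assumption.
    """
    return any(af is not None and af > cutoff for af in population_afs.values())


def drop_common_snps(variants):
    """Keep only variants that no database reports above the cutoff."""
    return [v for v in variants if not is_common_snp(v["population_afs"])]


# Hypothetical variants, for illustration only.
variants = [
    {"id": "var_a", "population_afs": {"ExAC": 0.02, "1000G": 0.015}},
    {"id": "var_b", "population_afs": {"ExAC": 0.0002}},
    {"id": "var_c", "population_afs": {}},
]
kept = [v["id"] for v in drop_common_snps(variants)]
print(kept)
```

In this sketch a variant absent from every database is retained, which matches the intent of keeping rare candidate mutations for downstream review.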

Tumor Mutation Burden Estimation
Tumor mutation burden was computed as the ratio of the total number of somatic mutations detected, including synonymous mutations, to the total coding region size of the panel, using the formula below. The coding region size was 1.26 Mb for the 520-gene OncoScreen Plus panel. Copy number variations, fusions, large genomic rearrangements, and mutations occurring in the kinase domains of EGFR and ALK were excluded from the mutation count.
$$\text{Tumor mutation burden} = \frac{\text{mutation count (excluding copy number variations and fusions)}}{\text{total size of the coding region of the panel used}}$$
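As a worked illustration of this ratio, the following sketch counts eligible mutations and divides by the 1.26 Mb coding region stated above. The alteration-type labels, the `kinase_domain_egfr_alk` flag, and the example records are hypothetical and serve only to show the arithmetic.

```python
PANEL_CODING_MB = 1.26  # coding region size of the 520-gene panel, from the text

# Assumed labels for alteration classes that are excluded from the count.
EXCLUDED_TYPES = {"cnv", "fusion", "large_rearrangement"}


def tumor_mutation_burden(mutations, coding_mb=PANEL_CODING_MB):
    """Return mutations per Mb after dropping the excluded alteration classes."""
    counted = [
        m for m in mutations
        if m["type"] not in EXCLUDED_TYPES
        # Hypothetical flag marking EGFR/ALK kinase-domain mutations.
        and not m.get("kinase_domain_egfr_alk", False)
    ]
    return len(counted) / coding_mb


# Hypothetical sample: 7 SNVs, 1 indel, 1 CNV, 1 fusion, 1 EGFR kinase-domain SNV.
example_mutations = (
    [{"type": "snv"}] * 7
    + [{"type": "indel"}, {"type": "cnv"}, {"type": "fusion"}]
    + [{"type": "snv", "kinase_domain_egfr_alk": True}]
)
tmb = tumor_mutation_burden(example_mutations)
print(round(tmb, 2))
```

Here eight of the eleven alterations are counted, so the estimate is 8 / 1.26 mutations/Mb.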

Microsatellite Instability Determination
All patients underwent microsatellite instability (MSI) testing using a read-count distribution-based approach. The method used the coverage ratio of a specific set of repeat lengths as the main characteristic of each microsatellite locus and categorized a locus as unstable if the coverage ratio was less than a given threshold. The MSI status of a tumor sample was then determined from the percentage of length-instable loci, without a paired normal control. A total of 63 marker loci were chosen for categorization. For each marker locus, a read-count histogram was constructed, and the coverage ratio of the reference length set was calculated and compared with the reference threshold. A locus with a coverage ratio less than [mean - 3 × SD] of the reference ratio was considered length-instable. A tumor sample was considered MSI-H (microsatellite instability-high) if more than 40% of the marker loci were length-instable, MSS (microsatellite stable) if fewer than 15% were length-instable, and MSI-L (microsatellite instability-low) if the percentage was between 15 and 40%.
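The decision rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the laboratory's implementation: the function names are hypothetical, the per-locus reference mean and SD are taken as given inputs, and the boundary values of exactly 15% and 40% are assigned to MSI-L.

```python
def is_length_instable(coverage_ratio, ref_mean, ref_sd):
    """A locus is length-instable if its coverage ratio is below mean - 3 * SD."""
    return coverage_ratio < ref_mean - 3 * ref_sd


def msi_status(instable_flags):
    """Classify a sample from per-locus flags (True = length-instable locus)."""
    if not instable_flags:
        raise ValueError("at least one marker locus is required")
    fraction = sum(instable_flags) / len(instable_flags)
    if fraction > 0.40:
        return "MSI-H"
    if fraction < 0.15:
        return "MSS"
    return "MSI-L"  # 15% to 40%; inclusive boundaries are an assumption


# Hypothetical samples evaluated over the 63 marker loci mentioned in the text.
print(msi_status([True] * 30 + [False] * 33))  # 30/63 loci instable
print(msi_status([True] * 5 + [False] * 58))   # 5/63 loci instable
```

The first hypothetical sample has about 48% instable loci and the second about 8%, so they fall into the MSI-H and MSS categories, respectively.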

Statistical Analysis
Descriptive statistics were used to summarize the demographics, clinicopathologic characteristics, and family history. Chi-square or Fisher's exact tests were performed to compare categorical variables. Survival data were assessed by Kaplan-Meier estimates, and comparisons were made with log-rank tests. A p value < 0.05 was considered statistically significant. All analyses were done using R software, version 3.6.1.
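For readers who wish to reproduce a comparison of categorical variables, a two-sided Fisher's exact test for a 2 × 2 table can be sketched in plain Python. The analyses in this study were run in R, so this is only an illustrative equivalent. The example counts are back-calculated from the reported WNT pathway percentages (69.4% of 330 EO patients, about 229, and 100% of 110 AO patients); this is an assumption, and whether Fisher's exact or the chi-square test was applied to that comparison is not stated.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Assumed counts: WNT pathway altered / not altered in EO (229 / 101) and AO (110 / 0).
p_wnt = fisher_exact_two_sided(229, 101, 110, 0)
print(p_wnt < 0.0001)
```

The exact enumeration is practical here because the cohort sizes are small; for larger tables a chi-square approximation would be the usual choice.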

Patient Characteristics
From 2010 to 2018, 330 EO mCRC patients diagnosed at or before the age of 55 and 110 patients with AO mCRC (diagnosed after the age of 55) were enrolled in this study. Table S2 summarizes the demographics and clinicopathological characteristics of early onset patients. The median age of the EO cohort was 45 years (range: 13-55) (Figure S1).

Germline Genetic Feature of EO mCRC Patients
Capture-based targeted sequencing was performed by a CLIA-certified lab to interrogate the germline mutation landscape of EO mCRC patients, with 98 cancer susceptibility genes analyzed. Among the 330 EO mCRC patients, 33 pathogenic or likely pathogenic germline mutations were detected in 31 patients (31/330, 9.4%), and 974 variants of uncertain significance (VUS) in 91 genes were found (Figure S2). The germline mutation frequencies of patients younger than 35 years, 35-50 years, and 50-55 years were 22.2%, 7.0%, and 7.9%, respectively. Among the 31 patients, 16 were MSS, and the remaining 15 were MSI-H. Sixteen patients (4.8% of the entire cohort, 52% of patients with pathogenic or likely pathogenic germline mutations) had mutations in one of the DNA mismatch repair (MMR) genes (six in MLH1, three in MSH2, two in PMS2, and five in MSH6) and were therefore diagnosed with Lynch syndrome (LS). One LS patient carrying an MSH6 c.3416del (p.G1139fs) germline mutation was MSS. Of the 16 patients with LS, six had at least one first-degree relative with cancer, and four had at least one second-degree relative with cancer. In addition to the MMR genes, other frequently mutated susceptibility genes included, but were not limited to, MUTYH (n = 4, 1.2% of the entire cohort, 12% of patients with germline mutations), RAD50 (n = 3, 0.9% of the entire cohort, 9% of patients with germline mutations), TP53 (n = 1, 0.3% of the entire cohort, 3% of patients with germline mutations), and FANCL (n = 1, 0.3% of the entire cohort, 3% of patients with germline mutations) (Table 1). No individual in this cohort carried pathogenic or likely pathogenic germline mutations in APC or POLD1/POLE. We next compared the germline mutation spectrum of Chinese EO mCRC patients with a published Western cohort consisting of 430 Caucasian patients diagnosed with CRC before the age of 50 (11).
Two hundred and twenty-six patients from our cohort who were diagnosed at or before the age of 50 were included in this comparative analysis. The prevalence of hereditary syndromes was 11.1% (n = 25) in our cohort, which was significantly lower than in the Western cohort (18%) (p = 0.018). Among hereditary cancer patients, the two cohorts had a comparable percentage of LS patients (64% for our cohort versus 71% for the Western cohort, p = 0.61). The distribution of germline mutations in MMR genes was also similar, except for MSH6 (6% for the Western cohort versus 22% for the Chinese cohort, p = 0.043). In addition, germline APC mutation, which was not observed in our cohort, accounted for 13% of the Western EO cohort patients. Other high- and moderate-penetrance genes that were not observed in our cohort, including BRCA1, SMAD4, and CHEK2, were mutated in 16% of the Western hereditary EO CRC patients.

Somatic Mutation Spectrum of Early Onset Colorectal Cancer Patients
To investigate the somatic mutation landscape of EO mCRC, DNA extracted from 330 FFPE and paired WBC samples was subjected to next generation sequencing with a panel of 520 cancer-related genes and to MSI determination. Collectively, we identified 7,096 mutations spanning 463 genes (21.5 variants per sample on average), including 4,945 single nucleotide variations, 66 insertions or deletions, 453 copy-number amplifications, and 10 translocations (Figure 1). The median tumor mutational burden (TMB) was 6.7 (range: 0.8-563.5) mutations/Mb across all tumor samples. Thirty-three (10.0%) hypermutated samples (TMB ≥ 20 mutations/Mb) were identified in our cohort, including 27 (8.2%) MSI-H cases and five (1.5%) MSS cases with POLE mutations (Figure 2). Of the five MSS cases with POLE mutations, three carried known POLE exonuclease domain mutations (S459F, P286R, and a V411L/D275G dual mutation) (Table S3). One case with a TMB of 442 mutations/Mb had a POLE S459Y exonuclease domain mutation. This patient also harbored a heterozygous pathogenic MSH6 germline mutation but was MSS. The remaining case harbored a non-hotspot POLE R821C mutation (TMB 59.5 mutations/Mb). Twenty-nine recurrently mutated genes (mutation frequency ≥ 5%) were identified in the EO mCRC cohort, including TP53 (80%), APC (60%), KRAS (51%), LRP1B (20%), SMAD4 (19%), and FAT3 (19%) (Figure 1). Apart from the TMB-H cases, actionable alterations that may confer sensitivity or resistance to specific targeted therapies were also identified, including the BRAF V600E mutation in 8.8% of tumors, ERBB2 amplification or mutation in 7.6%, and MET amplification in 1.5%. PTEN and PIK3CA oncogenic mutations were identified in 7.0% and 13.3% of patients, respectively. Targetable fusions were extremely rare; the only two receptor tyrosine kinase fusions were EML4-ALK and GOPC-ROS1.
October 2020 | Volume 10 | Article 568911
It is worth noting that all BRAF V600E mutant tumors were microsatellite stable in the EO mCRC cohort, in contrast to previous studies. Moreover, seven patients (2.1%) harbored class 3 BRAF mutations with impaired kinase activity.
To characterize the genomic heterogeneity of EO CRC patients, we selected 47 genes related to six critical pathways in CRC and analyzed the alteration rates of these six driver pathways in EO CRC patients. Mutations were most frequently found in the p53 pathway (85.2%), followed by the MAPK pathway (69.7%), the WNT pathway (69.4%), the TGF-β pathway (30.0%), the PI3K pathway (27.6%), and RTK genes (27.6%).
We further analyzed the mutation characteristics of the EO and AO mCRC patients at the pathway level. The mutation profile was similar between the EO and AO CRC cohorts except for the WNT pathway (Figure 3B). WNT pathway alteration was enriched in AO CRC patients compared with EO CRC patients (100% versus 69.4%, p < 0.0001). Alteration of the p53 pathway was found in 85.2% of EO CRC patients and 81.2% of AO CRC patients (p = 0.499), which was more common than reported in previous studies (64-69.0%) (13,15). There were no significant differences in the mutation frequencies of the MAPK, PI3K, or TGF-β pathways between EO and AO CRC patients. [...] (Figure 4E). Within MAPK pathway genes, the BRAF V600E mutation was correlated with poorer overall survival compared with BRAF wild type and BRAF non-V600E mutants (13.3 versus 24.7 months, p = 0.0017) (Figure 4F). Neither the p53 pathway nor the PI3K pathway was associated with overall survival in EO mCRC patients.

DISCUSSION
Herein, we characterized the germline and somatic mutations of early onset metastatic colorectal cancer patients. In this clinic-based cohort of 330 mCRC patients diagnosed at or before the age of 55, multigene panel sequencing showed that 9.4% carried at least one pathogenic or likely pathogenic germline mutation associated with tumor predisposition. Germline mutations in cancer susceptibility genes were less common in our cohort than in previous studies of EO CRC, which may have partly resulted from racial differences and the late-stage cases chosen for [...]. Moreover, 11 of the 16 LS patients failed to fulfill the Amsterdam II criteria, which strongly suggests that clinical phenotypes and family history are far from sufficient for identifying individuals who need genetic analysis.
The spectrum of tumor susceptibility mutations in our study differed from that of previous studies. Moderate-penetrance monoallelic MUTYH mutations and APC germline mutations were relatively rare in Chinese EO mCRC. We found three patients with rare compound heterozygotes composed of monoallelic mutations of MUTYH c.53C>T and c.74G>A, which had previously been reported in Japanese polyposis patients and Chinese colorectal cancer patients (16)(17)(18). Chen et al. reported that the frequency of this heterozygous haplotype variant allele was statistically higher in CRC patients than in healthy controls (4.35 versus 0.87%, p = 0.02) and concluded that this MUTYH variation is likely to be associated with colorectal cancer susceptibility (17). However, the frequency of monoallelic mutations of MUTYH c.53C>T and c.74G>A in our EO mCRC cohort (0.9%) was very close to that of the healthy controls in Chen's study (0.87%). Therefore, we classified monoallelic mutations of MUTYH c.53C>T and c.74G>A as variants of uncertain significance in this study. Many of the genes identified in our study, such as NBN, ATR, and RAD50, have not been demonstrated to be associated with colorectal cancer risk. Notably, the majority of these patients lacked the clinical phenotypes of the corresponding hereditary syndromes. Multigene panel testing allowed identification of potential susceptibility genes, but additional evidence is needed to establish the relationship between these rare mutations and colorectal cancer.
It is well acknowledged that colorectal cancer is a heterogeneous disease with varied underlying molecular mechanisms (13). Several different classification systems have been established to better understand the biological characteristics of colorectal cancer (19). Another important aspect of our study was that we profiled the molecular features of EO mCRC at both the single gene and pathway levels and evaluated their association with prognosis. The somatic mutation landscapes of early onset and average onset mCRC were comparable on the whole. Several critical differences, however, were observed in this study. EO mCRC exhibited more TMB-H cases than AO mCRC. The main causes of TMB-H in colorectal cancer include POLE/POLD1 deficiencies and MSI-H resulting from MMR mutation or MLH1 promoter hypermethylation (20). In our EO CRC cohort, 10.0% of patients had a TMB ≥ 20 mutations/Mb, among whom 27 exhibited MSI-H. The POLE gene was mutated in 2.31% of MSS EO mCRC patients, which was close to the mutation frequency in AO mCRC (Table S3). The exonuclease domain mutation S459Y and a rare non-hotspot mutation R821C were found to be related to hypermutation in our study. Several previously reported cases indicate that POLE R821C is associated with ultra-mutation (21). In previously published studies, the proportion of MSI-H in early onset CRC ranged from 15 to 41.0% (8). Moreover, POLE/POLD1 mutation was identified in 0.65-12.3% of colorectal cancer patients and was reported to occur more frequently in EO CRC patients (13,22,23). The enrichment of MSI and POLE mutation cases indicates that more EO mCRC patients might benefit from immune checkpoint inhibitors, potentially improving the unfavorable prognosis of EO mCRC (24,25). In our pathway analysis, the WNT pathway was the only differentially mutated pathway between EO mCRC and AO mCRC cases.
The WNT pathway has long been considered a critical driver pathway for the majority of CRCs. Our study found that EO mCRC patients had a significantly decreased mutation rate in WNT pathway genes, and the presence of mutation in the WNT pathway was associated with better overall survival. APC is one of the key members of the WNT pathway and was previously reported to be mutated in approximately eighty percent of colorectal cancers (13). There was evidence showing that APC mutation was a positive prognostic factor in colorectal cancer, especially in proximal tumors, which was consistent with our results (15,26,27). The deregulation of the WNT pathway in EO mCRC patients might account for the unfavorable prognosis to some extent. Thus, more intensive therapies may be applied to EO mCRC patients without WNT pathway mutations to improve their prognosis.
The TGF-β pathway was mutated in thirty percent of EO mCRC cases in our study. TGF-β signaling plays a key role in tumorigenesis by modulating cell growth, differentiation, and apoptosis, and has an important impact on the tumor microenvironment (28). In the consensus molecular subtype (CMS) classification system of colorectal cancer, TGF-β signaling activation was enriched in CMS4 tumors that show upregulation of genes associated with mesenchymal transition or angiogenesis (29). CMS4 tumors also displayed worse overall survival and relapse-free survival in comparison with the CMS1-3 subtypes of colorectal cancer (29). TGF-β signaling in the tumor microenvironment promotes T-reg cell infiltration and the activation of cancer-associated fibroblasts, which can accelerate tumor progression and impair anti-tumor immunity (30)(31)(32). Our study demonstrated that alterations in the TGF-β pathway can contribute to aggressive tumor biology and unfavorable outcomes in EO mCRC. Therefore, effective inhibition of the TGF-β pathway could be a pivotal strategy in metastatic EO CRC treatment.
MAPK pathway mutation is one of the most vital drivers of colorectal cancer. Although EO mCRC and AO mCRC patients had comparable mutation rates in the MAPK pathway, the BRAF V600E mutation was unexpectedly significantly more common in the EO mCRC cohort. The BRAF V600E mutation has been reported in approximately 5-7% of late-stage colorectal cancers, and several studies have shown that its frequency increases significantly with patient age (33)(34)(35). In Giulia et al.'s study, no BRAF V600E mutations were detected in thirty-three EO CRC patients, which differed from our results (36). Furthermore, the BRAF V600E mutation is associated with MLH1 promoter methylation, which can lead to microsatellite instability (37). In EO mCRC patients, MSI more frequently resulted from Lynch syndrome, which is usually mutually exclusive with the BRAF V600E mutation. This could explain why MSI-H did not coexist with BRAF V600E in the tumors of our EO mCRC patients. Our data suggest that BRAF V600E could be an important driver mutation in Chinese MSS EO mCRC and could distinguish a particular subtype of EO mCRC with poor prognosis.
There were several limitations in our study. As a single-center and retrospective study, potential regional and selection bias may have existed in the patient population. Considering the low frequency of hereditary cancer syndrome, a larger sample size is required for accurately profiling the germline spectrum of early onset metastatic colorectal cancer. Using multigene panel testing, we found several novel germline mutations that have been rarely reported in hereditary colorectal cancer (e.g., ERCC4, SDHA, and XRCC2), but our limited sample size hindered us from determining their penetrance and relationship with colorectal cancer. Although we found various actionable targets in our EO mCRC cohort, the clinical data on corresponding therapies are unavailable to verify the predictive power and clinical significance of these alterations.
In conclusion, 31 of 330 (9.4%) metastatic colorectal cancer patients at or younger than 55 years old carried pathogenic or likely pathogenic cancer susceptibility gene mutations. There were notable differences between the mutation landscape of early onset colorectal cancer and average onset colorectal cancer, which may impact prognosis and response to anti-tumor treatments of early onset colorectal cancer. Identifying hereditary cancer syndrome and therapeutic mutations with next generation sequencing has great practical value for guiding anti-tumor therapies and specialized surveillance of high-risk family members.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: The National Omics Data Encyclopedia, project ID: OEP001101 (https://www.biosino.org/node/project/detail/OEP001101).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of the Peking University Cancer Hospital. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
XW, TX, and YZ designed the project. JZ, CQ, DL, ZW, YL, CJ, and JL participated in patient selection and data collection. TH, LZ, and HH-Z carried out next generation sequencing and analyzed and interpreted the data. XL performed statistical analysis. TX, XW, and LS wrote the manuscript. All authors contributed to the article and approved the submitted version.