Somatic Mutation Profiles Revealed by Next Generation Sequencing (NGS) in 39 Chinese Hepatocellular Carcinoma Patients

The features and significance of somatic mutation profiles in hepatocellular carcinoma (HCC) have not been completely elucidated to date. In this study, 39 tumor specimens from HCC patients were collected for gene variation analysis by next-generation sequencing (NGS), and a correlation analysis between mutated genes and clinical characteristics was also conducted. The results were compared with genome data from the cBioPortal database. Our study found that T > G/A > C transversions (Tv) and C > T/G > A transitions (Ti) were dominant. The sequence variations of TP53, MUC16, MUC12, MUC4 and others, and the copy number variations (CNVs) of FGF3, TERT, and SOX2 were found to be more frequent in our cohort than in cBioPortal datasets, and they were highly enriched in pathways in cancer and participated in complex biological regulatory processes. The TP53 mutation was the key mutation (76.9%, 30/39); among TP53 variants, the most common amino acid alteration was p.R249S (23.5%) and the most common mutation type was missense mutation (82.3%). Furthermore, TP53 had more co-mutations with MUC17, NBPF10, and AHNAK2. However, there were no significant differences in clinical characteristics between HCC patients with mutant TP53 and those with wild-type TP53, nor in overall survival between patients treated with precision medication guided by NGS and those treated with empirical medication (logrank p = 0.181). Therefore, the role of NGS in guiding personalized targeted therapy may be limited. Multi-center, large sample, prospective studies are needed to further verify these results.


INTRODUCTION
Hepatocellular carcinoma (HCC) is now the fourth most common cause of cancer-related deaths worldwide. Approximately 78,000 patients died from HCC in 2018 (Bray et al., 2018; Forner et al., 2018). Recent next-generation sequencing (NGS)-based studies have uncovered the genetic landscape of HCC (Totoki et al., 2014; Schulze et al., 2015; Cancer Genome Atlas Research, 2017), including driver mutations in TP53, CTNNB1, TERT promoter, and other key gene loci.
However, how genetic alterations drive the occurrence and development of HCC remains largely unknown.
As a high-throughput sequencing technique, NGS can perform multiple types of analyses on thousands of genes. The main purposes of NGS are to find the main driver gene in patients with advanced cancer so that targeted therapy can be carried out, and to discover the molecular mutation targets underlying drug resistance (Deng et al., 2019). An increasing number of clinical studies have shown that comprehensive characterization of genomic alterations has clinical benefits for cancer patients (Takeda et al., 2015; Staaf et al., 2019). However, there are still many unknown pathogenic variants waiting to be discovered. Identification of these alterations in cancer patients is the first step toward providing therapeutic targets.
Herein, we characterized differences in the genomic profiles between HCC patients in our cohort and HCC patients in the cBio Cancer Genomics Portal (cBioPortal, http://cBioPortal.org) database using six datasets (Gao et al., 2013). We also explored the correlations between high-frequency mutated genes and clinical characteristics of patients, and compared the efficacy between precision medication guided by NGS and empirical medication.

Patients and Tissue Acquisition
A total of 39 HCC samples were collected for targeted panel or whole-exome sequencing between 2014 and 2019 at the First Affiliated Hospital of Sun Yat-sen University. After obtaining the approval of the Ethics Committee, written informed consent was obtained from all patients. The study inclusion criteria were as follows: 1) age at diagnosis was more than 18 years; 2) HCC samples were confirmed by pathological diagnosis; 3) patients underwent hepatectomy as treatment. The exclusion criteria included the following: 1) patients having other types of malignant tumors in addition to HCC; 2) severe organ damage, autoimmune diseases, and mental illness. In addition, patients were grouped according to the Barcelona Clinic Liver Cancer (BCLC) staging system (Forner et al., 2018). Tumor pathological grade was based on the Edmondson-Steiner Grading System (Edmondson and Steiner, 1954).
Tumor samples were collected immediately following surgical resection, and then stored in pre-cooled RPMI-1640 medium with 5% FBS and 1 × Penicillin/Streptomycin, or in Histidine-Tryptophan-Ketoglutarate tissue preservation solution if the estimated shipping time was longer than 1 h. Formalin-fixed paraffin-embedded (FFPE) sections of surgical tumor samples were also sent for analysis when fresh tumor samples were unavailable. Samples were anonymized for further analysis.
After discharge, patients were seen in the clinic monthly for the first 6 months, and then every 3 months, as described in our previous study (Ke et al., 2020). Telephonic follow-up was also conducted every 6 months. The diagnosis of tumor recurrence was made based on clinical examination, laboratory data, and radiological examinations (such as MRI, CT, and positron emission tomography [PET] scan).

Targeted Panel Sequencing, Whole-Exome Sequencing and cBioPortal Database Analysis
The panel of targeted deep sequencing comprised 4,557 exons of 365 tumor-associated genes, and 45 introns from 25 genes where frequent gene fusions could be captured in cancer (Supplementary Figure S1). All targeted panel sequencing assays were performed at the 3D Med Clinical Laboratory Co., Ltd. (Shanghai). The detailed method used to perform targeted deep sequencing has been described elsewhere (Feng et al., 2020). All whole-exome sequencing assays were performed at the GenomiCare Medical Laboratory Co., Ltd. (Shanghai). The process of whole-exome sequencing included the following: 1) exome capture, library construction, and sequencing; 2) sequence mapping and somatic variant detection; and 3) detection of copy-number alterations, which have been described in detail elsewhere (Tan et al., 2016; Yang et al., 2019).
We further used the online analysis tool of the cBioPortal database to explore the differences in mutation profiles between our cohort and the cBioPortal datasets. The correlations between high-frequency mutated genes and clinical characteristics were also analyzed.

Statistical Analysis
Statistical analyses for clinical data and mutation profiles were performed using SPSS Statistical software, version 25.0 (IBM, Chicago, Illinois, United States) and Excel 2019. Unordered categorical variables were analyzed by Fisher's exact or Chi-Square test, and ordinal or continuous variables were analyzed by non-parametric Mann-Whitney U test. Correlations were analyzed to identify clinical characteristics related to mutation profiles. The mutation frequency of a gene was calculated as follows: mutation frequency = (number of patients with the gene mutation/total number of patients) × 100%. Overall survival (OS) was defined as the time from the date of surgery until death or last follow-up, and disease-free survival (DFS) was defined as the time from the date of surgery to initial tumor recurrence, metastasis, or death. The last follow-up was conducted in August 2021. The survival analysis was conducted using the Kaplan-Meier method and compared via log-rank test. A two-sided value of p < 0.05 was considered to be statistically significant.
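As an illustration of the calculations described above, the following minimal Python sketch computes a mutation frequency and a two-sided Fisher's exact p-value for a 2 × 2 table. It is not the SPSS or Excel workflow used in this study; the function names are hypothetical, and only the TP53 count (30 of 39 patients) is taken from this article.

```python
from math import comb


def mutation_frequency(n_mutated: int, n_total: int) -> float:
    """Percentage of patients who carry a mutation in a given gene."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_mutated / n_total * 100


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # A small relative tolerance guards against floating-point ties.
    total = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# TP53 in this cohort: 30 mutated patients out of 39.
tp53_frequency = mutation_frequency(30, 39)
```

Here `round(tp53_frequency, 1)` gives 76.9, matching the TP53 mutation rate reported for this cohort. The Fisher function can be applied to any 2 × 2 cross-tabulation of mutation status against a binary clinical characteristic.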
To understand the biological characteristics of the mutated genes, we performed enrichment analysis, which included Gene Ontology (GO) function and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. KEGG items revealed that mutated genes were highly enriched in multiple cancer pathways (Figure 3A). With regard to HCC, the cancer we focused on, its enrichment ratio was 11.59% (q = 0.002). GO items demonstrated that mutated genes were mainly involved in glycoprotein metabolic and biosynthetic processes in biological processes (Figure 3B); extracellular matrix and Golgi lumen in cellular components (Figure 3C); and extracellular matrix structural constituents and phosphatase binding in molecular functions (Figure 3D). Figure 3E shows that some variant genes were related to complex cancer pathways. These genes were mainly involved in the JAK/STAT, PI3K/AKT, WNT, and MAPK/ERK pathways, and could influence each other (e.g., in terms of activation, inhibition, and phosphorylation), which could lead to evasion of apoptosis, cell proliferation, sustained angiogenesis, etc., and in turn affect the occurrence and development of cancers. According to the data obtained from targeted panel and whole-exome sequencing reports, 59.0% (23/39) of patients had at least one clinically actionable somatic mutation for which clinical treatments could be prescribed using precision medicine (Supplementary Tables S2, S4).
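The enrichment tool and the multiple-testing correction used for the q values above are not specified in this section. As a hedged illustration only, over-representation of a gene set in a pathway is commonly assessed with an upper-tail hypergeometric test, and q values are often obtained with the Benjamini-Hochberg procedure; the sketch below assumes both, and all function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background: int, n_pathway: int,
                       n_selected: int, n_overlap: int) -> float:
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_selected:   mutated genes submitted for enrichment
    n_overlap:    mutated genes that fall in the pathway
    """
    upper = min(n_pathway, n_selected)
    favourable = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_background, n_selected)


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values (q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values
```

In such a workflow, one p-value would be computed per KEGG or GO term and the whole list passed to `benjamini_hochberg`, so that each reported q value reflects the number of terms tested.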
We further used cBioPortal to analyze the variant types of the top 4 mutated genes in our cohort. We found 209 missense mutations, 93 truncating mutations, and 5 in-frame mutations in TP53; 136 missense mutations, 15 truncating mutations, and 1 in-frame mutation in MUC16; 15 missense mutations and 2 truncating mutations in MUC12; and 50 missense mutations, 2 truncating mutations, and 2 in-frame mutations in MUC4 (Supplementary Figure S4

Correlation Analyses Between Gene Mutation and Clinical Characteristics
We used cBioPortal HCC cohorts to analyze the correlations between TP53 mutation and clinical characteristics (Supplementary Table S7) and found that only neoplasm histologic grade (q = 0.008) and race category (q = 0.003) had a significant association with TP53 mutation. With regard to OS and DFS, the survival differences between the TP53 mutation group and the wild-type group were significant in the cBioPortal dataset (OS: logrank p = 0.018; DFS: logrank p = 0.005) (Supplementary Figure S5). However, these results differed from those of our study, which showed no significant differences in survival outcomes (OS, logrank p = 0.084; DFS, logrank p = 0.201) or in other clinical characteristics between the TP53 mutation group and the wild-type group. Cirrhosis tended to occur in the FGF3 and MUC4 mutation groups (p = 0.019 and 0.011, respectively) (Table 4).
No statistically significant differences in OS were observed between precision medication guided by NGS and empirical medication (logrank p = 0.181), or between targeted therapy based on recommended drugs and targeted therapy based on clinical experience (logrank p = 0.376) (Figure 4). However, immunotherapy combined with targeted therapy seemed to result in longer OS, although the difference was not statistically significant.
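The logrank p values above were obtained with standard statistical software. For readers who wish to see how such a comparison works, the following is a minimal sketch of a two-group log-rank test in plain Python. It is an illustrative implementation under the usual hypergeometric variance formula, not the procedure run in this study, and the function name and input layout are assumptions.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi_square, p_value).

    times_*:  follow-up time for each patient
    events_*: 1 if the event (e.g., death) was observed, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still under observation just before time t.
        n = sum(1 for u, _, _ in data if u >= t)
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        # Events occurring exactly at time t, overall and in group A.
        d = sum(1 for u, e, _ in data if u == t and e)
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi_square / 2))
    return chi_square, p_value
```

With small groups, as in a 39-patient cohort split by treatment strategy, the variance term is small and even visibly separated Kaplan-Meier curves can yield a non-significant p value.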

DISCUSSION
In this study, we used NGS to detect multi-gene variations in HCC patients, analyzed the correlations with clinical characteristics, and compared our findings with those of the cBioPortal database.
First, we described the overall situation of somatic mutations. C > A/G > T Tv and T > C/A > G Ti were moderate in our study, and were shared by other HCC cohorts (Totoki et al., 2014; Schulze et al., 2015; Fujimoto et al., 2016). T > G/A > C Tv and C > T/G > A Ti patterns were dominant, but the proportion of T > A/A > T Tv and C > G/G > C Tv was the lowest, implying that T > G/A > C Tv and C > T/G > A Ti may have contributed to hypermutations in our cohort. These results differed from those of two previous studies, in which T > A/A > T Tv was dominant (Zhou et al., 2019), and from the Cancer Genome Atlas (TCGA) dataset, in which T > G/A > C Tv showed the lowest occurrence (Supplementary Figure S6); these differences may be due to the complexity of the genome, individual differences, and the small sample size. In addition, a relatively high ratio of Ti/Tv was found, in agreement with the results of previous HCC sequencing studies (Guichard et al., 2012; Huang et al., 2012) and studies of other cancers (Moore et al., 2003; Hainaut and Pfeifer, 2016). The high ratio of Ti/Tv in our study may therefore be attributable to the biochemical structure of nucleotides and the chemical characteristics of complementary base pairing (Taylor et al., 2006; Massey, 2015; Stoltzfus and Norris, 2016), which could help researchers gain a deeper understanding of the patterns and strengths of molecular system development and HCC evolution. Second, we analyzed the main somatic gene variations and found that most of the high-frequency mutations in our cohort were relatively low-frequency mutations in cBioPortal datasets. The top 4 mutated genes (TP53, MUC16, MUC12, and MUC4) were dominated by missense mutations, which was similar to the cBioPortal data. Further, TP53 mutations were the most frequent mutation in both our cohort (p.R249S was the most common amino acid alteration) and cBioPortal datasets, although there was a significant difference in the mutation rate (76.9% vs. 29.0%, respectively).
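The substitution classes discussed above pair each change with its reverse complement (for example, G > A is counted together with C > T). A minimal Python sketch of this folding and of the Ti/Tv ratio is given below; the function names and the input format (pairs of reference and alternate bases) are assumptions made for illustration and do not describe the variant-calling pipelines used in this study.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# After folding onto a pyrimidine reference, only these two are transitions.
TRANSITIONS = {"C>T", "T>C"}


def substitution_class(ref: str, alt: str) -> str:
    """Fold a single-nucleotide change onto one of six classes.

    Changes with a purine reference (A or G) are replaced by their
    reverse-complement equivalent, e.g. G>A becomes C>T.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-nucleotide substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def ti_tv_summary(snvs):
    """Return (class counts, transitions, transversions, Ti/Tv ratio)."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in snvs)
    ti = sum(n for cls, n in counts.items() if cls in TRANSITIONS)
    tv = sum(counts.values()) - ti
    ratio = ti / tv if tv else float("inf")
    return counts, ti, tv, ratio
```

Under this folding, the dominant classes reported here correspond to the keys `T>G` (T > G/A > C Tv) and `C>T` (C > T/G > A Ti).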
In addition, the most common changes in CNV were FGF3, TERT, and SOX2, and their variant rates were all higher than those reported in cBioPortal-HCC patients (15.4% vs. 5.0%; 12.8% vs. 4.0%; 12.8% vs. 1.1%, respectively). We speculated that because of ethnic and individual differences, the genetic profile characteristics of HCC patients in China may be different from those in other countries (the cBioPortal database comprised three datasets from the United States, one from Europe, one from Korea, and one from Japan). Accordingly, large cohort studies are needed to verify these results.
We further explored whether mutant genes were related to clinical characteristics. TP53 mutation had significant correlations with histological grade, race category, OS, and DFS in the cBioPortal database. However, our results suggested that there was little correlation between gene variations and clinical characteristics, except that cirrhosis tended to occur in the FGF3 and MUC4 mutation groups. We further found that the effect of treatments guided by NGS may be limited. There may be several reasons for this difference. First, individual differences, racial disparities, and sample sizes could have affected the results. Next, gene mutations (e.g., nonsense mutation) may not affect protein expression, which plays a significant role in performing life functions. Further, co-occurring genetic alterations could alter the biological characteristics of tumors and affect the prognosis of patients (Deng et al., 2019), meaning that different genetic mutations may affect each other. Furthermore, enrichment analysis showed that the mutated genes were involved in complex cancer signaling pathways (e.g., PI3K/AKT, WNT, and JAK/STAT pathways), biological processes (e.g., glycoprotein metabolic process, protein glycosylation, and activation of innate immune response), cellular components (e.g., extracellular matrix, Golgi lumen, and nuclear chromosome part), and molecular functions (e.g., extracellular matrix structural constituent, phosphatase binding, and protease binding). When targeted drugs act on HCC cells, tumor cells can change the expression of related proteins, adjust the connection of signal pathways, and change the microenvironment to evade targeted drug attacks. When a pathway is inhibited by targeted drugs, HCC cells can strengthen the signal transduction of other pathways by compensation, thereby re-promoting their own proliferation and invasion, leading to the failure of targeted therapy (Mir et al., 2017).
Some targeted drugs can inhibit the angiogenesis of HCC tissues, but a continuous anti-angiogenesis effect can cause tumor starvation and hypoxia, promoting the proliferation of resistant HCC cells that adapt to hypoxia and lack of nutrients (Mendez-Blanco et al., 2018). Thus, intervention in a single signaling pathway may be ineffective, and the negative feedback may result in the development of drug resistance. Accordingly, the use of several molecularly targeted agents in combination is an appealing way to counteract resistance. Finally, the lack of statistically significant differences may also be caused by the relatively small sample size. Multi-center, large sample, prospective studies are needed to further verify these results. The absence of therapeutic targets in many patients suggested that HCC is not completely caused by mutations, or that there are no approved drugs targeting these mutations. Moreover, targeted drugs may be ineffective. SHIVA, a randomized trial conducted in France, found that there were no differences between NGS-guided treatment and conventional treatment in terms of PFS and OS (Le Tourneau et al., 2015). In addition, tumor mutation burden (TMB) can also fail to predict immune checkpoint blockade response (McGrail et al., 2018; McGrail et al., 2021). Therefore, attention should be paid to the out-of-range use of NGS for selecting targeted drugs.
Further, the results of gene sequencing may vary considerably. For instance, different institutions may provide different results for gene sequencing, which may result from discrepancies in sequencing principles, sequencing systems, bioinformatics algorithms, etc. Problems in the gene sequencing process (such as hardware, software, samples, and quality control) can also lead to false negatives or false positives (Xuan et al., 2013; Bean et al., 2020). Moreover, different understandings of genes or treatments with potential clinical benefits may lead to different interpretations of the same test results (Rehm et al., 2013). Most institutions only rely on public databases to interpret data and recommend targeted drugs, but they fail to conduct individualized analysis based on patient-specific conditions. Therefore, some treatments, which are based on clinical experience rather than gene sequencing, may also be effective. This phenomenon can explain why precision medication guided by NGS was not superior to empirical medication in terms of OS in our study. It is worth discussing whether better results can be obtained with more gene sequencing. If gene sequencing can only help a small number of patients, the incremental cost will be high when it is promoted. In addition, the results of gene sequencing could be useless for treatment if they are not sufficiently correlated with important clinical data (such as tumor size, family history, and drug use). Therapies based only on gene signaling pathway theories and scant literature evidence are unlikely to have any positive effects. Therefore, gene sequencing may not be translated into improved patient outcomes and the detection of therapeutic gene mutations could be far from having a true clinical benefit.
Some studies have reported that patients achieved good curative effects by implementing targeted therapy based on gene sequencing, but the sample size, methodology, and research design were not rigorous and the effective rate was also not mentioned in these studies (Yu et al., 2018; Sun et al., 2020). Even for programmed death 1 (PD-1) treatment, the effective rate was only 17%-20% (El-Khoueiry et al., 2017; Zhu et al., 2018; Lee et al., 2020). Nevertheless, targeted therapy combined with immunotherapy could improve efficacy, not only in our results but also in other studies (Finn et al., 2020; Xu et al., 2021). Therefore, the choice of gene sequencing should be tailored to each patient based on individual differences and genetic polymorphisms. Molecular biology experts, pathologists, oncologists, bioinformatics experts, and immunology experts should work together to find the best-matched therapeutic drugs and conduct cutting-edge clinical trials for each mutation site so as to provide a comprehensive interpretation of the genetic sequencing report for cancer patients. At the same time, researchers should perform reasonable clinical research, strictly define the outcome of clinical benefit, and prospectively evaluate the efficacy of targeted drugs under the guidance of gene sequencing.
Our study has several strengths. First, we described the somatic mutation profiles and identified the high-frequency mutated genes in 39 Chinese HCC patients. Second, similarities and differences were revealed between our HCC cohort and cBioPortal-HCC patients with regard to genomic profiling, especially those genes that were relatively low-frequency in the cBioPortal database but commonly mutated in our cohort. Third, the correlations between gene mutation and clinical characteristics were also analyzed, and their limited value for guiding clinical work was indicated. However, there are several limitations to our study. First, the sample size of the group was small. Accordingly, large umbrella trials of personalized precision therapy are needed to confirm our findings. Second, we did not use additional omics methods (such as transcriptomics, proteomics, and metabolomics) or cell- and animal-based experiments to further verify the results. Third, the combination of the two sequencing methods may be confusing. For reasons of timeliness, we initially used targeted panel sequencing, and later adopted whole-exome sequencing for a larger genome screen. We combined both to expand the sample size so that the data could be fully utilized. In addition, samples are also being accumulated in our center to further verify our research results. Despite these limitations, this study reflected real-world clinical practice as it related to personalized targeted therapy guided by NGS in patients.
In conclusion, the characteristic somatic mutation profiles in 39 Chinese HCC patients were described in this study. Further, we conclude that the role of NGS in guiding treatment may be limited.