Multi-Omics Integrative Analysis Uncovers Molecular Subtypes and mRNAs as Therapeutic Targets for Liver Cancer

Objective: This study aimed to systematically analyze molecular subtypes and therapeutic targets of liver cancer using integrated multi-omics analysis. Methods: DNA copy number variations (CNVs), simple nucleotide variations (SNVs), methylation, transcriptome as well as corresponding clinical information for liver carcinoma were retrieved from The Cancer Genome Atlas (TCGA). Multi-omics analysis was performed to identify molecular subtypes of liver cancer via integrating CNV, methylation as well as transcriptome data. Immune scores of two molecular subtypes were estimated using tumor immune estimation resource (TIMER) tool. Key mRNAs were screened and prognosis analysis was performed, which were validated using RT-qPCR. Furthermore, mutation spectra were analyzed in the different subtypes. Results: Two molecular subtypes (iC1 and iC2) were conducted for liver cancer. Compared with the iC2 subtype, the iC1 subtype had a worse prognosis and a higher immune score. Two key mRNAs (ANXA2 and CHAF1B) were significantly related to liver cancer patients' prognosis, which were both up-regulated in liver cancer tissues in comparison to normal tissues. Seventeen genes with p < 0.01 differed significantly for SNV loci between iC1 and iC2 subtypes. Conclusion: Our integrated multi-omics analyses provided new insights into the molecular subtypes of liver cancer, helping to identify novel mRNAs as therapeutic targets and uncover the mechanisms of liver cancer.


INTRODUCTION
Liver cancer is the fifth largest malignant tumor and the second leading cause of cancer-related deaths worldwide (1,2). It was estimated that 42,220 new cases and 30,200 death cases occurred in the United States in 2018 (3). The mortality of liver cancer accounts for about 6% of death cases of cancers in men and 3% of death cases in women (3). Most patients have advanced liver cancer when first diagnosed. As we all know, several potential risk factors contribute to the occurrence and development of liver cancer, including chronic hepatitis B/C virus infection, alcoholism and aflatoxin exposure (4). Under the exposure of these risk factors, genetics and epigenetic changes may be gradually accumulated, thereby leading to activation of oncogenes and inactivation of tumor suppressor genes, which in turn will lead to the occurrence of liver cancer (5,6). Furthermore, cancers have association with an increased risk of coronary heart disease in time of the first 6 months following diagnoses (7). Despite the considerable progress over the past few decades, the prognosis of patients with liver cancer is still poor (5-year survival rate<20%) due to the high recurrence rate (8). Although extensive research has been conducted on the mechanisms of liver cancer occurrence and development, its etiology and carcinogenesis remain unclear. Considering the high morbidity and mortality of liver cancer, identification of effective markers and exploration of their potential roles have important clinical significance for early diagnosis, prevention, and control of liver cancer.
Growing multi-omics studies have confirmed that genomic and epigenomic imbalances can affect the occurrence and development of liver cancer. TCGA project provides genomic, epigenomic, transcriptomics, and proteomics of 32 human cancers. A number of data portals such as UCSC Cancer Genomics Browser (https://genome-cancer.ucsc.edu/) have been developed (9). As a key regulator of genomic and epigenomic abnormalities, CNV is significantly correlated with individual genetic variations and human genetic diversities, which may change gene expression via modulating mRNA expression and affecting transcription. Several CNVs have been found to be closely related to liver cancer. For example, Jagged1 copy number amplification indicates poor prognosis in patients with liver cancer (10). In-depth research on CNV may help understand the mechanisms and probe susceptible targets for liver cancer. Studies have shown that epigenetic changes such as DNA methylation, contribute to the development of liver cancer (11,12). DNA methylation has been considered as a useful biomarker for early diagnosis of liver cancer. During carcinogenesis, abnormal DNA methylation is mainly manifested by focal methylation around the promoters of specific genes, and global methylation in non-promoter regions (13,14). Hypermethylation of the promoter region is a crucial process that can lead to epigenetic silencing of tumor suppressor genes (15,16). Moreover, abnormal DNA methylation of non-promoter elements is in association with intratumor heterogeneity (17).
Herein CNV, DNA methylation, as well as mRNA levels were detected in a variety of liver cancer samples. Copy number variation-correlated (CNVcor) as well as methylationcorrelated (METcor) genes were identified to distinguish molecular subtypes of liver cancer. Furthermore, specific biomarkers were proposed to drive the classification of these subtypes.

Data Preprocessing
By applying GISTIC2.0, this study calculated the genetic copy number changes for each sample (18). The methylation data preprocessing was as follows. Methylation sites that were undetectable in over 70% of specimens were removed. The KNN was then utilized for filling in missing values. Furthermore, we removed the following methylation data: (1) the methylation data of the SNP sites, (2) the methylation site data on the sex chromosome, and (3) the multi-aligned methylation site data. Methylated sites in the 200 bp range upstream or downstream from gene transcription start were retained in this study. For mRNA expression profiles, this study filtered out mRNAs with FPKM value < 0.1 across 50% specimens.

Correlation Analysis
The correlation coefficient of CNV data or methylation data with gene expression was calculated, which was then transformed to z-value based on ln [(1 + r)/(1r)]. Under the screening criterion of p < 0.01, CNVcor and METcor genes were obtained with the test of correlation coefficient.

Integrative Analysis of CNV, Methylation and mRNA Expression Data
Multi-omics clustering analysis was conducted by integrating CNV, methylation as well as mRNA expression profiles using the non-negative matrix factorization (NMF) package in R (19). Lambda values were used to determine optimal weights for CNV, methylation, and mRNA expression data sets.

Gene Set Variation Analysis (GSVA)
The GSVA algorithm was applied for evaluating the enriched signaling pathways between subtypes based on gene expression profiles (22). The pathway enrichment score of each sample was determined and the differences between subtypes were analyzed by employing the limma package in R (23).

Statistical Analysis
All analyses were carried out using R packages and Graphpad Prism software. ANXA2 and CHAF1B expression was validated in liver cancer and normal tissues using the gene expression data from the International Cancer Genome Consortium (ICGC; http://icgc.org/). Each experiment was repeated three times. Data Frontiers in Medicine | www.frontiersin.org were presented as the mean ± standard deviation. Student's t test was applied for comparisons between two groups. P < 0.05 indicated statistical significance.

Screening CNVCor/METCor Genes in Liver Cancer
Totally, 9161 CNVCor genes were identified (p < 0.01; Supplementary Table 1). As depicted in the distribution of CNVCor genes on chromosomes, CNVCor genes most frequently occurred on chr1 (FDR<0.05; Figure 1A and Table 2). The box plots showed the distribution in pearson correlation coefficients of CNVCor genes on chromosomes ( Figure 1B). 16037 methylation sites corresponding 6181 genes were identified under the screening criteria of p < 0.01 (Supplementary Table 2). As shown in Figure 1C and Table 2, METcor genes were prone to appear on chr1. In the correlation z-value distribution, the correlation between CNVcor gene and its expression leaned to the right, while the correlation between METcor gene and its expression leaned to the left, indicating positive associations between CNVs and gene expressions, while negative associations between methylations and gene expressions ( Figure 1D). METcor genes mainly contained protein-coding genes ( Figure 1E). Furthermore, methylation loci were mainly situated in the island, S shore, N shore, S shelf as well as N shelf regions (Figure 1F). According to the median expression value of CNVCor/METCor genes, the samples were divided into high-and low-groups. Kaplan-Meier survival analysis was then performed. CNVCor genes and METCor genes with p < 0.01 were identified as survival-related CNVCor (n = 745)/METCor genes (n = 344). Two-hundred and fifty-three overlapping CNVcor genes and METcor genes were in significant association with survival of liver cancer (Figure 1G), which were used for further analysis.

Correlations Between CNVs and Methylations in Liver Cancer
We further analyzed the correlations between CNVs and methylations in liver cancer. CNVs were divided into three The results showed that CNV gain was positively correlated to CNV loss (R = 0.14, p = 0.0098; Figure 2A). Furthermore, a strong negative correlation between MetHypo and MetHyper was found in Figure 2B (R = −0.49; p < 2.2e-16). Intriguingly, we found that CNV loss was positively correlated with MetHypo (R = 0.16, p = 0.0029; Figure 2C). However, there were no distinct correlations between CNV loss and MetHyper (Figure 2D), between CNV gain and MetHyper (Figure 2E), between CNV gain and MetHyper ( Figure 2F).

Identification of CNVcor and METcor Gene Molecular Subtypes
NMF method was used for clustering analysis according to CNVcor and METcor genes. The optimal number of clustering was 2 for CNVcor genes (Figures 3A,B) and METcor genes ( Figures 3C,D). Both the CNVcor genes (p = 0.00011) and METcor genes (p < 0.0001) in the two molecular subtypes were in significant association with overall survival of patients with liver cancer (Figures 3E,F). We further compared the differences between CNVcor and METcor gene molecular subtypes. There were high proportions of overlapping samples between CNVcor and METcor gene molecular subtypes ( Figure 3G).

Construction of Two Multi-Omics Molecular Subtypes for Liver Cancer After Integration of CNV, DNA Methylation and mRNA Expression
Based on the CNV data related to the CNVcor genes, the methylation site data related to the METcor genes, and the transcriptome data of the CNVcor and METcor genes, multi-omics clustering analysis was performed using iCluster. The iCluster clustering results showed that the optimal clustering results were 2 groups. iCluster clustering heat maps depicted the distributions of CNVs of CNVcor genes ( Figure 4A) and of methylation sites of METcor genes ( Figure 4B) in two iClusters, respectively. There was significantly difference in overall survival between iC1 and iC2 (p < 0.0001; Figure 4C). There were high proportions of overlapping samples between NMF CNVcor and iCluster CNVcor gene clustering subsets (Figure 4D), between NMF METcor and iCluster METcor gene clustering subsets (Figure 4E), between iCluster CNVcor and iCluster METcor gene subsets (Figure 4F).

Molecular Features of Gene Subtypes in Liver Cancer
We analyzed differences in CNVs (adjusted p < 0.01), methylation (adjusted p < 0.01) and mRNA expression (|FC|>1.5 and FDR<0.05) between iC1 and iC2 subtypes. Venn diagram showed two genes (including ANXA2 and CHAF1B) differed in CNVs, methylation and mRNA expression between iC1 and iC2 subtypes (Figure 6A). A high proportion of ANXA2 gain in iC2 subtype and its loss in iC1 subtype was found in Figure 6B. Hypomethylated ANXA2 more frequently occurred in iC1 and iC2 subtypes ( Figure 6C). Box plots depicted that ANXA2 was significantly up-regulated in iC1 subtype than iC2 subtype (p < 2.22e-16; Figure 6D). High ANXA2 expression significantly indicated a poorer prognosis of liver cancer (p = 0.019; Figure 6E). There was a higher proportion of CHAF1B gain and a lower proportion of its loss in iC2 compared  to iC1 subtype ( Figure 6F). CHAF1B hypermethylation more frequently occurred in iC2 subtype ( Figure 6G). Higher CHAF1B expression was found in iC2 compared to iC1 subtype (p < 2.22e-16; Figure 6H). Its high expression was significantly associated with shorter survival time of patients with liver cancer (p = 0.003; Figure 6I).

Differences in SNVs and Pathways Between Two Multi-Omics Molecular Subtypes for Liver Cancer
Fisher-exact tests were applied for comparing the differences in SNV locus mutation between two subtypes. Seventeen significant mutated sites with adjusted p < 0.01 were screened, as shown  in Figure 7A. We found that iC1 subtype had higher frequency mutations than iC2 subtype. We further assessed the correlation between each SNV locus and expression of ANXA2 and CHAF1B. Tables 3, 4 show the top ten SNV loci for ANXA2 and CHAF1B, respectively. Our findings indicated that these SNV loci might affect expression of ANXA2 and CHAF1B. To explore the differences in biological functions between iC1 and iC2 subtypes, GSVA method was applied. As a result, there were distinct differences in metabolism pathways between subtypes such as taurine and hypotaurine metabolism, sphingolipid metabolism, inositol phosphate metabolism, amido sugar and nucleotide sugar metabolism (Figure 7B).

Validation of ANXA2 and CHAF1B in Liver Cancer Tissues
In the ICGC database, our data confirmed that ANXA2 and CHAF1B were both up-regulated in liver cancer in comparison to normal tissues (Figures 8A,B). Twenty paired liver cancer and normal tissue specimens were harvested in this study. Using RT-qPCR, we validated the mRNA expression of ANXA2 and CHAF1B in liver cancer. The results showed that ANXA2 ( Figure 8C) and CHAF1B ( Figure 8D) were highly expressed in liver cancer compared to normal specimens, which were consistent with bioinformatics analysis results. Consistently, higher ANXA2 ( Figure 8E) and CHAF1B (Figures 8F,G) expressions were found in liver cancer specimens by western blot.

DISCUSSION
Liver cancer is an aggressive malignant tumor and one of the leading causes of tumor-related deaths (24,25). Unfortunately, traditional TNM staging can only stratify patients on the basis of clinical manifestations. Despite advances in treatment strategies, effective molecular targets have not been successfully validated. Hence, there is an urgent need to understand the molecular mechanisms and explore therapeutic targets of liver cancer to improve patients' prognosis. With the advances in sequencing technology, it is accessible to obtain large amounts of high-throughput genome sequencing data. Comprehensive analyses about multi-omics data may help conduct accurate management against liver cancer (26)(27)(28). Thus, in this study, we integrated multi-omics data from 363 patients with liver cancer to establish two molecular subtypes (iC1 and iC2). Compared with the iC2 subtype, the iC1 subtypes had a worse prognosis. These data emphasize the clinical significance concerning multi-omics analyses of CNVs and methylations in liver cancer. We further characterized the immune cell populations of these two liver cancer subtypes. The scores of the six immune cells of the iC1 subtype were significantly higher than those of the iC2 subtype. In addition, mutation profiles showed that the mutation level of iC1 subtype was markedly higher than that of iC2 subtype, which might lead to poor prognosis of iC1 subtype. Some recent studies have shown that genomics, epigenomics, and transcriptomics play a vital role in tumorigenesis and can predict patients' prognosis (29,30). Thus, multi-omics studies can help determine tumor heterogeneity, candidate therapeutic targets, and new mechanisms for cancers (22). By integration of gene expression, CNV gain/loss and hypomethylation/hypermethylation, we identified two prognostic molecular targets, ANXA2 and CHAF1B. Due to the establishment and collection of three data sets and corresponding clinical follow-up information by different organizations, only two overlapping genes in the three data sets may be induced due to internal heterogeneity as well as diversity. These two mRNAs were validated in three independent data sets, suggesting that these genes have universal prognostic significance. Both genes were highly expressed in the iC1 subtype compared to the iC2 subtype. More importantly, their high expression indicated a poorer prognosis. Correlation analysis showed that the mutation site of SNV was significantly correlated with ANXA2 and CHAF1B gene expression. Therefore, assessing the gene expression may be helpful in the diagnoses of early liver cancer. Consistent with previous studies, ANXA2 has been found to be highly expressed in hepatocellular carcinoma (HCC) tissues compared to adjacent normal tissues, furthermore, its high expression is in association with an aggressive phenotype in HCC (31). Highly expressed ANXA2 could induce HCC chemotaxis and metastasis (32), while its knockdown could suppress invasion and migration of liver cancer cells (33). ANXA2 has good diagnostic potential for patients with HBV-related HCC (34). ANXA2 is also involved in the pathogenesis of cardiovascular diseases. For example, both rs11633032 and rs17191344 SNPs can reduce ANXA2 gene expression. Its down-regulation is related to an increased risk of coronary heart disease (35). Also, ANXA2 modulates pulmonary arterial smooth muscle cell proliferation for hepatopulmonary syndrome (36). For CHAF1B, it has been reported that it can promote liver cancer cell migration (37). Thus, in-depth mechanism of these two mRNAs in liver cancer will be probed in further research.
However, several limitations of our study should be pointed out. First, our conclusions were based on retrospective cohorts, and prospective research will be performed to verify these findings. Second, this integrated multi-omics analysis was only based on genomics, epigenomics, and transcriptomics not including proteomics and metabolomics because there were no proteomics and metabolomics data in TCGA database. Third, although our RT-qPCR and western blot results confirmed that ANXA2 and CHAF1B were highly expressed in liver cancer tissues compared to normal specimens, biological functions and mechanisms of ANXA2 and CHAF1B in liver cancer should be further validated.

CONCLUSION
In conclusion, we investigated the possible pathogenesis of liver cancer through multi-omics analysis based on genomics, epigenomics, and transcriptomics. Our results suggested that DNA CNV and methylation may play important roles in liver cancer. Furthermore, we identified two clinically relevant molecular subtypes as well as two key biomarkers for liver cancer. These novel mechanisms and clinical classifications may help develop accurate diagnosis and treatments for patients with liver cancer.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethics Committee of The Third Affiliated Hospital of Chongqing Medical University (2019063). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
DW conceived and designed the study. YS, WX, and QG conducted most of the experiments and data analysis, and wrote the manuscript. QZ, JY, and CL participated in collecting data and helped to draft the manuscript. All authors reviewed and approved the manuscript.