Integrated Bioinformatics Analysis Reveals Key Candidate Genes and Pathways Associated With Clinical Outcome in Hepatocellular Carcinoma

Hepatocellular carcinoma (HCC) accounts for approximately 85–90% of all liver cancer cases and has poor relapse-free survival. There are many gene expression studies that have been performed to elucidate the genetic landscape and driver pathways leading to HCC. However, existing studies have been limited by the sample size and thus the pathogenesis of HCC is still unclear. In this study, we performed an integrated characterization using four independent datasets including 320 HCC samples and 270 normal liver tissues to identify the candidate genes and pathways in the progression of HCC. A total of 89 consistent differentially expression genes (DEGs) were identified. Gene-set enrichment analysis revealed that these genes were significantly enriched for cellular response to zinc ion in biological process group, collagen trimer in the cellular component group, extracellular matrix (ECM) structural constituent conferring tensile strength in the molecular function group, protein digestion and absorption, mineral absorption and ECM-receptor interaction. Network system biology based on the protein–protein interaction (PPI) network was also performed to identify the most connected and important genes based on our DEGs. The top five hub genes including osteopontin (SPP1), Collagen alpha-2(I) chain (COL1A2), Insulin-like growth factor I (IGF1), lipoprotein A (LPA), and Galectin-3 (LGALS3) were identified. Western blot and immunohistochemistry analysis were employed to verify the differential protein expression of hub genes in HCC patients. More importantly, we identified that these five hub genes were significantly associated with poor disease-free survival and overall survival. In summary, we have identified a potential clinical significance of these genes as prognostic biomarkers for HCC patients who would benefit from experimental approaches to obtain optimal outcome.


INTRODUCTION
Liver cancer is the fourth leading cause of cancer-related death worldwide and ranks sixth in terms of incidence (Bray et al., 2018;Villanueva, 2019). Among all types of primary malignant liver tumors, hepatocellular carcinoma (HCC) accounts for approximately 85-90% of all cases. The major risk factors including chronic infections by hepatitis B virus (HBV) and hepatitis C virus (HCV), aflatoxin exposure, smoking, type 2 diabetes, obesity, and so on (Marengo et al., 2016;Bray et al., 2018;Phukan et al., 2018). As a highly heterogeneous cancer disease, localized HCC patients often have poor prognosis with 5-year overall survival (OS) rate of 30%, and this rate drops below 5% for those with distant metastases (Oweira et al., 2017). For patients at early disease stages, liver resection is the most effective treatment option, however, only less than 30% of HCC patients are eligible surgery, and among those around 70% eventually relapse within 5 years after treatment (Waghray et al., 2015). Over the past few decades, despite advances in chemotherapy, targeted therapy, radiation therapy, and immunotherapy in the clinical arena, the survival of HCC patients has not significantly increased, and translational studies to understand the mechanisms and prognosis remain underwhelming to design novel therapeutic strategies (Visvader, 2011;Aravalli et al., 2013;Llovet et al., 2018).
Data, information, knowledge and wisdom (DIKW) model has been widely used in life in all aspects including medicine (Song et al., 2018(Song et al., , 2020Duan, 2019a,b;Duan et al., 2019a,b). In recent years, genome-wide profiling has substantially advanced our understanding of the genetic landscape and driver pathways leading to HCC (Totoki et al., 2014;Schulze et al., 2015;Zucman-Rossi et al., 2015;Ally et al., 2017;Villanueva, 2019), revealing Cellular tumor antigen p53 (TP53), Catenin beta-1 (CTNNB1), Axin-1 (AXIN1), Telomerase reverse transcriptase (TERT) promoter and other key genes as driver mutations, and WNT/β-catenin, p53 cell cycle pathway, oxidative stress, PI3K/AKT/MTOR, and RAS/RAF/MAPK pathways as key signaling pathways involved in liver carcinogenesis. However, existing studies have been of limited sample size that failed to create molecular prognostic indices and also the inconsistent computational methods may have restricted the power to identify potential meaningful molecular biomarkers and new therapeutic targets. Therefore, an integrated bioinformatics study combining the most updated genomic data thus providing novel insight into the mechanisms underlying therapeutic resistance and disease progression is highly warranted.
Microarray technology has become an indispensable tool to monitor genome wide expression levels of genes in a given organism and has been successfully used to classify different types of cancer and predict clinical outcomes (Trevino et al., 2007). These microarray technologies have also been applied in many studies to define global gene expression patterns in primary human HCC in an attempt to gain insight into the mechanisms of hepatocarcinogenesis (Crawley and Furge, 2002;Woo et al., 2008;Hoshida et al., 2009;Villanueva et al., 2012;Jin et al., 2015). In the present study, we selected four independent datasets consisting a total of 320 HCC cases and 270 cases of normal liver tissues in the Gene Expression Omnibus (GEO) database to identify reliable markers and pathway alterations linked with the pathogenesis of HCC cases (Wurmbach et al., 2007;Mas et al., 2009;Roessler et al., 2010). We identified 89 differential expression genes (DEGs) including 31 up-regulated genes and 58 down-regulated genes. Gene ontology (GO) analysis revealed cellular response to zinc ion in biological process (BP) group, collagen trimer in the cellular component (CC) group, and extracellular matrix (ECM) structural constituent conferring tensile strength in the molecular function (MF) group. Further pathway enrichment analysis revealed that enrichment in protein digestion and absorption, mineral absorption, propanoate metabolism, and ECM-receptor interaction. Finally, the top five hub genes osteopontin (SPP1), Collagen alpha-2(I) chain (COL1A2), Insulin-like growth factor I (IGF1), lipoprotein A (LPA), and Galectin-3 (LGALS3) were identified from the protein-protein interaction (PPI) network and those highly altered genes were validated by western blot assay and Immunohistochemistry (IHC) analysis and found to be associated with clinical outcome of HCC patients.

Data Source and Identification of DEGs
Microarrays data were obtained from the Oncomine 4.5 database 1 contains 715 datasets and 86,733 samples. Of which, we filtered four datasets comprising Mas liver (GSE14323, containing 19 liver tissues and 38 HCCs), Roessler liver (GSE14520 based on GPL571 platform, containing 21 liver tissues and 22 HCCs), Roessler liver 2 (GSE14520 based on GPL3921 platform, containing 220 liver tissues and 225 HCCs), and Wurmbach liver (GSE6764, containing 10 liver tissues and 35 HCCs) after using the following criteria: (a) Analysis type: cancer vs. normal analysis; (b) Cancer type: hepatocellular carcinoma; (c) Data type: mRNA; (d) Sample type: clinical specimen; (e) Microarray platform: Human Genome U133A, U133A 2.0, or U133 Plus 2.0. A total of 270 cases of normal liver tissues and 320 cases of HCCs were included in the integrated analysis. To analyze the DEGs between HCC and normal liver tissues, the data were then processed on GEO2R website 2 . The differentially expressed genes were identified using limma R package at a cutoff | logFC| > 1 and adjusted p value < 0.05 (Benjamini & Hochberg).

GO and Pathways Enrichment Analysis
The annotation function of GO analysis is comprised of three categories: BP, CC, and MF. Kyoto Encyclopedia of Genes and Genomes (KEGG) is a database resource for understanding high-level functions and utilities of the genes or proteins (Kanehisa and Goto, 2000;Kanehisa et al., 2012Kanehisa et al., , 2016. GO analysis and KEGG pathway enrichment analysis of candidate DEGs were performed using the R package "clusterProfiler."   Reactome 3 was also used for pathway enrichment analysis (Fabregat et al., 2018). Adjusted p-value less than 0.05 was considered as the cut-off criterion for both GO analysis and pathway enrichment analysis.

PPI Network and Modular Analysis
Protein-protein interaction network was constructed to determine the importance of these DEGs by comparing the interactions between different DEGs. STRING database 4 and Cytoscape software (3.7.2 version) were applied to construct and visualize the PPI networks (Szklarczyk et al., 2017), followed by Molecular Complex Detection (MCODE) plug-in in Cytoscape for selecting significant modules of hub genes from the PPI network (Bader and Hogue, 2003), with the following criteria: degree cutoff (number of connections with other nodes) ≥ 2, node score cutoff (the most influential parameter for cluster size) ≥ 2, K-core (This parameter filters out clusters that do not contain a maximally interconnected sub-cluster of at least k degrees. For example, a triangle including three nodes and three edges is a two-core representing two connections per node. Two nodes with two edges between them meet the two-core rule as well) ≥ 2 and max depth (this parameter limits the distance from the seed node within which MCODE can search for cluster members) = 100. KEGG pathway enrichment analysis of the

Hub Gene Selection and Prognostic Analysis
Hub genes were selected based on comparison of top 10 genes ranked by degree and betweenness centrality Network of hub genes. Their co-expressed genes were then analyzed using cBioPortal online platform 6 (Gao et al., 2013). Genetic alterations of these hub genes were explored and compared using the cBioPortal database. Biological process analysis of hub genes was then performed and visualized using plug-in Biological Networks Gene Oncology tool (BiNGO) app in Cytoscape software (Maere et al., 2005). Stage-related information analysis based on gene expression was performed in UALCAN 7 , a comprehensive web resource for analyzing omics data (Chandrashekar et al., 2017). Disease-free survival (DFS) is a concept used to describe the period after a successful treatment of cancer. OS means the length of time from either the date of diagnosis or the start of treatment for HCC. DFS and OS are both measured to see how well a new treatment works. DFS and OS analysis associated with these hub genes were performed using the Kaplan-Meier Plotter online database 8 .   inhibitors (catalog number 11836153001, Roche) and phosphatase inhibitor cocktail (catalog number 78420, Roche)] and centrifuged at 4 • C for 10 min. The protein concentrations of collected supernatants were determined by the BCA protein assay kit (catalog number P0011, Beyotime). Equal amounts of total proteins were separated in 10 or 12% SDS-PAGE and transblotted onto the 0.45 µm PVDF membranes (catalog number 1620177, BIO-RAD). The membranes were blocked in 5% fat-free milk in TBST (150 mM NaCl, 50 mM Tris, pH 7.2) for 1 h at room temperature and subsequently incubated with corresponding primary antibodies as following: anti-SPP1 (catalog number ab8448, Abcam, 1:1000), anti-COL1A2 (catalog number 66761-1-lg, Proteintech, 1:1000), anti-IGF1 (catalog number ab9572, Abcam, 1:1000), anti-LGALS3 (catalog number ab209344, Abcam, 1:1000), anti-GADPH (catalog number 10494-1-AP, Proteintech, 1:10000), and anti-β-Tubulin (catalog number T0023, Affinity, 1:20000) at 4 • C overnight, followed by incubation with a donkey anti-mouse (catalog number C61116-02, LI-COR) or goat anti-rabbit (catalog number C80118-05, LI-COR) secondary antibody for 1 h at room temperature. Then membranes were scanned using the Odyssey infrared imaging system (LI-COR) and the images were captured. The gray levels of the bands were determined by Image J software. The expression of proteins was normalized using the GADPH or β-Tubulin values. The assay was performed three independent times.

Patient Samples
This study was approved by Cancer Hospital of the University of Chinese Academy of Sciences; Zhejiang Cancer Hospital. Twenty formalin-fixed, paraffin-embedded (FFPE) HCC tissues and corresponding adjacent non-cancerous tissues were collected from the Department of Abdominal Surgery, Zhejiang Cancer Hospital. All FFPE HCC tissues were screened by two pathologists independently to confirm the diagnosis of HCC. The most representative tumor and non-cancerous tissues were selected for immunohistochemistry analysis.

IHC Analysis
Neutral 10% buffered formalin-fixed tissue specimens were embedded in paraffin wax and then sliced to 4-micron thick sections by a microtome. In brief, the tissue slices were firstly deparaffinized, followed by rehydration and a 10-min boiling in 10 mmol/L citrate buffer (pH = 6.4) for antigen retrieval. Then, the sections were treated in methanol containing 3% H 2 O 2 for 20 min to inhibit the endogenous tissue peroxidase activity. After being blocked with 1% bovine serum albumin (BSA) at 37 • C for 30 min, IHC staining was carried out for the protein expression of SPP1, COL1A2, IGF1, and LGALS3 using specific primary antibodies at 4 • C overnight, followed by staining with speciesspecific secondary antibodies labeled with horseradish peroxidase (HRP). The slides were developed in diaminobenzidine (DAB)  Genes were ranked by betweenness centrality and degree, respectively. The genes in red are the shared genes with the two analysis methods.
and counter-stained with hematoxylin. Then images of the sections were photographed using an Olympus microscope (Olympus Life Science). The clinical specimen data of LPA were obtained from The Human Protein Atlas database 9 .

Data Source and Analysis
A total of 603, 1,238, 1,095, and 1,722 DEGs have been extracted from the four independent expression datasets Mas liver, Roessler liver, Roessler liver 2, and Wurmbach liver, Darker red indicates increased frequency of alteration in HCC. The blue connection indicates that the first protein controlled a reaction that changes the state of the second protein; the red connection suggests that the proteins belongs to members of the same complex. The green arrows represent "Controls expression of." The gray arrows represent "In complex with." (B) Genetic alteration analysis toward these five genes, overall survivals are also shown.
respectively (Figures 1A-D, Table 1, and Supplementary Tables S1, S2). The comparison of all genes from these datasets identified 89 consistently and significantly dysregulated genes, including 31 up-regulated genes and 58 down-regulated genes in HCC compared to normal liver tissues (Figures 1E,F and Supplementary Table S3). Notably, several genes such as Glypican-3 (GPC3) (Wurmbach et al., 2007) and SPP1 (Roessler et al., 2010) have been reported previously, proving the feasibility of the method.

GO Analysis of DEGs in HCC
The functional characteristics of these 89 DEGs were explored using GO analysis and were grouped into BP, cell component and MF (Figure 2A). Overall, cellular response to zinc ion covering LGAlS3 are overexpressed in HCC tissues while IGF1 and LPA are downregulated in HCC tissues compared to the control.
five genes was found to be the dominant BP. Collagen trimer covering seven genes was found to be the top CC. ECM structural constituent conferring tensile strength covering five genes was the top MF. As shown in Table 2, in the BP group, up-regulated genes were mainly enriched in ECM organization, epithelial tube morphogenesis, and positive regulation of leukocyte migration while down-regulated genes were mainly enriched in cellular response to zinc ion and humoral immune response. In the CC group, up-regulated genes were mainly enriched in collagencontaining ECM and ECM components while down-regulated genes mainly enriched in blood microparticle. In the MF group, up-regulated genes were mainly enriched in ECM structural constituents while down-regulated genes were mainly enriched in pattern binding. Taken together, these data suggest that those identified DEGs are mainly enriched in ECM-related items affecting the BP of negative regulation of growth, humoral immune response and so on.

Signaling Pathway Enrichment Analysis
To understand the biological changes during HCC pathogenesis, we performed pathway enrichment analysis using KEGG and Reactome. KEGG pathways enrichment analysis showed that those candidate DEGs were primarily enriched in protein digestion and absorption, mineral absorption, and ECM-receptor interaction ( Figure 2B). Among them, up-regulated genes were mainly enriched in protein digestion and absorption and ECM-receptor interaction while down-regulated genes were mainly enriched in mineral absorption and metabolic pathways (Table 3). Furthermore, Reactome pathway enrichment analysis showed that the DEGs were enriched in collagen chain trimerization, collagen degradation, metallothioneins bind metals, and response to metal ions (Supplementary Table S4). Among them, up-regulated genes were primarily enriched in collagen chain trimerization, assembly of collagen fibrils and other multimeric structures and collagen degradation, while down-regulated genes were enriched in metallothioneins bind metals, response to metal ions and ficolins bind to repetitive carbohydrate structures on the target cell surface ( Table 4).

Key Candidate Genes and Pathways Identified by DEGs PPI and Modular Analysis
In order to identify key candidate genes, 63 DEGs (24 upregulated genes and 39 down-regulated genes) were filtered into the PPI network complex, including 63 nodes and 78 edges. Among the 63 nodes, only genes ranking in top 10 of both degrees (the number of interactions of each node) and betweenness centrality (degree of impact on interactions between other nodes in the network) parameters were recognized as hub genes. Finally, five genes SPP1, COL1A2, IGF1, LGALS3, and LPA were selected ( Figure 3A and Table 5). Utilizing MCODE plug-in app in cytoscape, two modules were applied for further KEGG pathway enrichment analysis. Module 1 consisted of 5 nodes and 10 edges with genes enriched in protein digestion and absorption, ECM-receptor interaction, amoebiasis and focal adhesion. Module 2 consisted of five nodes and nine edges with the genes mainly enriched in mineral absorption (Figures 3B,C and Supplementary Table S5).

Hub Genes and Associations With Clinical Outcome
The network of hub genes constructed by cBioPortal contained 55 nodes, including five query genes (five hub genes) and the 50 most frequently altered neighbor genes ( Figure 4A). After visualizing BP using BiNGO in Cytoscape software (Supplementary Figure S1), genetic alteration analysis of five hub genes in TCGA HCC patients was performed in the   cBioPortal database. The hub genes SPP1, IGF1, LGALS3, LPA, and COL1A2 were altered in 4, 5, 5, 7, and 8% in a total population of HCC patients respectively, without significantly discrepancy in both sexes ( Figure 4B). TCGA data analysis showed that SPP1, COL1A2, and LGALS3 are more highly expressed in HCC regardless of stages compared with normal tissues, while IGF1 and LPA were low expressed (Figure 5). Further western blot analysis showed that the protein expression levels of SPP1, COL1A2, and LGALS3 were highly expressed in HCC cell lines while IGF1 was down-regulated in HCC cell lines ( Figure 6A). IHC analysis of HCC patient tissues showed similar results as western blot analysis ( Figure 6B). The online human protein atlas showed the LPA protein expression was higher in normal liver tissues than in HCC tissues. For the identified five top hub genes, HCC patients with high expressions of COL1A2, IGF1, and LPA as well as low expression of SPP1 were found to be associated with the improved DFS (p = 0.0017, p = 0.0021, p = 0.0058, and p = 7e −04 , respectively) (Figure 7). High expression of SPP1 and LGALS3 were linked with the disfavored OS (p = 3.5e −06 and p = 0.014, respectively) ( Figures 8A,D), while high expression of IGF1 and LPA were associated with improved OS (p = 0.0013 and p = 0.00038, respectively) (Figures 8C,E). However, expression of COL1A2 didn't show a significant correlation with clinical outcome (p = 0.21) ( Figure 8B).

DISCUSSION
Hepatocellular carcinoma remains an aggressive form of cancer worldwide with high incidence and morbidity. Therefore, substantial efforts have been made to unveil mutational processes, pathogenesis and possible mechanisms underlying treatment resistance in order to expand the therapeutic landscape of this disease (Llovet et al., 2018;Cheng et al., 2019). However, most of these studies were based on single institutions with limited sample size, restricting the power to identify potential meaningful therapeutic targets Li et al., 2019;Zhang et al., 2019). Different HCC studies showed different results for different datasets chosen. In previous studies, some only chose one dataset and others chose datasets without performing explicit infiltration, leading to totally different outcome Li et al., 2019;Zhang et al., 2019). Here, we conducted an integrative analysis from four microarray datasets of HCC screened in Oncomine database and downloaded in GEO database to describe key candidate genes and pathways associated with clinical outcome in HCC patients. In the present study, a total of 89 DEGs were identified between HCC and normal tissues, including 31 up-regulated genes and 58 down-regulated genes. Up-regulated DEGs of HCC were found to be enriched in GO categories such as epithelial tube morphogenesis, ECM organization, and positive regulation of leukocyte migration, and dysregulation of these processes have been found to contribute to several pathological conditions including cancer and may lead to disfavored clinical outcomes (Payne and Huang, 2013;Bonnans et al., 2014). While down-regulated genes were associated with GO categories such as cellular response to zinc ion where members of metallothionein family (MT1M, MT1H, MT1X, MT1G, and MT1F) play important roles in carcinogenesis of various cancer types (Si and Lang, 2018). KEGG pathway enrichment analysis demonstrated that up-regulated genes were significantly enriched in protein digestion and absorption, ECM-receptor interaction and PI3K/Akt signaling pathway while down-regulated genes were enriched in mineral absorption and metabolic pathways, and all those are significant pathways in various cancer types been reported previously (Boroughs and DeBerardinis, 2015;Dimitrova and Arcaro, 2015;Wang S.S. et al., 2017;Slattery et al., 2018). Intriguingly, a host of altered genes were found to be associated with ECM related pathways. The ECM, an extensive part of the microenvironment in all tissues, providing a physical scaffold for its surrounding cells, bind growth factors and regulate cell behavior, plays a vital part in tumor progression (Kalluri, 2016;Nissen et al., 2019).
We also constructed PPI network and identified five hub genes SPP1, COL1A2, IGF1, LPA, and LGALS3 as key candidate genes potentially linked with pathogenesis of HCC. Their coexpressed genes were then analyzed using cBioPortal online platform. The results contained five query genes and the 50 most frequently altered neighbor genes. Among the five hub genes, SPP1, COL1A2, IGF1, and LGALS3 and their co-expressed genes constructed a network. LPA and its co-expressed genes were isolated from the main network and didn't have directly interaction with it. That's why only four out of five hub genes contained in Figure 4A. Both SPP1 and COL1A2 are members belonging to PI3K/Akt signaling pathway and ECM-receptor interaction pathway regulating cell growth (Fang et al., 2017) and drug resistance (Zhang et al., 2016). SPP1, also known as osteopontin, has been reported to have the capability of regulating cell behaviors (Rowe et al., 2014). Previous data also showed that targeting SPP1 could inhibit gastric cancer cell epithelial-mesenchymal transition through inhibition of the PI3K/AKT signaling pathway (Song et al., 2019). In lung adenocarcinoma, SPP1 was found to up-regulate PD-L1 and subsequently facilitated the escape of immunity . These studies demonstrated that SPP1 was highly associated with the cancer invasion and progression, suggesting its potential to serve as a biomarker and target for the diagnosis and treatment of HCC. COL1A2, a member of group I collagen family, has once been reported as a target of Let-7g thus inhibiting cell migration in HCC (Ji et al., 2010) and gastric cancer cell proliferation (Ao et al., 2018). In 2018, a study found that the silencing of COL1A2 could inhibit the proliferation, migration, and invasion of gastric cancer through regulating PI3K/AKT signaling pathway, revealing the potency of COL1A2 in HCC (Ao et al., 2018). IGF1, insulin-like growth factor 1, has the capability of maintaining the stemness in HCC, and its role of serving as an anticancer target has been confirmed by several studies (Kaseb et al., 2011(Kaseb et al., , 2014Chen and Sharon, 2013;Bu et al., 2014). IGF1 and IGF2 comprise of the IGF family, contributing largely to the activation of the PI3K/Akt signaling pathway, which was also found dysregulated by the KEGG analysis, thus enhancing the cancerogenesis of HCC (Kasprzak et al., 2017). LPA is lipoprotein A, a special kind of low-density lipoprotein, has shown the evidence of causing inflammation and regulating HCC cell proliferation (Pirro et al., 2017;Xu et al., 2017). Patients with HCC showed a statistically significant serum LPA level higher than the healthy subjects, indicating its important role in HCC patients (Malaguarnera et al., 2017). LGALS3, which encodes the Galectin-3 protein, is regarded as a guardian of the tumor microenvironment (Ruvolo, 2016). Recent studies have shown that LGALS3 is tightly associated with several malignancies such as Hodgkin's lymphoma (Koh et al., 2014), acute myeloid leukemia (Cheng et al., 2013), and HCC (Song et al., 2014). More importantly, LGALS3 could increase the metastatic potential of breast cancer, might accounting for the metastatic potential of HCC (Pereira et al., 2019). Through validation in western blot and IHC assays, we found that the protein expression of these five hub genes was in accordance with their mRNA expression in HCC patient tissues. Strikingly, LPA has not been tested by western blot and IHC assays due to its large molecular weight of 501 kDa, exerting huge difficulty in performing these assays. Previous studies have shown that these genes are implicated in the tumorigenesis and transformation (Oates et al., 1997;Orsó and Schmitz, 2017;Wang Y.A. et al., 2017;Diao et al., 2018;Ma et al., 2019). In our study, correlations of SPP1, COL1A2, IGF1, LPA, and LGALS3 with patient prognosis highlight the importance of these five genes as potential biomarkers to stratify HCC patients as well as potential therapeutic targets, but concrete roles of these genes need further investigation. In the future studies, we will develop knockdown and overexpression HCC cell lines and mouse models of these five hub genes to demonstrate their importance in the progression of HCC in vitro and in vivo.
Taken together, this study integrated four datasets to screen for reliable and accurate biomarkers of HCC and demonstrated that several pathways are altered. Several hub genes with the expression levels have significantly associated with clinical outcome in HCC patients. Further functional study on the mechanisms of those genes leading to HCC is under way.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: https://www.oncomine.org.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Cancer Hospital of the University of Chinese Academy of Sciences; Zhejiang Cancer Hospital. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
ST, WZ, and J-JQ: conceptualization and supervision. YL, JY, SM, and X-DC: investigations. YL, JY, KQ, CK, and RC: methodology. YL and JY: data curation. YL and RC: original draft writing. YL, RC, and J-JQ: writing review and editing. All authors contributed to the article and approved the submitted version.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene. 2020.00814/full#supplementary-material FIGURE S1 | The biological process analysis of hub genes using BINGO.