Identification of Prognostic Biomarkers and Correlation With Immune Infiltrates in Hepatocellular Carcinoma Based on a Competing Endogenous RNA Network

Background Hepatocellular carcinoma (HCC) is one of the most common malignant tumors worldwide. Recently, competing endogenous RNAs (ceRNA) have revealed a significant role in the progression of HCC. Herein, we aimed to construct a ceRNA network to identify potential biomarkers and illustrate its correlation with immune infiltration in HCC. Methods RNA sequencing data and clinical traits of HCC patients were downloaded from TCGA. The limma R package was used to identify differentially expressed (DE) RNAs. The predicted prognostic model was established using univariate and multivariate Cox regression. A K-M curve, TISIDB and GEPIA website were utilized for survival analysis. Functional annotation was determined using Enrichr and Reactome. Protein-to-protein network analysis was implemented using SRTNG and Cytoscape. Hub gene expression was validated by quantitative polymerase chain reaction, Oncomine and the Hunan Protein Atlas database. Immune infiltration was analyzed by TIMMER, and Drugbank was exploited to identify bioactive compounds. Results The predicted model that was established revealed significant efficacy with 3- and 5-years of the area under ROC at 0.804 and 0.744, respectively. Eleven DEmiRNAs were screened out by a K-M survival analysis. Then, we constructed a ceRNA network, including 56 DElncRNAs, 6 DEmiRNAs, and 28 DEmRNAs. The 28 DEmRNAs were enriched in cancer-related pathways, for example, the TNF signaling pathway. Moreover, six hub genes, CEP55, DEPDC1, KIF23, CLSPN, MYBL2, and RACGAP1, were all overexpressed in HCC tissues and independently correlated with survival rate. Furthermore, expression of hub genes was related to immune cell infiltration in HCC, including B cells, CD8+ T cells, CD4+ T cells, monocytes, macrophages, neutrophils, and dendritic cells. Conclusion The findings from this study demonstrate that CEP55, DEPDC1, KIF23, CLSPN, MYBL2, and RACGAP1 are closely associated with prognosis and immune infiltration, representing potential therapeutic targets or prognostic biomarkers in HCC.


INTRODUCTION
Hepatocellular carcinoma (HCC) is one of the most universally malignant tumors in the world, with increasing morbidity and mortality (Shi et al., 2016;Li et al., 2017). Currently, it is the fifth most common cancer and the fourth leading cause of cancer-related death worldwide (Long et al., 2019;Gu et al., 2020). Tumorigenesis of HCC is correlated with several liver primary diseases, such as viral infections, including Hepatitis B virus (HBV), Hepatitis C virus (HCV), and other kinds of hepatotropic viruses, as well as alcoholic liver diseases, dietary aflatoxin exposure diabetes, and other diseases (Fu et al., 2018;Li et al., 2019a). Despite continuous improvement in the methods of diagnosis and treatment, HCC remains a global clinical challenge due to its poor prognosis and low rate of 5-year survival (Yue et al., 2019;Zhang R. et al., 2020). Therefore, individual strategies based on identifying early potential prognostic biomarkers and novel therapeutic targets are urgently needed.
The correlation between protein-coding messenger RNA (mRNA) and non-coding RNA (ncRNA), including long non-coding RNAs (lncRNA) and microRNAs (miRNAs), is complicated and obscure. In 2011, the competing endogenous RNA (ceRNA) hypothesis was elucidated for the first time by Salmena et al. (2011) demonstrating that ncRNAs not only directly take part in the regulation of targeted gene expression but also absorb corresponding miRNAs as natural sponges due to their typically containing more than one miRNA response element (MRE) that competes with mRNA. Recently, increasing evidence indicates that the regulatory network comprised of lncRNA-miRNA-mRNA plays an important role in the physiology and development of various tumors, including HCC, gallbladder cancer, gastric cancer, and others (Long et al., 2019;Yue et al., 2019;Gu et al., 2020). Zhang et al. (2016) indicated that lncRNA-correlated ceRNA networks are involved in diverse biological cancer pathways in glioblastoma. Wang J.J. et al. (2019) identified six lncRNAs, including LINC00536 and MIR7-3HG, that have a significant effect on overall survival in breast cancer. Nevertheless, current studies based on ceRNA networks in multiple databases for HCC are insufficient.
Recently, increasing attention has been paid to immunotherapy research in various cancers, especially in advanced stages, including for mesothelioma, HCC, and others. However, the benefits from immunotherapies are diverse in various tumors and are difficult to evaluate due to a lack of trustworthy immune-related biomarkers (Pan et al., 2019;Chen et al., 2020). Immune-related cells infiltration into the tumor microenvironment (TME) is a key reason leading to immune responses at primary and secondary tumor sites, which is tightly regulated by various mediators, such as chemokines. Several studies have indicated that a variety of different immune cells, including CD4 + and CD8 + T -cells, dendritic cells, and tumor-associated macrophages (TAMs), have been identified in different cancers, such as prostate cancer, HCC and others (Pan et al., 2019;. Chen et al. (2015) also demonstrated that CD4 + and CD8 + T-cells could be recruited into the TME after CXCR4 inhibition in sorafenib-treated HCC in a mouse model. Therefore, there is an impending requirement to identify potential predictors related to immune cell infiltration to enhance the efficacy of individual immunotherapeutic treatment in tumors.
Study design is recapitulated in Figure 1. First, differentially expressed RNAs were analyzed in 371 cases of HCC and 50 normal liver tissues from The Cancer Genome Atlas (TCGA). Subsequently, a nomogram predicted model based on 23 miRNAs was established and revealed high performance. Next, we constructed a ceRNA network composed of 56 DELs, 6 DEMs, and 28 DEGs to illustrate preliminary interactions between mRNAs and ncRNAs. DEGs correlating to the ceRNA network were submitted for Gene Ontology (GO) and pathway enrichment analysis to clarify the underlying molecular mechanism in HCC. Finally, six hub genes, including CEP55, DEPDC1, CLSPN, KIF23, MYBL2, and RACGAP1, were identified by proteinto-protein (PPI) analysis and were all overexpressed in HCC samples with significant poor prognosis in HCC patients. Meanwhile they were also closely associated with immune infiltration in HCC. In summary, we believe these genes represent potential prognostic markers, immune-related biomarkers and immunological therapeutic targets for HCC treatment that deserved to be further explored in the future.

Identification of Differentially Expressed (DE) RNAs
The expression profile of RNA sequencing data retrieved from TCGA was analyzed using the limma package of R software 2 with the criterial of |log 2 fold change| > 2 and the adjusted false discovery rate (FDR) of P < 0.05. Screened DE RNAs, including differentially expressed lncRNAs (DELs), differentially expressed miRNAs (DEMs), and differentially expressed mRNAs (DEGs), were used for subsequent analysis. Heat maps and volcano plots for DE RNAs were created using the heatmap package of R software.

Univariate and Multivariate Cox Regression Analysis
Univariate Cox regression was used to screen for potential prognostic miRNAs correlated with overall survival in HCC patients. The least absolute shrinkage and selection operator (LASSO) detects the most influential variables because it analyzes all independent variables simultaneously. According to the principle of a penalty following a regularization path, the coefficients of less influential variables would trend toward zero. The glmnet package was used to perform LASSO algorithm with the criterial of P < 0.05. Multivariate Cox regression analysis via survival R package was utilized to establish a prognostic predictive model visualized by nomogram to show the correlation of the expression of DEMs and survival rate of specific HCC patients. The forest plot was created to display the results of multivariate Cox regression using the forestplot package.

Evaluation of miRNA-Based Clinical Predictive Model
To evaluate the predictive performance of the prognostic model based on DEMs, first, a calibration curve of 3-and 5-year survival rates was determined to assess agreement between the predictive model and actual survival time. Moreover, the area under the curve (AUC) was calculated according to the time-dependent receiver operating characteristic analysis (ROC). Additionally, the risk score formula was performed to calculate total risk scores for individual patients based on the coefficient for each DEM. The risk score formula was built according to the following method: total risk score = sum of each coefficient × transcriptional expressed value of DEM. Then, HCC patients were divided into high-and low-risk groups by the median risk scores, regarded as the cutoff value. The difference in survival rate between the two groups was also evaluated. The correlation between expression levels of DEMs and OS in HCC patients was calculated by Kaplan-Meier (K-M) survival analysis using the survival package of R software according to the X tile method with a cutoff P-value < 0.05.

Establishment of the ceRNA Regulatory Network
A co-expressed regulatory network comprised of DELs, DEMs, and DEGs was established to explore the potential functions of these DE RNAs in HCC. The interaction between DEMs and DELs was confirmed using the miRcode database 3 , which not only includes putative target sites of miRNAs from the integrated and searchable map but also contains conserved microRNA families annotated by the ENCyclopedia of DNA Elements (ENCODE) (Jeggari et al., 2012). DEM targets were predicted from three databases, including miRDB 4 , miRTarBase 5 , and TargetScan 6 (Agarwal et al., 2015;Wong and Wang, 2015;Chou et al., 2018). Overlapping DEGs were selected for constructing the ceRNA network. Cytoscape 7 software was used to visualize the expression correlation of DE RNAs.

Functional Annotation and PPI Network Analysis
Gene Ontology (GO) analysis of differentially expressed genes, including biological process (BP), molecular function (MF) and cell components (CC), and Kyoto encyclopedia of genes and genomes (KEGG) pathway analysis, were enriched using the Enrich online tool 8 . The Reactome pathway was determined by the Reactome website 9 . The online STRING 10 tool was used to construct the protein-to-protein (PPI) interaction network for DEGs involved in the ceRNA network and was visualized by Cytoscape. Hub genes were defined as the top six genes with the highest degree of connections to others via the CytoHubba plug-in of Cytoscape.

Correlation of Hub Genes and Immune Infiltration Analysis
TIMMER is a comprehensive online database used for systematic analysis of the correlation of immune infiltration and gene markers of interest in 32 cancer types from TCGA 11 . The abundance of tumor-infiltrating immune cells (TIICs) from gene profiles is evaluated based on the statistical method of deconvolution published previously (Aran et al., 2015;Li et al., 2016). We analyzed expressed levels of hub genes in various tumors and the relationship between the expression of hub genes and immune infiltration, including B cells, CD4 + T cells, CD8 + T cells, neutrophils, macrophages, and dendritic cells. Furthermore, the correlation of hub gene expression and gene markers associated with tumor immune infiltration of monocytes, tumor associated macrophages (TAMs), M1 and M2 macrophages was performed via correlation modules. These gene markers were identified in previous studies (Sousa and Maatta, 2016;Danaher et al., 2017;Siemers et al., 2017). The strength of the correlation was evaluated by Spearman's algorithm divided into five levels: very weak (0.00-0.19), weak (0.20-0.39), moderate (0.40-0.59), strong (0.60-0.79), and very strong (0.80-1.0). The cutoff criteria for statistical significance was P-value < 0.05. In addition, the SCNA module was used to identify differences in tumor infiltration of hub genes in HCC with different somatic copy number alterations.

Identification of Differentially Expressed RNAs
A total of 371 HCC samples and 50 non-tumor tissues were included in this study. Differentially expressed RNAs were analyzed using the limma R package with a screening cutoff threshold of |log 2 -fold change| > 2 and an adjusted P-value < 0.05. 1999 DEGs, 251 DEMs, and 1092 DELs were identified as differentially expressed RNAs. Among them, there were 1794 DEGs, 229 DEMs, and 1034 DELs upregulated. Moreover, differential expression of DEGs, DEMs and DELs is displayed by hierarchical clustering and volcano plots (Figure 2).

Establishment of miRNA-Based Prognostic Predictive Model in HCC
A total of 42 DEMs survival-related miRNAs were screened by univariate Cox regression from 251 DEMs identified in the HCC cohort (Supplementary Tables 1, 2). Next, these miRNAs were included in the LASSO analysis to calculate the corresponding coefficients. Twenty-three DEMs significantly correlated with survival were selected out (Supplementary Figure 1 and Supplementary Table 3). A simple-to-use nomogram predictive model was established to describe correlation of the expression of each miRNA and the 3-and 5-year overall survival rate of HCC patients based on multivariate Cox regression (Figure 3). Meanwhile, the 3-and 5-year calibration curves were drawn, showing good consistency between predicted survival probability and the actual survival rate (Figures 4A,B). The ROC curve also exhibited great reliability for the nomogram prediction model in discriminating tumors from normal tissues with the area under curve (AUC) of 3-and 5-year being 0.804 and 0.744, respectively ( Figure 4C). Furthermore, the risk score (RS) of each patient in the HCC cohort was calculated, and the patients were subsequently divided into high-risk and low-risk groups according to the mean RS. K-M analysis indicated that patients in the high-risk group exhibited decreased survival compared to the low-risk group, with a log-rank P-value < 0.05 ( Figure 4D).

Construction of the lncRNA-miRNA-mRNA Regulatory Network
K-M analysis performed for 23 DEMs significantly related with OS revealed that 11 DEMs, including 8 that were upregulated, were independently statistically significant (Supplementary Table 4). Results showed that HCC patients with highly expressed DEMs had a shorter survival time than those with lower expression with a P-value < 0.05 (Supplementary Figure 2). Recently, increasing evidence has indicated that miRNAs play a significant role in the development and metastasis of tumors. To reveal potential signaling pathways regulated by these DEMs in HCC, the DIANA-miRPath database was exploited and revealed enrichment of cancer-related signaling pathways, such as PI3K-AKT, NF-kappa B, VEGF, and others (Supplementary Figure 3).
The lncRNA targeted by the 11 DEMs was screened based on the interactions with 1092 DELs aforementioned. Six DEMs were targeted by 56 DELs according to the miRcode database. Next, targets of the 6 DEMs were predicted using miRTarBas, miRDB, and TargetScan databases. The overlapping 574 mRNAs predicted in all three databases were further intersected with 1999 DEGs identified in the HCC cohort, and only 28 DEGs existed in both groups (Supplementary Figure 4 and Supplementary Table 5). The representative interactions among 56 DELs, 6 DEMs, and 28 DEGs are summarized in Supplementary  Table 6, and the ceRNA regulatory network, including gene nodes and preliminary interactions, was visualized using Cytoscape software ( Figure 5A).

Function and Pathway Enrichment Analysis of DEGs Involved in the ceRNA Network
Gene ontology and pathway enrichment analyses were performed to elucidate the functions of 28 DEGs in the ceRNA network correlated with the progression of HCC. Functional annotation of biological process (BP), cellular component (CC), and molecular function (MF), as well as KEGG pathway enrichment, were performed on the Enrichr comprehensive database, in which the top 10 highly enriched items for BP, CC, MF, and KEGG pathway are shown based on a P-value < 0.05 (Figures 5B-E). Notably, all top 10 items were closely related to cancer-related pathways, such as TNF signaling, breast cancer, small cell lung cancer and others. Moreover, Reactome pathway analysis was also developed to identify possible metabolic pathways in which the 28 DEGs are involved. A total of 28 pathways were identified, and the top 15 highly enriched pathways are presented in Figure 5F.

Construction of PPI Network and Identification of Hub Genes
To further investigate the function of 28 DEGs associated with the ceRNA network at the protein level, we established a proteinto-protein interaction (PPI) network composed of 109 nodes and 218 degrees to visualize detailed interactions ( Figure 6A). Considering the significance of hub genes in the ceRNA network, the CytoHubba plugin in Cytoscape software was exploited to identify hub genes by evaluating the number of degrees and connections. Finally, six hub genes, CEP55, DEPDC1, MYBL2, RACGAP1, CLSPN, and KIF23, were identified, which were all upregulated in HCC cohort ( Figure 6B). The filled color of nodes from red to yellow indicates the degree of connectivity of hub genes with others gradually decreases. GO and pathway enrichment, including KEGG and Reactome analyses, were also performed (Supplementary Figure 5). Additionally, the sub ceRNA network, including 46 DELs, 3 DEMs (has-mir-30d, hasamir-195, has-mir-301a), and 6 hub genes, was built to delineate correlations among the DELs, DEMs, and hub genes ( Figure 6C and Supplementary Table 7). FIGURE 2 | The hierarchical clustering heatmaps and volcano plots for all screened differentially expressed mRNA, miRNA and lncRNA in HCC based on TCGA data. Heatmaps located in the left panels represent differential expressed (DE) mRNAs (A), miRNAs (C), and lncRNAs (E). Volcano plots located in the right panels indicate DEmRNAs (B), DEmiRNAs (D), and DElncRNAs (F) with the cutoff criteria of fold change ≥ 2 and P-value < 0.05. Red color: upregulated, green: downregulated, gray: not statistic expressed. , and the whole length on behalf of the 95% CI for each DEmiRNA. (B) Nomogram based on differentially expressed miRNAs to predict survival in HCC asymptomatic individuals. The prognostic model aim to estimate the survival rate for individual patient, meanwhile reveal the upregulated or downregulated type for each miRNA. At first, draw a line straight upwards from each miRNA to obtain the points from the points axis. Repeat this step until the total scores were gained for 23 miRNAs. Then, after calculating the overall points according to the total points axis, draw a line straight down to the 3-year and 5-year survival axis from the location of total point axis based on the obtained overall scores to indicate the rate for the specific patient (for e.g., the 3-year survival rate is 60% if a patient get the total points of 400).

Validation of Survival Analysis and Expression of Hub Genes
The correlation between expression of hub genes and OS (Supplementary Figures 6A-F) and recurrence-free survival (RFS) (Supplementary Figures 6G-L) was assessed using the GEPIA website. HCC patients with high expression levels of hub genes had a lower survival rate with respect to both OS and RFS. Among these, CEP55 had the highest prognostic P-value (0.00033 and 0.00063) correlated to overall and recurrencefree survival rates, respectively. Meanwhile, TISIDB tool that integrated five comprehensive public databases of HGNC, NCBI, Ensembl, Uniprot, and GeneCards was also exploited to evaluate the correlation to further confirm (Ru et al., 2019). It showed us consistent result that six hub genes were independently FIGURE 4 | The assessment of miRNAs-based clinical prediction model. The calibration curves according to nomogram model to estimate the survival rate at 3-year (A) and 5-year (B), the X and Y -axis represent predicted and actual survival time respectively. The efficacy of prognostic model of 3-and 5-years survival rate based on time dependent receiver operated characteristic curves (C). The Kaplan-Meier curve of overall survival time between the high-and low-risk groups stratified by the mean of total risk scores (D). associated with poor prognosis in HCC patients with statistical significance (Supplementary Figures 6M-R). mRNA levels of hub genes in various tumor and normal tissues were examined using the TIMMER database. Results indicated that expression of hub genes was higher in various tumors than in corresponding normal tissues including breast cancer, colorectal cancer, gastric cancer and others (Supplementary Figure 7). Next, we validated transcriptional expression of hub genes in another HCC cohort from Oncomine database, which demonstrated overexpression of genes in the tumor group compared to non-tumor tissues as well with cutoff P-value < 0.01 (Figures 7A-F). Moreover, the mRNA expression of hub genes was detected in MHCC-LM3 and Huh7 HCC cell lines, and L02 normal liver cell line by quantitative polymerase chain reaction (qPCR), which showed the same conclusion with Oncomine database (Figures 7G-L). Additionally, immunohistochemical data from the Human Protein Atlas was used to verify protein expression of hub genes.
Data for CLSPN was lacking in the database, and expression of DEPDC1 in both HCC and normal samples was not detected. However, the staining intensity or the range of positive areas of CEP55, KIF23, MYBL2, and RACGAP1 was higher in tumor samples (medium or high levels) than in non-tumor tissue (Figures 7M,N).

Correlation of Hub Gene Expression and Immune Infiltration
Recently, increasing evidence has demonstrated that tumorinfiltrating lymphocytes play a significant role in predicting lymph node status and survival in tumors. Therefore, we investigated whether expression of hub genes was related to immune-infiltrating levels in HCC using the TIMMER database. Results revealed that six hub genes were significantly related to immune infiltration of B cells, CD4 + T cells, CD8 + T cells, macrophages, neutrophils, and dendritic cells in HCC, with statistical P-values of <0.0001. However, expression of CLSPN (cor = 0.08, P = 1.39e-01), CEP55 (cor = 0.018, P = 7.33e-01), and MYBL2 (cor = 0.098, P = 6.97e-02) had no significant correlation with tumor purity (Supplementary Figure 8). Next, to further investigate the correlation between expression of six hub genes FIGURE 6 | The construction of hub gene associated with ceRNA network based on analysis of protein to protein interaction (PPI) network. The PPI network of DEGs (A). PPI network of 6 hub genes, the color of nodes from red to yellow indicates that the connected degrees between each molecule with others decrease gradually (B). the hub gene ceRNA regulatory network including 3 DEMs (rectangle) and 6 hub DEGs (ellipse) as well as 46 lncRNA (diamond) (C). and diverse immune infiltrating cells, the relationship between hub genes and a series of immune-related markers in various immune cells of HCC in TIMER database was focused. The analyzed immune markers included CD8 + T cell, T cell (general), B cell, monocyte, tumor-associated macrophage (TAM), M1 Macrophage, M2 Macrophage, Neutrophils, Dendritic cell, Treg ( Table 1). The results revealed that expressed levels of six hub genes were significantly related to most immune markers of various immune cell types. Importantly, we found that the expressed level of most markers of monocytes, TAM, M1and M2 macrophages showed strong correlation with hub genes of CEP55, CLSPN, DEPDC1, KIF23, and RACGAP1 except for MYBL2 which had no distinct relationship with gene markers of PTGS2, CD163, VSIG4, and MS4A4A (Table 1). Specially, we displayed that CD86, CSF1R of monocyte, CD68, IL10 of TAM, IRF5, PTSG2 of M1 macrophage, CD163, VSIG4, MS4A4A of M2 macrophage are significantly associated with hub genes in HCC (Figure 8). Furthermore, we also analyzed the correlation between hub genes and immune markers mentioned above of monocyte, TAM, M1 and M2 macrophage in HCC from GEPIA database. The correlation results are similar to those in TMER ( Table 2). These findings suggest that hub genes are correlated with immune infiltration and may regulate macrophage polarization in HCC. Additionally, whether immune infiltrating levels of each immune subset are related to differential copy number of hub genes was also evaluated. However, no significant relationship was observed between most immune cells and hub genes (Supplementary Figure 9).

Identification of Bioactive Compounds Targeting Hub Genes
Finally, we predicted potential bioactive compounds targeting hub genes using the Drugbank database, which is a comprehensive, freely accessible, online database including both drugs and drug target information. A total of 15 compounds targeting hub genes were identified, including CEP55 (Irdabisant, CEP-9722, CEP-1347, CEP-37440, Cefapirin), CLSPN (Calfactant, Calusterone), and MYBL2 (Clotrimazole, Propafenone, Letrozole, Sildenafil, Ranitidine, Valproic acid, Esomeprazole, and Pregabalin) ( Table 3). Except for Calfactant, the 3D chemical structure of the other compounds is presented in Supplementary Figure 10. These results could provide new insight into potential novel therapeutic targets for HCC in the future.

DISCUSSION
Hepatocellular carcinoma is one of the most common malignant tumors in the world. In recent years, the prevalence of HCC is gradually increasing, especially in non-traditional high incidence areas, such as the United States and Europe. Most HCC patients are likely to be diagnosed at an advanced stage since HCC is asymptomatic at early stages, and effective biomarkers for early diagnosis and prognostic prediction are lacking (Lei et al., 2019;Li et al., 2019b;Liu et al., 2019). Currently, the main treatments for HCC include radiofrequency ablation, surgical resection, immunotherapy, and liver implantation, among others. However, the clinical efficacy of treatments for specific patients is not satisfactory due to poor therapeutic targets, tumor immune escape and complications, leading to a low 5-year survival rate (Yue et al., 2019;Zhang et al., 2019;. In past decades, studies focused on immunotherapy for various cancers have obtained meaningful breakthroughs, especially in melanoma, non-small cell lung cancer, and others. Different types of immune cells infiltrate into the tumor microenvironment, which is a crucial reason for effective immune responses (Pan et al., 2019;Chen et al., 2020). It has been reported that immune cells, including natural killer cells, CD4 + and CD8 + Tcells, TAMs, and dendritic cells, and others, are detected in cancer tissues, including HCC. Furthermore, these immune infiltrating cells are regulated by various mediators, such as chemokines. However, the mechanism of immune infiltration in the development of tumors is not completely understood (Sousa and Maatta, 2016;Siemers et al., 2017). Therefore, it is urgent and significant to elucidate the molecular mechanism and to identify immune-related signatures in HCC, which will aid in identifying new therapeutic targets and prognostic markers to increase the clinical efficacy and 5-year survival rate of HCC patients.
Currently, the ceRNA hypothesis of crosstalk between ncRNAs and mRNAs has received much attention and is considered a new measure of gene regulation at the posttranscriptional level, which provides new insight into revealing mechanisms of tumorigenesis and identifying potential diagnostic and prognostic biomarkers. A growing number of published studies have demonstrated that many predictive signatures are detected in various tumors based on ceRNA network analysis Wang M. et al., 2019;Xu et al., 2019;Wu et al., 2020;Xiao, 2020). MiRNAs, included in ncRNAs, are identified to be evolutionarily conserved, with an average length of 22-nt, and may bind to the 3 untranslated region (3 UTR) of the targeted mRNAs according to the principle of complementary base pairing. An increasing body of evidence has demonstrated that dysregulated miRNAs play a crucial role in the initiation, progression, and therapy of various tumors (Long et al., 2019;Lou et al., 2019;Zhang et al., 2019). Baolei et al. revealed that miRNA-124 is a negative regulator of HCC with respect to proliferation and invasion by downregulating lncRNA-UCA1 . Baltruskeviciene et al. (2017) found that downregulated expression of miRNA-148a and miRNA-625-3p is related to tumor budding in colorectal cancer, and EMT was considered a possible molecular mechanism.
In the present study, a prognostic predictive model was established based on 23 DEMs and exhibited great performance with the area under ROC, which was 0.804 for 3-year and 0.744 for 5-year survival. Moreover, we constructed a ceRNA network, including 56 DELs, 6 DEMs (hsa-mir-9-1,hsa-mir-9-2, hsa-mir-30d, hsa-mir-139, hsa-mir-195, hsa-mir-301a), and 28 DEGs, which identified several potential prognostic  signatures for HCC. The finally screened 6 DEMs by ceRNA network were independently predicted factors related to poor prognosis in HCC patients. The major pathways that 6 DEMs participated were enriched in cancer-related regulatory signals. These findings here maybe give us a new avenue to early detect HCC patients, and evaluate clinical prognosis before or after receiving treatment via detecting the expressed level of specific DEMs mentioned above. U Lehmann et al. reported that hypermethylation of hsa-mir-9-1 is related to the development of breast cancer. Patients with pre-invasive intraductal lesions were detected by hypermethylated hsa-mir-9-1 (Lehmann et al., 2008). Notably, hsa-mir-195 belongs to the miR-195 family and is located on chromosome 17p13.1, and is correlated with proliferation and angiogenesis in prostate tumors by downregulating expression of the PRR11 gene (Cai et al., 2018). In Sannigrahi et al. (2017) research, they indicated that hsa-mir-139 was downregulated by HPV-16, leading to activation of HPV-16 oncogenic pathways and carcinogenesis of HPV-16 induced cervical and head and neck cancers. Hsa-mir-301a was found to play an oncogenic role in the occurrence and development of laryngeal squamous cell carcinoma (LSCC) by directly targeting the tumor suppressor gene Smad44 and downregulating its expression (Lu et al., 2015). Through analysis of SNP-array data generated from 8 medulloblastoma cell lines, Lu et al. (2009) found that hsa-mir-30d is overexpressed, however, the potentially involved biological processes and molecular mechanisms are still not clarified. Therefore, according to current studies, it has revealed that the 6 DEMs exhibited various biological functions in several cancer types, but the research progress is still limited especially in HCC. Further studies focused on both functions and molecular mechanisms in HCC were necessary. The discoveries in this study would be helpful to provide some clues for proceeding with the process. In this study, six hub genes (including CEP55, DEPDC1, CLSPN, KIF23, MYBL2, and RACGAP1) were identified after comprehensive bioinformatic analysis in HCC cohort from TCGA, and were all overexpressed in HCC patients, which was validated in mRNA and protein expressed level based on various HCC cohorts from Oncomine and experimental data in HCC cell lines and normal liver cell line by qPCR method. Meanwhile, these hub genes were independently predicted factors associated with poor prognosis in HCC patients in various databases of GEPIA and TISIDB. The consistent results from databases and experimental approaches, to large extent, enhance the reliability of these findings. It suggested that the six hub genes would be potential biomarkers for evaluating clinical prognosis in HCC patients, and also provide some valuable clues for further study involved in the progression of HCC. Except for RACGAP1 that was the first time found to correlate to the progression in HCC, the other hub genes have also been reported in several tumors associated with biological process or regulatory pathways, although nowadays existing data is limited and more in-depth investigations are required. It was reported that CEP55 could promote proliferation and invasion in the progress of osteosarcoma. In another study, data showed that CEPP55 facilitates the EMT process in renal cell (RCC) cancer and participates in the AKT pathway (Xu et al., 2018;Chen et al., 2019). DEPDC1 has been identified in various cancers, including HCC, breast cancer, and prostate cancer, among others, and is positively involved with multiple tumorous biological processes, including proliferation, invasion etc. Guo et al., 2019;. CLSPN was demonstrated to be overexpressed in RCC as assessed by immunohistochemistry in 95 RCC cases. It was also found that CLSPN activates the AKT signaling pathway and is co-expressed with several known tumor-related genes, such as programmed death ligand-1 (Kobayashi et al., 2020). Previous studies have reported that KIF23 is a kinesin-like motor protein and has two splice variants, KIF V1 and KIFV2, which are overexpressed in HCC samples but were not detected in non-tumor tissues. HCC patients with positive identified KIF V1 had a better overall 5-year survival rate than those with no KIF V1. However, there was no significant correlation between expression of KIF V2 and overall survival in patients (Sun et al., 2015;Wei et al., 2019). These discoveries by Xiaotong et al. conflicted with results in this study, indicating that additional investigations should be performed to further elucidate the role of KIF23 in the tumorigenesis of HCC (Sun et al., 2015). MYBL2 (MYB proto-oncogene like 2) is included in the family of MYB transcription factors and was overexpressed in breast cancer. Jianlin et al. revealed that overexpressed MYBL2 in breast cancer promotes growth and metastasis (Chen and Chen, 2018).
Additionally, in our study, we also found that all six hub genes were positively correlated with immune cell infiltration of B cells, CD4 + T cells, CD8 + T cells, macrophages, neutrophils, and dendritic cells in HCC. Furthermore, our analysis revealed that the expressed level of hub genes was correlated with Annotation: Tumor, correlation analysis in HCC tissue of TCGA. Normal, correlation analysis in normal liver tissue of TCGA. R, R-value of Pearson's correlation. *P < 0.05; **P < 0.01; ***P < 0.001; and ****P < 0.0001. most immune-related markers of various immune cell types. Interestingly, hub genes of CEP55, CLSPN, DEPDC1, KIF23, and RACGAP1 were significantly related to markers of macrophages, showing its correlation to macrophage polarization. Until now, the correlation of six hub genes and immune infiltration in tumors has not been reported yet. Through an integrated bioinformatics analysis based on a ceRNA network, we identified several new biomarkers related to immune infiltration in HCC, which may represent potential prognostic and therapeutic targets. However, more in-depth experimental studies are required to further verify the function and elaborate on the underlying mechanisms in HCC. Finally, based on analysis of the Drugbank database, we found a total of 15 bioactive compounds targeting CEP55, CLSPN, and MYBL2, providing some new directions for drug development for HCC treatments in the future. Meanwhile, it suggested these genes could be potential values for predicting the efficacy of clinical treatment.

CONCLUSION
In all, a simple-to-use nomogram predictive model was established based on miRNAs revealed great performance. We also constructed a ceRNA regulatory network to better understand the interactions between mRNAs and ncRNAs in HCC. Moreover, six hub genes were identified through PPI network analysis, all of which are overexpressed in HCC and are associated with survival. Besides, the expression of hub genes was independently correlated to poor prognosis in HCC patients, and was closely associated with immune infiltration in HCC. We believe these genes may be involved in the development of HCC and may represent potential prognostic biomarkers and individual therapeutic targets. However, further in-depth experiments are required to clarify detailed functions and mechanisms.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material, further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
FP and YiZ designed the experiments and revised the manuscript. ZP analyzed the data and wrote the manuscript. XW, YunZ, and YuaZ searched and helped to analyze the data. All authors read and consent the final manuscript.