A Key mRNA-miRNA-lncRNA Competing Endogenous RNA Triple Sub-network Linked to Diagnosis and Prognosis of Hepatocellular Carcinoma

Growing evidence has illustrated critical roles of competing endogenous RNA (ceRNA) regulatory network in human cancers including hepatocellular carcinoma. In this study, we aimed to find promising diagnostic and prognostic biomarkers for patients with hepatocellular carcinoma. Three novel unfavorable prognosis-associated genes (CELSR3, GPSM2, and CHEK1) was first identified. We also demonstrated that these genes were significantly upregulated in hepatocellular carcinoma cell lines and tissues. Next, 154 potential miRNAs of CELSR3, GPSM2, and CHEK1 were predicted. CHEK1-hsa-mir-195-5p/hsa-mir-497-5p and GPSM2-hsa-mir-122-5p axes were defined as two key pathways in carcinogenesis of hepatocellular carcinoma by combination of in silico analysis and experimental validation. Subsequently, lncRNAs binding to hsa-mir-195-5p, hsa-mir-497-5p, and hsa-mir-122-5p were predicted via starBase and miRNet databases. After performing expression analysis and survival analysis for these predicted lncRNAs, we showed that nine lncRNAs (SNHG1, SNHG12, LINC00511, HCG18, FGD5-AS1, CERS6-AS1, NUTM2A-AS1, SNHG16, and ASB16-AS1) were markedly increased in hepatocellular carcinoma and their upregulation indicated poor prognosis. Moreover, a similar mRNA-miRNA-lncRNA analysis for six “known” genes (CLEC3B, DNASE1L3, PTTG1, KIF2C, XPO5, and UBE2S) was performed. Subsequently, a comprehensive mRNA-miRNA-lncRNA triple ceRNA network linked to prognosis of patients with hepatocellular carcinoma was established. Moreover, all RNAs in this network exhibited significantly diagnostic values for patients with hepatocellular carcinoma. In summary, the current study constructed a mRNA-miRNA-lncRNA ceRNA network associated with diagnosis and prognosis of hepatocellular carcinoma.


INTRODUCTION
Hepatocellular carcinoma is one of the most common types of cancer, with nearly 600,000 newly-diagnosed cases of patients with hepatocellular carcinoma (1)(2)(3).
Although great advances in the therapeutic approaches have been achieved in the past decades, the prognosis of patients with hepatocellular carcinoma remains dismal, with five-year survival rates less than 20% (4). Most of patients diagnosed at advanced stage of hepatocellular carcinoma and high incidence of recurrence and metastasis after curative therapies may account for this poor prognosis (5). Therefore, exploring detailed mechanisms of pathogenesis of hepatocellular carcinoma and identifying promising diagnostic and prognostic biomarkers of hepatocellular carcinoma may be helpful for providing effective therapeutic targets and improving patients' outcome. A hypothesis, namely competing endogenous RNA (ceRNA), that non-coding RNA (ncRNA), including long noncoding RNA (lncRNA), can cross-talk with messenger RNA (mRNA) by competitively binding to shared miRNAs was proposed by Salmena et al. (6). In recent years, a variety of studies regarding roles of competing endogenous RNA (ceRNA) regulatory network in human cancers have been launched and lots of attractive findings have been obtained. For example, Li et al. (7) identified prognostic signatures associated with longterm overall survival of thyroid cancer patients based on a competing endogenous RNA network; Wang et al. (8) found some prognostic markers for glioblastoma by ceRNA network analysis; Wang et al. (9) constructed a mRNA-miRNA-lncRNA competing endogenous RNA triple sub-network associated with prognosis of pancreatic cancer. However, current knowledge about ceRNA regulatory network in hepatocellular carcinoma remains extremely limited and need to be further probed. In this study, by employing "mRNA-miRNA-lncRNA" order pattern instead of "lncRNA-miRNA-mRNA" order pattern, we constructed a comprehensive ceRNA sub-network, in which all RNAs possess significant diagnostic and prognostic values for patients with hepatocellular carcinoma.

GEPIA Database Analysis
GEPIA (Gene Expression Profiling Interactive Analysis, http:// gepia.cancer-pku.cn/detail.php) is a newly developed interactive web server for analyzing the RNA sequencing expression data from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) projects (10). GEPIA was employed to obtain the genes most associated with overall survival and disease-free survival of patients with hepatocellular carcinoma. Logrank P < 0.05 was considered as statistically significant. GEPIA was also used to perform expression analysis and survival analysis of lncRNA. The expression correlation between potential mRNA-lncRNA pairs was also determined by GEPIA. For expression analysis, P < 0.05 was considered as statistically significant. For correlation analysis, R > 0.1 and P < 0.05 were set as the criteria for identifying significant mRNA-lncRNA interactive pairs. UALCAN Database Analysis mRNA expression slevels of three candidate genes (CELSR3, GPSM2, and CHEK1) in hepatocellular carcinoma were further detected using UALCAN database (http://ualcan.path.uab.edu/ index.html), which is a comprehensive, user-friendly, and interactive web resource for analyzing cancer data (11). P < 0.05 was considered as statistically significant.

Oncomine Database Analysis
Oncomine (https://www.oncomine.org/) is a cancer microarray database and an integrated data-mining platform (12). In this study, Oncomine was utilized to analyze mRNA expression of CELSR2, GPSM2, and CHEK1 in hepatocellular carcinoma by conducting a meta-analysis of datasets as we previously described (13). P < 0.05 and |fold change| > 1.5 were set as the thresholds for selecting included datasets.

HumanProteinAtlas Database Analysis
HumanProteinAtlas (https://www.proteinatlas.org/) is a Swedish-based database, which was initiated in 2003 with the aim to map all the human proteins in cells, tissues and organs (14)(15)(16). CELSR2, GPSM2, and CHEK1 protein expression levels in hepatocellular carcinoma were assessed using HumanProteinAtlas database.
miRNet Database Analysis miRNet (http://www.mirnet.ca/), an easy-to-use online tool for miRNA-associated studies, was used to predict potential miRNAs binding to mRNAs (17,18). Besides, it was also employed to predict potential lncRNAs binding to miRNAs. mRNA-miRNA and miRNA-lncRNA regulatory networks were subsequently established by Cytoscape software.
StarBase Database Analysis starBase (http://starbase.sysu.edu.cn/) is an open-source database for investigating non-coding RNA interactions from CLIPseq, degradome-seq and RNA-RNA interactome data (19,20). starBase was introduced to perform expression correlation analysis for mRNA-miRNA and miRNA-lncRNA pairs in hepatocellular carcinoma. R < −0.1 and P < 0.05 were set as the criteria for identifying significant interactions. miRNA expression values in hepatocellular carcinoma were also determined using starBase. P < 0.05 was considered as statistically significant. Besides, starBase was employed to predict potential lncRNAs binding to miRNAs.

Kaplan-Meier Plotter Analysis
Kaplan-Meier plotter database is capable to assess the effect of miRNAs and genes on survival in 21 cancer types, including hepatocellular carcinoma (21). The prognostic values of potential miRNAs in hepatocellular carcinoma was evaluated using Kaplan-Meier plotter (http://kmplot.com/analysis/) as we previously described (22,23). In brief, each miRNA of interest was first entered into this database. According to median expression value, all cases were classified into a low expression group and a high expression group. Subsequently, Kaplan-Meier survival plots were generated, and statistical indices containing hazard ratio (HR), 95% confidence interval (CI),

Statistical Analysis
Most of the statistical analyses were done by the bioinformatic online tools as mentioned above. P-values from GEPIA expression analysis, logrank P-values from GEPIA and Kaplan-Meier plotter survival analysis were corrected by false discovery rate and other reported P-values by online tools were not adjusted for false discovery rate correction. The statistical analyses of experimental data were conducted by GraphPad Prism software (version 7.0.3). Experiments were performed in triplicates and shown as mean ± standard deviation (SD) from at least three independent times. Student's t-test (two tailed) were employed to do comparisons between two groups. ROC curve was utilized to assess diagnostic effect. Logrank P < 0.05 or P < 0.05 was considered as statistically significant.

CELSR3, GPSM2, and CHEK1 Were Identified as Three Novel Prognosis-Associated Genes in Hepatocellular Carcinoma
To obtain the genes most associated with patient survival in hepatocellular carcinoma, GEPIA database was first utilized. In this study, two indices regarding to patients' outcome, overall survival (OS) and disease-free survival (RFS), were included. The top 100 OS-associated genes and the top 100 RFS-associated genes were identified as listed in Tables 1, 2, respectively. By intersecting OS-associated genes and RFSassociated genes, 9 genes (CLEC3B, DNASE1L3, PTTG1, CELSR3, GPSM2, KIF2C, XPO5, UBE2S, and CHEK1) were defined as candidate genes, which were commonly appeared in OS-associated gene set and RFS-associated gene set ( Figure 1A). After reviewing the published literatures and previous studies, we found that 6 of 9 genes [CLEC3B (26), DNASE1L3 (27), PTTG1 (28,29), KIF2C (30), XPO5 (31,32), and UBE2S (33, 34)] have been demonstrated to act as promising prognostic biomarkers for hepatocellular carcinoma. The rest three genes (CELSR3, GPSM2, and CHEK1) have not been studied for their prognostic values in hepatocellular carcinoma so far. Therefore, CELSR3, GPSM2 and CHEK1 were considered as three novel potential prognostic biomarkers for hepatocellular carcinoma.
The prognostic values (OS and RFS) of CELSR3, GPSM2, and CHEK1 were presented in Figures 1B-G. The results suggested that high expression of CELSR3, GPSM2, or CHEK1 indicated poor prognosis in patients with hepatocellular carcinoma.

CELSR3, GPSM2, and CHEK1 Were Upregulated in Hepatocellular Carcinoma
Next, we intended to determine expression levels of three novel prognosis-associated genes in hepatocellular carcinoma by bioinformatic analysis and experimental validation. Firstly, we detected their expression in TCGA hepatocellular carcinoma tissues and normal tissues using UALCAN database. CELSR3, GPSM2, and CHEK1 were significantly upregulated in tumor samples compared with normal samples as shown in Figures 2A-C, respectively. Next, Oncomine database was further introduced to analyze CELSR3, GPSM2, and CHEK1 expression in hepatocellular carcinoma. The datasets met the thresholds of P < 0.05 and |fold change| > 1.5 were included for conducting meta-analysis. As presented in Figures 2D-F, CELSR3, GPSM2, and CHEK1 expression were markedly higher in hepatocellular carcinoma than that in normal controls.  Protein expression levels of CELSR3, GPSM2, and CHEK1 were evaluated using HumanProteinAtlas database. Figures S1A,B suggested that both CELSR3 and GPSM2 protein expression levels were increased in tumor tissue compared to normal tissue. However, CHEK1 protein value was not included in HumanProteinAtlas database. Subsequently, using qRT-PCR, we found that expression levels of CELSR3 (Figure 2G), GPSM2 (Figure 2H), and CHEK1 ( Figure 2I) were obviously increased in two hepatocellular carcinoma cell lines (hepG2 and LM3) when compared with normal hepatic cell line (HL7702). Moreover, compared to normal hepatic tissues, CELSR3, GPSM2, and CHEK1 expression levels were significantly upregulated in collected hepatocellular carcinoma clinical tissues as shown in Figures 2J-L, respectively. All these findings demonstrate that CELSR3, GPSM2, and CHEK1 were upregulated in hepatocellular carcinoma and linked to prognosis of patients with hepatocellular carcinoma.
Prediction and Validation of Potential miRNAs Binding to CELSR3, GPSM2, and CHEK1 Next, we predicted upstream regulatory miRNAs of CELSR3, GPSM2, and CHEK1 through a comprehensive miRNA studyassociated database, miRNet. A total of 156 mRNA-miRNA pairs, including 71 CELSR3-miRNA pairs, 52 GPSM2-miRNA pairs and 33 CHEK1-miRNA pairs, were acquired. For better visualization, mRNA-miRNA interactive network was constructed using Cytoscape software as presented in Figure 3. According to the classic action mechanism of miRNA in negative regulation of gene expression, there should be inverse expression relationship between the predicted mRNA-miRNA interactions. Thus, we employed starBase database to perform expression correlation analysis for these mRNA-miRNA interactions in hepatocellular carcinoma. The analytic results were shown in Table S2. Those mRNA-miRNA pairs FIGURE 4 | Significantly correlated mRNA-miRNA pairs determined by starBase database. (A) Expression of hsa-mir-30a-5p was negatively associated with CELSR3 expression; (B) expression of hsa-mir-4646-3p was negatively associated with CELSR3 expression; (C) expression of hsa-mir-195-5p was negatively associated with CHEK1 expression; (D) expression of hsa-mir-193b-3p was negatively associated with CHEK1 expression; (E) expression of hsa-mir-497-5p was negatively associated with CHEK1 expression; (F) expression of hsa-mir-139-3p was negatively associated with CHEK1 expression; (G) expression of hsa-mir-122-5p was negatively associated with GPSM2 expression; (H) expression of hsa-mir-378a-5p was negatively associated with GPSM2 expression.

DISCUSSION
Hepatocellular carcinoma is notorious for its poor prognosis and high aggressiveness. Elucidation of molecular mechanisms  of hepatocellular carcinoma pathogenesis and identification of promising biomarkers for diagnosis and prognosis of hepatocellular carcinoma are critical for updating therapeutic approaches and improving patients' outcome. ceRNA regulatory network has been reported to participate in initiation and progression of human cancers. To our knowledge, a comprehensive ceRNA regulatory network based on the order model of mRNA-miRNA-lncRNA in hepatocellular carcinoma has not been constructed so far. Therefore, we tried to establish a prognosis/diagnosis-associated mRNA-miRNA-lncRNA ceRNA triple sub-network in hepatocellular carcinoma. By performing survival (OS and RFS) analysis using TCGA hepatocellular carcinoma data, nine genes including three novel genes (CELSR3, GPSM2 and CHEK1) and six "known" genes [CLEC3B (26), DNASE1L3 (27), PTTG1 (28,29), KIF2C (30), XPO5 (31, 32) and UBE2S (33,34)] were identified as prognosisassociated genes in hepatocellular carcinoma. The prognostic values of the three novel genes in hepatocellular carcinoma have not been investigated so far. However, they have been well-documented to function as key oncogenes and biomarkers in various types of cancer. For example, Pan et al. (37) reported that CELSR3 may be a promising prognostic gene in head and neck squamous cell carcinoma; He et al. (38) demonstrated that GPSM2 facilitated tumor growth and metastasis; CHEK1 downregulation suppressed by hsa-mir-195-5p hindered growth and metastasis of non-small cell lung cancer (39). In silico analysis and experimental validation together suggested that the three genes were significantly upregulated in hepatocellular carcinoma. All these findings indicate that high expression of CELSR3, GPSM2, or CHEK1 links to unfavorable prognosis of patients with hepatocellular carcinoma.
By integrating these mRNA-miRNA and miRNA-lncRNA interactions, a potential mRNA-miRNA-lncRNA ceRNA triple sub-network associated with prognosis of hepatocellular carcinoma was constructed. Moreover, ROC curve analysis for these RNAs in the established network revealed that each of them possessed high diagnostic value for hepatocellular carcinoma, further suggesting the key roles of this mRNA-miRNA-lncRNA sub-network in hepatocellular carcinoma. Of course, more corresponding studies need to be launched in the future.

CONCLUSION
In conclusion, using in silico analysis and experimental validation, we established a comprehensive mRNA-miRNA-lncRNA triple ceRNA network in hepatocellular carcinoma. Each RNA in this network possesses significant expression difference between hepatocellular carcinoma and normal control and has promising diagnostic and prognostic values for patients with hepatocellular carcinoma. However, these findings need to be further confirmed by more basic experiments and larger-scale clinical trials in the future.

DATA AVAILABILITY STATEMENT
The datasets analyzed in this study can be found in the GEPIA, UALCAN, and Oncomine databases.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Ethical Committee of First Affiliated Hospital of Zhejiang University School of Medicine. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
JZ designed the study, wrote the manuscript and performed in silico analysis of the data. WL conducted experiments and revised the manuscript. JZ and WL have read and approved the final manuscript.