Prognostic lncRNAs, miRNAs, and mRNAs Form a Competing Endogenous RNA Network in Colon Cancer

Purpose: To develop a multi-RNA-based model to provide survival risk prediction for colon cancer by constructing a competing endogenous RNAs (ceRNAs) network. Methods: The prognostic information and expression of the lncRNAs, miRNAs, and mRNAs in colon cancer specimens from The Cancer Genome Atlas (TCGA) were assessed. Constructing prognostic models used the differentially expressed RNAs. Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses and Gene Ontology were used to identify the functional role of the ceRNA network in the prognosis of colon cancer. Results: Five lncRNAs (AC007384.1, AC002511.1, AC012640.1, C17orf82, and AP001619.1), 8 miRNAs (hsa-mir-141, hsa-mir-150, hsa-mir-375, hsa-mir-96, hsa-mir-107, hsa-mir-106a, hsa-mir-200a, and hsa-mir-1271), and 5 mRNAs (BDNF, KLF4, SESN2, SMOC1, and TRIB3) were highly correlated with tumor status and tumor stage. Three prognostic models based on the 5 lncRNAs, 8 miRNAs, and 5 mRNAs were constructed. The prognostic ability was 0.850 for the lncRNA-based model, 0.811 for the miRNA-based model, and 0.770 for the mRNA-based model. Patients with high-risk scores revealed worse overall survival. The KEGG pathways were significantly enriched in the “neuroactive ligand-receptor interaction.” Conclusion: This study identified several potential prognostic biomarkers to construct a multi-RNA-based prognostic model for colon cancer.


INTRODUCTION
Colon cancer is one of the most common malignancies worldwide. The morbidity and mortality of colon cancer are increasing rapidly (1). Radical resection combined with chemotherapy is used to improve survival (2). However, treatment outcomes remain unsatisfactory. The 5-years overall survival ranges from 50 to 65% (3)(4)(5)(6). Moreover, treatment outcomes differ among patients in the same stage. Therefore, the identification of prognostic factors will lead to better interventions. Until now, the prognosis of colon cancer has mainly depended on the TNM stage. However, the TNM stage is based on anatomical information; it does not reflect the biological heterogeneity of colon cancer. Thus, understanding the molecular mechanism of the initiation and progression of colon cancer may provide effective prognostic biomarkers for patients with a poor prognosis.
Altered lncRNA expression is involved in the onset and development of colon cancer (7,8). The mechanism of competing endogenous RNAs (ceRNAs) was proposed as a specific regulatory pathway of lncRNAs to explain how they exert their influence on protein levels (9,10). Many studies reported that the ceRNA network might be a marker for prognosis in colorectal cancer (11)(12)(13)(14)(15). However, few studies have assessed the ceRNA network in colon cancer (16,17). Li et al. (16) constructed a colon cancer associated ceRNA network which included 9 lncRNAs, 13 miRNAs, and 70 mRNAs. However, the study did not construct a prognostic model for colon cancer. The study assessed only the relationship between a single RNA and overall survival. As a multistep disease, colon carcinogenesis represents the accumulation of various genetic alternations and their complicated interactions. Thus, to assess the effect of a ceRNA network in colon cancer is important.
In this study, we comprehensively analyzed the prognostic roles of lncRNAs, miRNAs, and mRNAs in colon cancer to develop a multi-RNA-based model that can be used to predict survivals.

Data Processing
The clinical information and RNA data of colon cancer patients were downloaded from The Cancer Genome Atlas (TCGA) dataset (https://cancergenome.nih.gov/). MRNAs and lncRNAs with an expression value >10 were included. This study retained miRNAs with log2(RPM +1) >1 in more than 75% of samples. Inclusion criteria: (1) colon cancer patients with complete information on age, gender, tumor status, tumor stage, T stage, N stage, M stage, and histological type; (2) patients with complete lncRNA-seq, mRNA-seq, and miRNA-seq data; and (3) patients with a follow-up time of ≥30 days. Finally, 473 colon cancer tissues and 41 adjacent non-tumor tissue were examined.

Differential Expressed RNAs Analysis
Using the edgeR package of R software to identify the differentially expressed lncRNAs, miRNAs, and mRNAs between colon cancer and adjacent normal tissues. Using fold change (FC) and associated P-values to assess expression differences. |FC| > 2 and P < 0.05 were considered statistically significant. The expression profiles of the lncRNAs, miRNAs, and mRNAs were converted to log2 (normalized value +1) values after normalization by edgeR.

Survival Analysis
Patients with an overall survival time of ≥90 days were included in the survival analysis. Using univariate Cox regression to assess the associations between overall survival and the lncRNAs, miRNAs, and mRNAs. Genes with a P < 0.001 in the univariate Cox regression analysis were included into the multivariate Cox regression analysis. To constructed LncRNA, miRNA, and mRNA signature scores by multiplying the expression levels of independent biomarkers (P < 0.01). Using a time-dependent receiver operating characteristic (ROC) curve analysis to assess the prognostic accuracy of each model. The area under the curve (AUC) was used to assess the prognostic accuracy. Using R software version 3.3.3 and the "survivalROC" package to perform the time-dependent ROC curve analysis. Finally, the associations between overall survival and the differentially expressed lncRNAs, miRNAs, and mRNAs were assessed in 347 colon cancer patients.

Functional Analysis
Using Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses and Gene Ontology (GO) to assess the functional role of the ceRNA network in the prognosis of colon cancer. Using the clusterProfiler package of R software to conduct these functional enrichment analyses. Using the GOplot package of R software to reveal the results of the KEGG and GO analyses. P < 0.05 was the cut-off value.

CeRNA Network Construction
We constructed the ceRNA network based on the identified prognostic lncRNAs, miRNAs, and mRNAs. Using the miRcode (http://www.mircode.org/) database to predict lncRNA-miRNA interactions. Using TargetScan (http://www.targetscan.org/), miRTarBase (http://mirtarbase.mbc.nctu.edu.tw/php/index. php), and miRDB (http://www.mirdb.org/) to identify miRNA-mRNA interactions. We established a ceRNA regulatory network based on lncRNA-miRNA-mRNA axes by combining lncRNA-miRNA interactions with miRNA-target gene interactions. The ceRNA network is a complex posttranscriptional regulatory network. In the ceRNA network, lncRNAs, mRNAs, and other RNAs act as natural miRNA sponges to suppress miRNA functions by sharing one or more miRNA response elements. A lncRNA that harbors a similar sequence to its targeted miRNA functions as a ceRNA, regulates the level of the encoded protein and participates in the regulation of cell biology by sponging miRNAs. Using Cytoscape v3.6.0 to construct and visualize the co-expression network (18).

Statistical Analysis
Using t-tests to assess the relationships between clinical characteristics and the gene expression profiles. Using the  Kaplan-Meier method to assess survival rates. Using the logrank test to assess differences between survival curves. R software (version 3.3.3) and GraphPad Prism 5 were used to plot the figures. Statistical analyses were conducted using SPSS 24.0 (SPSS, Inc., Chicago, IL, USA). A two-tailed P < 0.05 was considered statistically significant.

Survival-Associated RNAs
The associations between the differentially expressed lncRNAs, miRNAs, and mRNAs and overall survival were assessed in 347 colon cancer patients with at least 90 days of overall survival. This study performed a univariate survival analysis to identify overall survival related RNAs. The top 15 overall survival related lncRNAs, miRNAs, and mRNAs identified by the univariate analysis are presented in Figures 2A-C, respectively. Gene interaction networks were established with STRING using the overall survival-associated mRNAs (P < 0.001). The  Cytoscape analysis revealed important cancer pathways and hub genes (Figure 3).

Functional Analysis
A biological enrichment analysis (through KEGG pathways and GO analysis) was performed to further analyse the biological functions of the ceRNAs. The results of GO analysis showed that these genes are involved in some regulation of system processes ( Figure 4A).The cellular component process found that the target genes are mainly clustered into the proteinaceous extracellular matrix (Figure 4B). Regarding the molecular functions process, the target genes are significantly related to passive transmembrane transporter activity ( Figure 4C). The KEGG pathways were mainly enriched in the "neuroactive ligand-receptor interaction" (Figure 4D).
Predictive survival models were constructed based on the 5 lncRNAs, 8  The risk score for each patient was calculated. Then we ranked them by increasing scores. Based on the median risk score, patients were divided into high-risk and low-risk groups. Patients with high-risk scores had worse overall survival than those with low-risk scores in all three models (Figure 5). The AUC of the risk score revealed that all three models showed prognostic assessment ability: 0.850 for the lncRNA-based model, 0.811 for the miRNA-based model, and 0.770 for the mRNA-based model (Figure 6).     showed that the scores assigned to each patient have a good prognostic assessment. Figure 7 also showed the expression patterns of these RNAs in the low-risk and high-risk groups.

Survival-Related ceRNA Network
An integrated lncRNA-miRNA-mRNA network was conducted by combining the lncRNA-miRNA interactions with the miRNA-mRNA interactions (Figure 8). The ceRNA network provides new insights into the diagnosis and prognosis of colon cancer. However, some molecules in this ceRNA network are not well understood in colon cancer. A comprehensive integrative analysis was performed. Figure 9 shows the mRNAs correlated with the prognosis of colon cancer patients.

DISCUSSION
This study identified distinct lncRNAs, mRNAs, and miRNAs to gain insights into the molecular events associated with colon cancer prognosis. Moreover, a prognostic ceRNA network was conducted to provide new dimensions of colon cancer prognosis.
Colon cancer is a heterogeneous disease with multiple molecular mutation. It is rarely ascribed to a single or a few genomic mutations alone (19). Until now, no single genetic "driver" was reported to be superior in evaluating aggressive disease (20). Hence, identifying effective prognostic markers is crucial for tailored treatment. Moreover, exploring the underlying regulatory network of biomarkers is essential for developing effective treatments.
Several studies have focused on a single type of lncRNA involved in colon cancer. Yue et al. (21) reported that FER1L4 expression was correlated with T stage, N stage, and vascular invasion. Moreover, significant differences in diseasefree survival and overall survival were found in patients with both high miR-106a-5p expression and low FER1L4 expression compared with patients with low miR-106a-5p expression and high FER1L4 expression. Zhou et al. (22) reported that lincRNA-ROR expression correlated with vascular invasion and tumor stage. The knockdown of lincRNA-ROR restored the expression of miR-145. Moreover, knockdown of lincRNA-ROR had a significant impact on colon cancer cell invasion, proliferation, and migration. Patients with low miR-145 and high lincRNA-ROR had significantly poorer survivals than those with low high miR-145 and lincRNA-ROR. The depletion of miR-145 combined with the overexpression of lincRNA-ROR may play a crucial role on prognosis evaluation and treatment of colon cancer.
Other studies constructed a ceRNA network of colon cancer. Yan et al. (23) showed that lincRNA-ROR functions as a key ceRNA in colon cancer. Silencing lincRNA-ROR inhibited  significantly colon cancer stem cell proliferation and increased the sensitivity to chemotherapy. Li et al. (16) constructed a colon cancer-associated ceRNA network. However, the study assessed only the relationship between a single RNA and overall survival and did not construct a prognostic model. Moreover, the study did not calculate the AUC of different prognostic RNAs. However, prognostic models based on multiple genes could provide more accurate predictions than single genes. Our study constructed a ceRNA network and indicated several new potential prognostic markers for colon cancer. Moreover, our study constructed three prognostic models based on the prognostic RNAs. The assessment ability was 0.850 for the lncRNA-based model, 0.811 for the miRNA-based model, and 0.770 for the mRNA-based model.
Previous studies have mainly focused on ceRNAs in colorectal cancer (11-15, 24, 25). Fan et al. (12) assessed the association between overall survival and a single RNA with different expression. However, colorectal carcinogenesis is the accumulation of various genetic alternations. in our study, the prognostic power of lncRNA-based model was 0.850, which was higher than that of models described in previous studies. The higher prognostic power may be explained by two factors: (1) our study included patients only with colon cancer and not rectal cancer. (2) We only included patients with at least 90 days of overall survival in the survival analysis.
On the other hand, an increasing number of studies reported that colon cancer can be divided according to distal and proximal colon cancer (26)(27)(28). Our research did not detect a difference between colon cancer in the left and right colons. However, Qian et al. (17) found that 20 lncRNAs from the left colon and 25 lncRNAs from the right colon were associated with overall survival. In the ceRNA network, 18 lncRNAs, 22 miRNAs, and 57 mRNAs were included in left colon cancer, while 21 lncRNAs, 27 miRNAs, and 55 mRNAs were included in t right colon cancer. We further developed prognostic models to identify potential biomarkers for prognostic purposes.
In a ceRNA network, the implementers of molecular function are mRNAs. KEGG and GO analyses are used to identify pathways that are disturbed by the molecular cluster. The KEGG pathways in ceRNA network analysis revealed that the targeted genes are mainly enriched in the "neuroactive ligandreceptor interaction." An imbalance in the homeostasis of the neuroactive ligand-receptor interaction may be the main influence of the ceRNA network on the prognosis of colon cancer. However, Li et al. (16) reported that the top six KEGG pathways are enriched significantly in cancer associated signaling pathways, such as "the Wnt signaling pathway, proteoglycans in cancer, and transcriptional misregulation in cancer." The different pathways may be attributed to the genes involved in the different models and thus need to be further assessed.
This study had several limitations. First, several novel lncRNAs that were not reported previously need to be further explored to identify their potential molecular mechanisms. Second, the whole study was based on major online databases, and the prognostic models need to be verified in actual clinical operations to further confirm their effectiveness. Third, the information of colon cancer patients from the TCGA should be assessed with another experimental method. Fourth, it would be important to assess survival with therapy (chemotherapy or targeted therapy). However, the main purpose of this study was to develop a multi-RNA-based model to provide survival risk prediction for colon cancer by constructing a ceRNAs network. Then the model could be used to optimize treatments for patients with a high risk of poor survival. Moreover, the data of the associations between survival and therapy was unavailable in the database. Thus, we could not analysis survival with therapy. We are going to assess the RNAs of this study in clinic experiment. We hope the results of clinic experiment would provide a productive finding of the associations between survival and therapy.
In conclusion, this study constructed a ceRNA network that provides functional implications for the neuroactive ligandreceptor interaction of colon cancer. The results indicate several new potential prognostic markers for colon cancer.

DATA AVAILABILITY
The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.