Skip to main content

ORIGINAL RESEARCH article

Front. Genet., 18 March 2020
Sec. Computational Genomics

LAceModule: Identification of Competing Endogenous RNA Modules by Integrating Dynamic Correlation

  • School of Computer Science and Technology, Xidian University, Xi'an, China

Competing endogenous RNAs (ceRNAs) regulate each other by competitively binding microRNAs they share. This is a vital post-transcriptional regulation mechanism and plays critical roles in physiological and pathological processes. Current computational methods for the identification of ceRNA pairs are mainly based on the correlation of the expression of ceRNA candidates and the number of shared microRNAs, without considering the sensitivity of the correlation to the expression levels of the shared microRNAs. To overcome this limitation, we introduced liquid association (LA), a dynamic correlation measure, which can evaluate the sensitivity of the correlation of ceRNAs to microRNAs, as an additional factor for the detection of ceRNAs. To this end, we firstly analyzed the effect of LA on detecting ceRNA pairs. Subsequently, we proposed an LA-based framework, termed LAceModule, to identify ceRNA modules by integrating the conventional Pearson correlation coefficient and dynamic correlation LA with multi-view non-negative matrix factorization. Using breast and liver cancer datasets, the experimental results demonstrated that LA is a useful measure in the detection of ceRNA pairs and modules. We found that the identified ceRNA modules play roles in cell adhesion, cell migration, and cell-cell communication. Furthermore, our results show that ceRNAs may represent potential drug targets and markers for the treatment and prognosis of cancer.

Introduction

MicroRNAs (length: ~22 nt) are a kind of small non-coding RNAs (Yates et al., 2013). They can interact with Argonaute protein to form the RNA-induced silencing complex. This complex binds to target RNA sequences (termed microRNA response elements, MRE) with partial complementarity, influencing the stability of target RNAs (Bartel, 2009; Yates et al., 2013). Recent studies revealed that different RNAs with microRNA response elements that bind to the same microRNAs can regulate each other by competitively binding to the microRNAs they share. This is known as the competing endogenous RNA (ceRNA) model, and these RNAs are termed ceRNAs (Salmena et al., 2011). CeRNAs can be messenger RNAs (mRNAs), long non-coding RNAs (lncRNAs), pseudogene gene transcripts, and circular RNAs (circRNAs). The detection of ceRNA can be used to explain the function of thousands of uncharacterized non-coding RNAs, and is also considered the “Rosetta stone of a hidden RNA language” (Salmena et al., 2011; Thomson and Dinger, 2016). In addition, ceRNAs play critical roles in post-transcriptional regulation. Thus, they are involved in numerous physiological and pathological progresses, such as cancer (Salmena et al., 2011; Karreth and Pandolfi, 2013; Tay et al., 2014; Qi et al., 2015). For example, the tumor suppressor gene PTEN has been demonstrated to compete for microRNAs shared with other transcripts, such as ZEB2, CNOT6L, VAPA, VCAN, in many types of cancer (e.g., glioblastoma, melanoma, prostate cancer, and breast cancer) (Tay et al., 2014; Poliseno and Pandolfi, 2015). Of note, PTENP1 (the pseudogene of PTEN), regulates the RNA levels of its cognate gene in prostate, melanoma, and clear-cell renal cell carcinoma (Poliseno et al., 2010, 2011; Johnsson et al., 2013; Yu et al., 2014). Furthermore, lncRNAs and circRNAs are also important regulators in the ceRNA model (Schmitt and Chang, 2016; Zhong et al., 2018). HULC, which is significantly upregulated in hepatocellular carcinoma, can reduce the expression and activity of miR-372 in liver cancer, which derepresses the translation of its target gene PRKACB and induces phosphorylation of the cAMP response element binding protein in liver cancer (Wang et al., 2010). CircHIPK3 inhibits the activity of mir-124 and promotes the expression of IL-6R by competitively binding to miR-124 (Zheng et al., 2016).

Owing to the large number of candidate ceRNA pairs and high cost of biological experiments, computational methods have become an efficient approach for the study of the ceRNA model (Le et al., 2017). For example, Zhou et al. (2014) constructed and investigated a ceRNA network in breast cancer; Sumazin et al. (2011) proposed a method based on conditional mutual information to infer candidate ceRNAs and analyze the ceRNA network in glioblastoma. Paci et al. (2014) used sensitivity correlation (SI) to infer the ceRNA network between lncRNAs and mRNAs in breast cancer. In that model, SI equals the difference between the Pearson correlation coefficient (PCC) and the partial correlation coefficient of ceRNAs with respect to their shared microRNAs. And List et al. (2019) further improved this method. All these methods predict ceRNA pairs based on two aspects: (1) ceRNAs should share a sufficient number of microRNAs, which can be evaluated through statistical tests, such as the hyper-geometric test (Salmena et al., 2011; Le et al., 2017); and (2) the expression of ceRNAs should be positively correlated, which can be estimated using the PCC (Chiu et al., 2015), SI (Paci et al., 2014; Do and Bozdag, 2018), or conditional mutual information (Sumazin et al., 2011). In addition, Zhang et al. (2018) proposed LncmiRSRN to construct lncRNA-mRNA ceRNA network via estimating the causal effects of lncRNAs on mRNAs with the IDA method. Besides studying ceRNAs using sequencing data, researchers also proposed mathematical models to simulate the ceRNA process, such as the minimal model (Figliuzzi et al., 2013), mass-action model (Ala et al., 2013), and coarse-grained model (Yuan et al., 2015). Recently, Wei et al. constructed a unified coarse-grained competition motif model and uncovered the complexity and generality of the molecular competition effect, including the ceRNA model (Wei et al., 2019). In that study, the investigators proposed that the strength of competition between ceRNAs is influenced by the abundance of ceRNAs and their target microRNAs (Wei et al., 2019). The strength of the correlation is maximized in the “R near-equimolar” regime, and gradually decreases with concentration of microRNAs away from the regime (Martirosyan et al., 2019; Wei et al., 2019). This means that the strength of ceRNA regulation varies based on the concentration of microRNAs. Those results also indicated that the strength of ceRNA regulation is sensitive to the expression levels of microRNAs around the proper concentration. Intuitively, two RNAs may be co-expressed due to some biological mechanisms. If they have an additional ceRNA relationship, their co-expression should be improved. When the levels of microRNA expression are low, the regulation between ceRNAs is not obvious. Furthermore, the ceRNA strength is sensitive to the expression levels of microRNAs. Hence, the strength of the co-expression should also be sensitive to these levels. We found the current studies using RNA sequencing data did not consider this characteristic of the ceRNA model. Hence, we propose that the sensitivity of co-expression to the expression levels of microRNAs they share may be a factor for predicting ceRNA relationships. We used dynamic correlation measure, termed liquid association (LA), to assess this sensitivity.

Unlike conventional correlations (e.g., PCC), dynamic correlation focuses on the change in the correlation of two variables following alterations in a third variable (Gunderson and Ho, 2014; Yu, 2018). For example, LA is defined as the mean of the derivative of the correlation between two objects with respect to a third condition (Li, 2002). LA has been used to identify disease candidate genes (Li et al., 2007) and human age-associated genes (Yang et al., 2018), as well as discover key microbial species and environmental factors of the microbial community (Ai et al., 2019).

LA is an appropriate measure for the evaluation of the correlation sensitivity of ceRNAs to microRNAs. In this study, we firstly analyzed the effectiveness of LA in detecting ceRNA pairs. Subsequently, we proposed a framework to investigate LA-based ceRNA modules (LAceModule) by integrating the conventional PCC and dynamic correlation LA with multi-view non-negative matrix factorization (NMF). By performing further analysis in breast cancer, we revealed that ceRNAs play roles in cell adhesion, cell migration, and cell-cell communication. Our results also showed that ceRNAs may represent promising drug targets and markers for the treatment and prognosis of cancer.

Results

LA for the Prediction of ceRNA Pairs

Current studies often use the PCC or SI to detect ceRNA pairs. This approach ignores the sensitivity of the correlation between RNAs to the expression levels of their shared microRNAs. To overcome this limitation, we used LA (Li, 2002) to measure the dynamic change of the correlation for a ceRNA pair depending on the expression levels of their shared microRNAs. Suppose that EXPR1 and EXPR2 represent the expression levels of two ceRNA candidates R1 and R2, respectively, while EXPMIC denotes the sum of the expression levels of all their shared microRNAs, MIC. We normalized EXPR1 and EXPR2 using the z-scoring method such that E(EXPR1) = E(EXPR2) = 0, Var(EXPR1) = Var(EXPR2) = 1, where E(·)and Var(·) represent the expectation and variance of a random variable, respectively.

Supposing the above, the PCC between R1 and R2 is:

ρ(R1,R2)=E[(EXPR1-E(EXPR1))×(EXPR2-E(EXPR2))]Var(EXPR1)×Var(EXPR2)                =E(EXPR1×EXPR2).

By conditioning, E(EXPR1 × EXPR2) = E(E(EXPR1 × EXPR2|EXPMIC)) = E(g(EXPMIC)), where g(EXPMIC) = E(EXPR1 × EXPR2|EXPMIC = expMIC) is the correlation between R1 and R2 when the expression level of the shared microRNA is expMIC.

The LA of R1 and R2 with respect to their shared microRNAs is defined as LA(R1,R2|MIC)=E(g(EXPMIC)), where g(EXPMIC) = E(EXPR1 × EXPR2|EXPMIC = expMIC). According to the Stein Lemma (Stein, 1981), if the sum of the expression levels of all the shared microRNAs MIC follows the standard normal distribution, LA(R1, R2|MIC) = E(EXPR1 × EXPR2 × EXPMIC), the calculation of LA can be simplified as shown below:

LA(R1,R2|MIC)=i=1NEXPR1i×EXPR2i×EXPMICiN

where N is the number of sample. We performed data transformation on EXPMIC using the Van der Waerden's method to ensure that EXPMIC follows the standard normal distribution. For EXPMIC1, EXPMIC2, ⋯, EXPMICN, we initially obtained their ranks r1, r2, ⋯, rN, and subsequently computed the transformed value as follows:

EXPMIC1=Φ-1(r1N+1),EXPMIC2=Φ-1(r2N+1),,EXPMICN=Φ-1(rNN+1),

where Φ(·) is the cumulative distribution function of the standard normal distribution.

We downloaded the expression profiles of mRNAs, lncRNAs, and microRNAs of 1,072 and 365 patients with BRCA and LIHC, respectively, from The Cancer Genome Atlas (TCGA) to investigate the effect of LA on the prediction of ceRNA pairs. Non-expressed genes across all samples in a given type of cancer were removed (Figure 1A). Subsequently, we downloaded 2,667 experimentally validated ceRNA pairs from the LncCeRBase (Pian et al., 2018), miRSponge (Wang et al., 2015b), and lncACTdb (Wang et al., 2015a) databases. Considering the tissue-specific characteristics of ceRNAs and gene symbol mapping, seven validated ceRNA pairs in breast cancer and six validated ceRNA pairs in liver cancer were obtained (Table 1). All pairs in our candidate ceRNAs set significantly shared microRNAs (hypergeometric test, p < 0.05). For each disease, we considered these ceRNA pairs as benchmarks and all pairs, which were consisted of two genes in benchmarks and had sufficient number of shared microRNAs, as candidate pairs. We use these two datasets to evaluate the performance of LA-, PCC-, and SI-based methods, respectively. In breast cancer, the area under curve (AUC) values of the LA, PCC, and SI approaches were 0.58, 0.45, and 0.24, respectively. In liver cancer, these AUC values were 0.46, 0.27, and 0.35, respectively (Figure 2A). PCC and SI are usually used to identify ceRNA pairs, and our results indicate that LA also has the ability to predict ceRNA pairs.

FIGURE 1
www.frontiersin.org

Figure 1. (A) Data preparation. We obtained the RNA-seq data of mRNAs and lncRNAs, as well as the microRNA-seq data of microRNAs. Subsequently, we removed non-expressed and lowly expressed RNAs. Finally, we retained RNAs that were presented in the RNA-microRNA interaction datasets (here is Mirwalk2.0) as candidate ceRNAs. (B) Overview of LAceModule. The inputs of LAceModule are candidate ceRNA expression profiles, microRNA expression profiles, and RNA-microRNA interactions. For each candidate ceRNA pair, the PCC value, LA value, and significance degree of shared microRNAs (MS-P) value can be obtained. For pairs with higher MS-P values (threshold is 0.05), negative PCC values or LA values should be removed (i.e., the PCC values and LA values of these pairs are set to zero). Multi-view NMF is executed using the PCC matrix, LA matrix, and different K as inputs. The best K is selected by comparing four clustering evaluation metrics. Subsequently, multi-view NMF procedures are repeated 10 times with the best K and different initial values. The final modules are obtained through consensus clustering of the repeat results.

TABLE 1
www.frontiersin.org

Table 1. LA, PCC, and SI values of validated ceRNA pairs.

FIGURE 2
www.frontiersin.org

Figure 2. (A) The AUC value for predicting ceRNA pairs with LA, PCC, and SI in BRCA and LIHC. (B) Cluster evaluation of three methods on different matrices in BRCA. (C) Cluster evaluation of three methods on different matrices in LIHC. (D) Comparison of the gene fold-change entropy in modules between different clustering methods. (E) Comparison of the average validated microRNA of each pair in modules between different methods. (F) Comparison of the dispersion of dysregulated genes in modules between different methods. Top row: ignoring the direction of dysregulation, bottom row: considering the direction of dysregulation. (*p < 0.05; **p < 0.01; ***p < 0.001).

Identification of ceRNA Modules Using the LAceModule

We proposed the LAceModule (Figure 1B), a framework based on multi-view NMF (Liu et al., 2013) to systematically identify ceRNA modules using LA. For each candidate ceRNA pair, we computed the PCC value, LA value, and the degree of significance of shared microRNAs (MS-P) (see section Materials and Methods) to construct three matrices MPCC, MLA, and MMSP, respectively. Subsequently, when the MS-P values of candidate ceRNA pairs were ≥0.05, we set their corresponding PCC values and LA values to zero. Owing to the non-negativity requirement in the multi-view NMF framework, we set negative values in MPCC and MLA to zero. Considering that a ceRNA pair should be co-expressed and sensitive to change in the expression of their shared microRNAs, we set the values in the same entry of MPCC and MLA of candidate ceRNA pairs to zero if either of these values was zero. Finally, we integrated MPCC and MLA using multi-view NMF to identify ceRNA modules.

For multi-view NMF, there are two observation views M = {MPCC, MLA}, each of which is a G × G non-negative matrix, where Gis the number of candidate ceRNAs. Each matrix in M, Mv ∈ {MPCC, MLA}, can be factorized to UG×Kv0 and (VG×Kv)T0 that MvUv(Vv)T and each row of (Vv)T can be considered as the K-rank representation of the corresponding candidate ceRNA point. Here, we attempted to identify a low-rank representation that is suitable for both views, which is defined as (V*)T. We factorized each matrix in M and made each (Vv)T as close as possible to (V*)T. Therefore, we determined the objective function as follows:

min(MPCCUPCC(VPCC)TF2+MLAULA(VLA)TF2+λPCCVPCCV*F2+λLAVLAV*F2)s.t. 1kK,U,kPCC1=1,U,kLA1=1 and UPCC,ULA,VPCC,VLA,V*0

where λPCC and λLA tunes the relative weight among different views and between standard NMF error and disagreement among (V*)T, (VPCC)T, and (VLA)T. We used an iterative procedure by updating one variable, while maintaining the remaining variables fixed to solve this optimization problem (see details in Materials and Methods section). After computing the (V*)T, we obtained the module label of RNA i using argmaxj=1,2,,KVij*.

Of note, the LAceModule requires pre-determination of the number of modules, K. We evaluated the clustering performance to select an optimal K ranging from 10 to 400 with an increment of 10 by considering four metrics (Figures 2B,C), namely the C-index (Hubert and Schultz, 1976), McClain-Rao (McClain and Rao, 1975), point biserial correlation coefficient (Milligan, 1981), and silhouette coefficient (Rousseeuw, 1987). By simultaneously considering four metrics on two matrices, we selected K = 360 in BRCA and K = 370 in LIHC. To obtain robust ceRNA modules, the LAceModule repeated the multi-view NMF procedures 10 times, and computed a consensus matrix to identify ceRNA modules using the cluster-based similarity partitioning algorithm (CSPA) (Strehl and Ghosh, 2003). Specifically, CSPA generates a binary matrix for each result of the multi-view NMF clustering, with “1” representing two associated genes in the same cluster and “0” for not. The consensus matrix is the sum of these binary matrices. ceRNA modules can be identified through spectral clustering on this consensus matrix using the optimal K selected above.

Comparison Between the LAceModule and PCC/SI-Based Methods

We used NMF to replace the multi-view NMF and the PCC matrix or SI matrix as input to compare the performance of conventional and dynamic correlations in the detection of ceRNA modules. In the PCC matrix and SI matrix, negative values or the corresponding MS-P values ≥0.05 were set to zero. We also tested K ranging 10–400, with an increment of 10, and evaluated the clustering performance with the same indicators mentioned in Section Identification of ceRNA modules using the LAceModule. We selected Ks equal to 350 and 360 for PCC-based and SI-based results in BRCA, respectively, while Ks equal to 360 and 340, respectively, were selected for LIHC (Figures 2B,C). In the following sections, we used “PCC+LA” to represent the modules detected by the LAceModule, as well as “PCC” and “SI” to represent the modules based on PCC or SI, respectively.

CeRNA pairs are highly co-expressed; hence, the fold change of gene expression in a ceRNA module tends to be similar in the disease and normal states. We analyzed the differential expression of genes between normal tissues and tumor tissues using the R packages edgeR (Robinson et al., 2010) (Materials and Methods section) to obtain the fold change. Subsequently, we calculated the entropy of fold change for each module. As shown in Figure 2D, the entropy distribution of “PCC+LA” was significantly lower than those of “PCC” (FDR = 0.0476 in BRCA, FDR = 5.12E-06 in LIHC; Kolmogorov–Smirnov one-tailed test) and “SI” (FDR = 8.16E-09 in BRCA, FDR = 5.12E-06 in LIHC; Kolmogorov–Smirnov one-tailed test). By comparing “PCC” and “SI,” we also found that the former was significantly lower than the latter (FDR = 7.20E-05 in BRCA, FDR = 4.76E-02 in LIHC; Kolmogorov–Smirnov one-tailed test). Similarly, if a gene in a ceRNA pair is dysregulated, the other gene also tends to be dysregulated. In addition, the direction of the dysregulation also tends to be the same. This indicates that, when a gene in a ceRNA pair is upregulated, the other gene also tends to be upregulated. Therefore, we calculated the entropy of the dysregulated gene ratio and the entropy of the different dysregulation direction ratio of each module. The results are shown in Figure 2F; the top row illustrates the dysregulated gene ratio without direction, while the bottom row shows the dysregulated gene ratio with direction. The results showed that the modules of “PCC+LA” were significantly lower than those of “PCC” (top: FDR = 0.0246 in BRCA, FDR = 6.50E-03 in LIHC; bottom: FDR = 2.62E-02 in BRCA, FDR = 7.82E-03 in LIHC; Wilcoxon one-tailed test) and “SI” (top: FDR = 5.82E-12 in BRCA, FDR = 3.47E-24 in LIHC; bottom: FDR = 6.88E-10 in BRCA, FDR = 2.59E-23 in LIHC; Wilcoxon one-tailed test) in both situations. For “PCC” and “SI”, the former performed better than the latter in two situations (top: FDR = 5.39E-06 in BRCA, FDR = 9.15E-16 in LIHC; bottom: FDR = 2.39E-04 in BRCA, FDR = 2.09E-15 in LIHC; Wilcoxon one-tailed test).

CeRNAs are regulated through shared microRNAs. Therefore, ceRNA modules may tend to share more microRNAs in each pair. We used experimentally validated mRNA-microRNA interaction in miRTarBase (Chou et al., 2016) to evaluate the average number of shared microRNAs in a pair. The results are shown in Figure 2E. The modules of “PCC+LA” shared more microRNAs on average than those of “PCC” (FDR = 1.84E-02 in BRCA, FDR = 1.84E-02 in LIHC; Wilcoxon one-tailed test) and “SI” (FDR = 1.05E-06 in BRCA, FDR = 2.62E-09 in LIHC; Wilcoxon one-tailed test). Moreover, the modules of “PCC” shared more microRNAs on average than those of “SI” (FDR = 8.46E-03 in BRCA, FDR = 3.82E-05 in LIHC; Wilcoxon one-tailed test).

Collectively, the comparisons of gene fold change, gene dysregulation ratio, and number of shared microRNAs suggest that the integration of conventional and dynamic correlations offers better detection of ceRNA modules than conventional correlation alone.

Functional Analysis of ceRNA Modules in Breast Cancer

We used the results of the BRCA dataset for further analysis. For investigating the difference in ceRNA relationship between tumor and normal states, we identified 348 ceRNA modules in breast tumor tissues and 314 modules in normal breast tissues using LAceModule. We used the g:Profiler (Reimand et al., 2007) for function enrichment of ceRNA modules in the disease and normal states. In Table 2, we listed the top five most frequently enriched Gene Ontology (GO) terms in both states. We analyzed the relationship of these GO terms (Figure 3A) and found that the ceRNA modules are associated with cell adhesion (GO:0007155) specifically in disease tissues, compared with the modules of the normal tissues. It is established that activation of invasion and metastasis is an important hallmark of cancer (Hanahan and Weinberg, 2011). Current research studies suggest that the loss of cell adhesion is strongly associated with tumor invasion and metastasis (Cavallaro and Christofori, 2001; Okegawa et al., 2004). This term and its parent term biological adhesion (GO:0022610) are enriched in 15 modules. In most of these modules, gene pairs have larger PCC and LA values in diseased cells than in normal cells (Figure 3B, Supplementary File 1). Similarly, the most frequently and specifically enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (Supplementary Table 1) in the disease state were cell adhesion molecules (CAMs) (KEGG:004514) and focal adhesion (KEGG:04510), which play a pivotal role in tumor metastasis and cell migration (Ridley et al., 2003; Okegawa et al., 2004; Kim and Wirtz, 2013). In most of these modules, gene pairs also have larger PCC and LA values in diseased cells than in normal cells (Figures 3C,D, Supplementary Files 2, 3). These results suggest that ceRNAs play important roles in the invasion and metastasis of breast tumor cells, as previously indicated (Yang et al., 2014; Hu et al., 2017; Zheng et al., 2018).

TABLE 2
www.frontiersin.org

Table 2. Most frequently enriched GO terms.

FIGURE 3
www.frontiersin.org

Figure 3. (A) Top five most frequently enriched terms in disease and normal states. Red areas are considered common functions in disease and normal states. Green areas are considered disease-specific functions. (B–D) Heatmap of PCC and LA values in disease and normal states (top right: disease state, bottom left: normal state). (B) Module 13 enriched for GO:0007155 (C) Module 151 enriched for KEGG:045142. (D) Module 132 enriched for KEGG:04510. (PCG: protein-coding gene).

As shown in Table 3, we also investigated the most significant terms. The most significant GO terms were obtained from Module 199, and were associated with defense against other organisms, such as viruses. Numerous studies have indicated that the mouse mammary tumor virus, bovine leukemia virus, human papillomavirus, and Epstein–Barr virus are associated with breast cancer (Amarante and Watanabe, 2009; Alibek et al., 2013; Lawson et al., 2018). As shown in Figure 4A, genes in this module exhibit larger PCC and LA values. For the KEGG dataset, the most significant pathways were obtained from Module 103 (Figure 4B); these pathways were significantly enriched for oxidative phosphorylation, thermogenesis, Huntington's disease, and Parkinson's disease.

TABLE 3
www.frontiersin.org

Table 3. Most significant GO terms and KEGG pathways.

FIGURE 4
www.frontiersin.org

Figure 4. Heatmap of PCC and LA values with most significant GO (A) and KEGG (B) terms in disease and normal states (top right: disease state, bottom left: normal state, PCG: protein-coding gene, lncRNA: long-non-coding RNA).

Collectively, comparison of the functions of ceRNA modules in breast tumor and normal tissues suggested that the ceRNA relationship may rewire in different cell states and ceRNA may exert an effect on the development or progression of breast cancer.

ceRNA Modules Are Associated With Aberrant Genetic and Epigenetic Regions in Breast Cancer

According to the methods described in Materials and Methods section, we obtained 1,829 dysregulated mRNAs, 264 dysregulated lncRNAs (Supplementary File 4, Supplementary Table 2), 1,074 CNV genes, and 51 differentially methylated genes. We considered these genes as BRCA-associated genes. We identified BRCA-associated modules that are enriched for both expression-dysregulated genes (p < 0.01); and genes associated with aberrant CNV or DNA methylation (p < 0.05). Totally, we obtained eight BRCA associated modules (Table 4, Figure 5A, Supplementary File 5). The top five most significant enriched functions and pathways of these modules are shown in Figures 5B,C. We found that six of eight modules are significantly enriched for breast cancer-associated functions and pathways, such as immune response (Module 90 and Module 309) (Bates et al., 2018), cell communication (Module 110) (Brooks and Wicha, 2015), cell cycle (Module 73) (Otto and Sicinski, 2017), blood vessel morphogenesis (Module 45) (Kakolyris et al., 2000), and sucrose process (Module 37) (Jiang et al., 2016).

TABLE 4
www.frontiersin.org

Table 4. BRCA-associated ceRNA modules.

FIGURE 5
www.frontiersin.org

Figure 5. (A) The modules that are enriched for breast invasive carcinoma (BRCA)-associated genes. (B) GO enriched terms of BRCA-associated modules. (C) KEGG enriched terms of BRCA-associated modules.

We also investigated the relationship between lncRNAs and oncogenes in these modules. We downloaded the oncogene dataset from ONGENE (Liu et al., 2017). We found that in Module 45, LINC01485 was correlated with AQP1, with PCC and LA values of 0.397 and 0.087, respectively. AQP1 is related to tumor cell migration, invasion, and angiogenesis (Tomita et al., 2017). In Module 309, lncRNA HSPC324 (ENSG00000228401) was highly co-expressed with oncogene KLF2, with PCC and LA values of 0.642 and 0.012, respectively. KLF2 is a tumor suppressor gene, which inhibits cell growth, migration, and invasion in numerous types of cancer, such as colorectal cancer (Wang et al., 2017), breast cancer (Zhang et al., 2015), and prostate cancer (Wang et al., 2019). Our results indicate that these lncRNAs may be new drug targets for cancer therapy.

ceRNA Modules Predict Survival in Patients With Breast Cancer

MicroRNAs are clinically important in cancer. Therefore, we used the top 15 most commonly shared microRNAs in BRCA-associated modules to analyze their relationship with patient outcome. We used a k-means algorithm to classify patients into two groups based on the expression of the top 15 microRNAs in each module, and performed a Kaplan–Meier analysis. We found that half of the top 15 microRNAs sets in Module 45, Module 110, Module 90, and Module 309, could significantly distinguish patients (Figure 6A). By analyzing the microRNAs in these modules, we found that 10 of 30 microRNAs were shared by more than three modules (Figure 6B, Supplementary Table 3). In addition, half of those 10 microRNAs were derived from the let-7 microRNA family. Many studies have suggested that the let-7 family participates in the process of metastasis and resistance to therapy in breast cancer (Cunningham et al., 2010; Chiu et al., 2014). Our results also demonstrate that the let-7 microRNA family can be a prognostic marker of breast cancer (Figure 6B, Supplementary File 9).

FIGURE 6
www.frontiersin.org

Figure 6. (A) The log-rank test p-values of the top15 most frequently shared microRNAs in modules enriched for BRCA-associated genes; the red dashed line indicates the level of –log(0.05). (B) Upset plot of microRNAs in Module 45, 90, 110, and 309. Subplot: the Kaplan-Meier curves of five microRNAs from the let-7 family. (C) The ceRNA pair markers for BRCA therapy and prognosis and their target microRNAs in miRTarBase. (D) Kaplan–Meier curves of ceRNA pair markers.

In addition, we identified five ceRNA pairs (i.e., CDH5 with FAM212B, CDH5 with GYG2 in Module 45, TRIB1 with IL33, TRIB1 with INMT, and TRIB1 with MMRN1 in Module 110) that can be used as prognostic markers of breast cancer in BRCA-associated modules (Figures 6C,D, Supplementary File 9). In our dataset, CDH5 is a hyper-methylated gene (six of eight sites are hypermethylated in the promoter region), while TRIB1 is duplicated or amplified in >14% of patients with BRCA. Recent studies showed that CDH5 levels and CDH5 glycosylation are biomarkers for metastatic breast cancer (Fry et al., 2016) and TRIB1 plays a critical role in cell cycle and survival via NF-κB signaling (Gendelman et al., 2017). Interestingly, we found that these genes cannot individually act as prognostic markers (Supplementary File 6). However, the gene sets of the ceRNA pairs and their experimental validated shared microRNAs in miRTarBase are effective markers for therapy and prognosis, indicating that these ceRNAs may collaborate in breast cancer.

Furthermore, we also identified ceRNA modules that can be considered prognostic markers for breast cancer. Similarly, we used the expression of ceRNAs in each module to classify patients into two groups via a k-means algorithm, and performed a Kaplan–Meier analysis. In total, we found six modules that can distinguish patients into two subgroups with significantly different survival times (log-rank test, p < 0.01). As shown in Figure 7 and Supplementary File 7, the patients with lower expressions in Module 63 and Module 270, as well as those with higher expression in Module 11, Module 25, Module 56, and Module 204 had longer survival time. Notably, Module 11 consists of lncRNAs. Collectively, these results confirm that ceRNAs play important roles in the treatment and prognosis of BRCA.

FIGURE 7
www.frontiersin.org

Figure 7. Kaplan–Meier curves of prognosis marker modules (Left) and the corresponding expression profiles (Right).

Discussion

CeRNAs play critical roles in post-transcriptional regulation and are thus involved in many physiological and pathological progresses. An increasing number of non-coding RNAs that are able to influence their ceRNA partners via the ceRNA mechanism have been detected. Computational methods are an efficient approach for the detection and analysis of ceRNA relationships. By integrating the basic hypothesis and latest studies of ceRNA, we introduced dynamic correlation LA as one of the factors for detecting ceRNA pairs. Moreover, we integrated the multi-view NMF method with conventional PCC to detect ceRNA modules. Our results indicated that LA is effective in detecting ceRNA pairs and modules. The results of subsequent analysis showed that ceRNAs play important roles in breast cancer, especially in cell adhesion, cell migration, and cell-cell communication. Our results also demonstrated that ceRNAs may represent promising drug targets and markers for the treatment and prognosis of cancers.

In multi-view NMF, the parameters λ s balances the weight of different views. We set λCOR = 1, and test λLA ∈ {1, 2, 3, 4, 5, 10, 15, 20, 25} in the BRCA dataset. The results showed that there is no significant difference among these parameters (Supplementary File 8), indicating that our method is not sensitive to these parameters.

The simplified calculation of LA demands that the distribution of the third variable follows standard normal distribution. We used the Van der Waerden's method to transform the microRNA expression profile to ensure it follows the standard normal distribution. We tested the performance of this method. We got a number of random data, which followed the negative binomial distribution. The sizes of the data set ranged from 10 to 2,000, with an increment of 20. We transform them by using Van der Waerden's method, respectively. Then we got another random data set following the standard normal distribution. whose size ranged from 1,000 to 100,000 with an increment of 1,000. We tested if the distribution of each pair between these two sets was similar by using Kolmogorov–Smirnov test. The result was shown in Supplementary File 10. The result suggested that the distribution of each pair has no significant difference.

LA is very convenient for the measurement of dynamic correlation. However, according to the results reported by Wei (Wei et al., 2019), the ceRNA strength forms a sharp peak around the most optimal microRNA concentration, indicating very large deviation of correlation. In this study, we used LA as a characteristic of ceRNA regulation. When the data contain the microRNA concentration ranging from the lower situation to higher situation, or symmetrically distributed on both sides of the most optimal microRNA concentration, the value of LA may be close to zero (according to the definition of LA). A solution to this problem is to measure the mean of the absolute value of the deviation, LA(R1,R2|MIC)=E(|g(EXPMIC)|), instead of LA(R1,R2|MIC)=E(g(EXPMIC)), where g(EXPMIC) is the deviation of correlation corresponding to EXPMIC. Moreover, LA is developed based on linear correlation PCC. However, real systems are more likely non-linear. Additional studies are warranted to determine the method for the calculation of the dynamic correlation of non-linear correlation.

Furthermore, besides the positive correlation and dynamic correlation, ceRNA pairs may have additional unknown features, which are also worthy of further investigation. The multi-view NMF framework has potential to integrate all these features for more accurate identification of ceRNA modules (Wang et al., 2018).

Materials and Methods

Datasets

We downloaded the RNA-seq data of mRNAs and lncRNAs, as well as the microRNA-seq data of patients with breast invasive carcinoma (BRCA) and liver hepatocellular carcinoma (LIHC) from TCGA (Weinstein et al., 2013). For patients with BRCA, we obtained the fragments per kilobase of exon model per million reads mapped (FPKM) values and read counts of mRNAs and lncRNAs from 1,090 tumor samples and 113 normal samples; we also obtained the read counts of mature microRNAs from 1,077 tumor samples and 104 normal samples. For patients with LIHC, we obtained the FPKM values and read counts of mRNAs and lncRNAs from 370 tumor samples and 50 normal samples; we also obtained the read counts of mature microRNAs from 371 tumor samples and 50 normal samples. Both microRNA-mRNA and microRNA-lncRNA interactions were downloaded from the mirwalk2.0 database (Dweep and Gretz, 2015), which incorporates predicted interactions curated in at least two of 13 different RNA-microRNA interaction databases, such as DIANA (Maragkakis et al., 2011) and miRDB (Wang and El Naqa, 2008). For further analysis, we downloaded the copy number variation data and DNA methylation data of patients with BRCA from TCGA.

Data Preparation

Figure 1A shows the process of data preparation. Firstly, we only included samples that have complete mRNA, lncRNA, and microRNA expression data. We obtained 1,072 disease samples and 113 normal samples of BRCA, as well as 365 disease samples and 50 normal samples of LIHC. Subsequently, we excluded lowly expressed mRNAs, lncRNAs, and microRNAs. We only retained mRNAs with FPKM values >1 in >80% of disease or normal samples, lncRNAs with FPKM values >0.8 in >50% of disease or normal samples, and microRNAs with counts per million mapped reads values >100 in >50% of disease or normal samples (Mullokandov et al., 2012). Notably, we also excluded RNAs without interactions curated in the mirwalk2.0 database. Finally, we obtained 1,128 lncRNAs, 1,1172 mRNAs, and 137 mature microRNAs in BRCA, as well as 621 lncRNAs, 8,783 mRNAs, and 138 mature microRNAs in LIHC. All retained mRNAs and lncRNAs were considered candidate ceRNAs. We transformed the expression profiles of the candidate ceRNAs through log(FPKM+1) and those of mature microRNAs through log(CPM+1).

Conventional Correlation Between RNAs

Current methods use conventional correlations as factors to predict ceRNA pairs. PCC and SI are most commonly used. We calculated the PCC value of a candidate ceRNA pair (e.g., R1, R2) as follows:

ρ(R1,R2)=E[(EXPR1-E(EXPR1))×(EXPR2-E(EXPR2))]Var(EXPR1)×Var(EXPR2)

where EXPR1 and EXPR2 represent the paired expression of R1 and R2, respectively. For the SI value of R1 and R2, we initially added the expression of shared microRNAs between R1 and R2. Subsequently, we calculated the partial correlation with respect to their shared microRNAs as follows:

ρ(R1,R2|MIC)=ρ(R1,R2)-ρ(R1,MIC)×ρ(R2,MIC)1-ρ2(R1,MIC)×1-ρ2(R2,MIC),

where MIC indicates the shared microRNAs between R1 and R2. Finally, we computed the SI value as follows:

SI(R1,R2)=ρ(R1,R2)-ρ(R1,R2|MIC)

Statistical Significance of Shared MicroRNAs Between RNAs

CeRNAs are thought to usually share numerous microRNA targets. We used the hypergeometric test to calculate the significance of shared microRNAs for a given ceRNA pair (e.g., R1 and R2). The significance p-value can be obtained as follows:

p(R1,R2)=1-i=0q-1(Ti)×(Q-Tb-i)(Qb)

where Q is the total number of considered microRNAs, T is the number of microRNAs targeting candidate R1, b is the number of microRNAs targeting candidate R2, and q is the number of microRNAs targeting both candidates R1 and R2.

Solution of the Objective Function

According to the multi-view NMF framework, we obtained the objective function as follows:

min(MPCCUPCC(VPCC)TF2+MLAULA(VLA)TF2+λPCCVPCCV*F2+λLAVLAV*F2)s.t. 1kK,U,kPCC1=1,U,kLA1=1 and UPCC,ULA,VPCC,VLA,V*0

where λPCC, λLA tunes the relative weight among different views and between the standard NMF error and disagreement among (V*)T, (VPCC)T and (VLA)T. We used an iterative update procedure by updating one variable and maintaining the remaining variables fixed to solve this optimization problem. The specific iteration rules are listed as follows:

1. fixing V*, VPCC, and VLA, updating UPCC and ULA, respectively:

Ui,kPCCUi,kPCC(MPCCVPCC)i,k+λPCCj=1NVj,kPCCVj,k*(UPCC(VPCC)TVPCC)i,k+λPCCl=1MUl,kPCCj=1N(Vj,kPCC)2Ui,kLAUi,kLA(MLAVLA)i,k+λLAj=1NVj,kLAVj,k*(ULA(VLA)TVLA)i,k+λLAl=1MUl,kLAj=1N(Vj,kLA)2

2. fixing V*, UPCC, and ULA, updating VPCC and VLA, respectively:

Vj,kPCCVj,kPCC((MPCC)TUPCC)j,k+λPCCVj,k*(VPCC(UPCC)TUPCC)j,k+λPCCVj,kPCCVj,kLAVj,kLA((MLA)TULA)j,k+λLAVj,k*(VLA(ULA)TULA)j,k+λLAVj,kv

3. fixing VPCC, VLA, UPCC, and ULA, updating V*:

V*=λPCCVPCCQPCC+λLAVLAQLAλPCC+λLA,

where

QPCC=Diag(i=1GUi,1PCC,i=1GUi,2PCC,,i=1GUi,KPCC)QLA=Diag(i=1GUi,1LA,i=1GUi,2LA,,i=1GUi,KLA)

More details can be found in Liu et al. (2013).

Disease-Associated Genes Filter

We retained the patients with both tumor and normal samples and used them to identify differentially expressed genes using the R package edgeR software (Robinson et al., 2010). We set genes with FDR < 0.05, |log FC| > 1 as differentially expressed genes.

Apart from the differentially expressed genes, we investigated the BRCA-associated copy number variation (CNV) genes and differentially methylated genes to further investigate the BRCA associated modules. This was based on the notion that the expression of these genes may be altered and reflect the ceRNA interactions. We calculated the level of copy number variation for each gene in each sample using GISTIC2.0 software (Mermel et al., 2011), and identified genes that did not equal two in >5% of samples as CNV genes.

For DNA methylation, we used Illumina 450K methylation data. We initially identified differentially methylated sites (DMS) using Limma (Ritchie et al., 2015). The sites, which showed statistical significance with a FDR < 0.05 and had mean methylation differences between disease and normal states >0.2, were marked as DMS. Subsequently, we isolated the sites located in the regions between 2,000bp upstream and downstream of the start position of ceRNA candidate genes and mapped them to related genes. For genes related to more than one DMS and more than three-quarters of these DMSs exhibit the same direction of change in methylation, we set these genes as differential methylation genes (Kim et al., 2012).

Data Availability Statement

The RNA-seq data, microRNA-seq data, CNV data and gene methylation data are from The Cancer Genome Atlas (TCGA). The source codes of LAceModule is available at https://github.com/GaoLabXDU/LAceModule.

Author Contributions

XW and LG conceived the study. XW and YH designed the experiments. XW performed the experiments and analyses. All authors discussed the results and contributed to the final manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61532014, 61672407, and 61432010), and the National Key R&D Program of China (Grant No. 2018YFC0910400).

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2020.00235/full#supplementary-material

Supplementary Table 1. Most frequently enriched KEGG pathways.

Supplementary Table 2. Summary of BRCA-dysregulated genes.

Supplementary Table 3. Top 15 most frequently shared microRNAs in more than three modules.

Supplementary File 1. Heatmap of PCC and LA values of modules enriched for GO:0007155 in disease and normal states (top right: disease state, bottom left: normal state).

Supplementary File 2. Heatmap of PCC and LA values of modules enriched for KEGG:04514 in disease and normal states (top right: disease state, bottom left: normal state).

Supplementary File 3. Heatmap of PCC and LA values of modules enriched for KEGG:04510 in disease and normal states (top right: disease state, bottom left: normal state).

Supplementary File 4. Profiles of differentially expressed genes.

Supplementary File 5. BRCA-associated modules (Supplementary of Figure 5A).

Supplementary File 6. Kaplan–Meier curves for individual genes in Figure 6C.

Supplementary File 7. Kaplan–Meier curves of prognosis marker modules (left) and the corresponding expression profiles (right). (Supplementary of Figure 7).

Supplementary File 8. Comparison of the results of different values for the parameter λLA.

Supplementary File 9. Patients groups by the expression of let-7 family and the gene set in Figure 6.

Supplementary File 10. The evaluation of Van der Waerden's method.

References

Ai, D., Li, X., Pan, H., Chen, J., Cram, J. A., and Xia, L. C. (2019). Explore mediated co-varying dynamics in microbial community using integrated local similarity and liquid association analysis. BMC Genomics 20:185. doi: 10.1186/s12864-019-5469-8

PubMed Abstract | CrossRef Full Text | Google Scholar

Ala, U., Karreth, F. A., Bosia, C., Pagnani, A., Taulli, R., Léopold, V., et al. (2013). Integrated transcriptional and competitive endogenous RNA networks are cross-regulated in permissive molecular environments. Proc. Natl. Acad. Sci. U.S.A. 110, 7154–7159. doi: 10.1073/pnas.1222509110

PubMed Abstract | CrossRef Full Text | Google Scholar

Alibek, K., Kakpenova, A., Mussabekova, A., Sypabekova, M., and Karatayeva, N. (2013). Role of viruses in the development of breast cancer. Infect. Agent. Cancer 8:32. doi: 10.1186/1750-9378-8-32

PubMed Abstract | CrossRef Full Text | Google Scholar

Amarante, M. K., and Watanabe, M. A. (2009). The possible involvement of virus in breast cancer. J. Cancer Res. Clin. Oncol. 135, 329–337. doi: 10.1007/s00432-008-0511-2

PubMed Abstract | CrossRef Full Text | Google Scholar

Bartel, D. P. (2009). MicroRNAs: target recognition and regulatory functions. Cell 136, 215–233. doi: 10.1016/j.cell.2009.01.002

PubMed Abstract | CrossRef Full Text | Google Scholar

Bates, J. P., Derakhshandeh, R., Jones, L., and Webb, T. J. (2018). Mechanisms of immune evasion in breast cancer. BMC Cancer 18:556. doi: 10.1186/s12885-018-4441-3

PubMed Abstract | CrossRef Full Text | Google Scholar

Brooks, M. D., and Wicha, M. S. (2015). Tumor twitter: cellular communication in the breast cancer stem cell niche. Cancer Discov. 5, 469–471. doi: 10.1158/2159-8290.CD-15-0327

PubMed Abstract | CrossRef Full Text | Google Scholar

Cavallaro, U., and Christofori, G. (2001). Cell adhesion in tumor invasion and metastasis: loss of the glue is not enough. Biochim. Biophys. Acta 1552, 39–45. doi: 10.1016/s0304-419x(01)00038-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Chiu, S. C., Chung, H. Y., Cho, D. Y., Chan, T. M., Liu, M. C., Huang, H. M., et al. (2014). Therapeutic potential of microRNA let-7: tumor suppression or impeding normal stemness. Cell Transplant. 23, 459–469. doi: 10.3727/096368914X678418

PubMed Abstract | CrossRef Full Text | Google Scholar

Chiu, Y. C., Hsiao, T. H., Chen, Y., and Chuang, E. Y. (2015). Parameter optimization for constructing competing endogenous RNA regulatory network in glioblastoma multiforme and other cancers. BMC Genomics 16 (Suppl. 4):S1. doi: 10.1186/1471-2164-16-S4-S1

PubMed Abstract | CrossRef Full Text | Google Scholar

Chou, C. H., Chang, N. W., Shrestha, S., Hsu, S. D., Lin, Y. L., Lee, W. H., et al. (2016). miRTarBase 2016: updates to the experimentally validated miRNA-target interactions database. Nucleic Acids Res. 44, D239–247. doi: 10.1093/nar/gkv1258

PubMed Abstract | CrossRef Full Text | Google Scholar

Cunningham, H. D., Shannon, L. A., Calloway, P. A., Fassold, B. C., Dunwiddie, I., Vielhauer, G., et al. (2010). Expression of the C-C chemokine receptor 7 mediates metastasis of breast cancer to the lymph nodes in mice. Transl. Oncol. 3, 354–361. doi: 10.1593/tlo.10178

PubMed Abstract | CrossRef Full Text | Google Scholar

Do, D., and Bozdag, S. (2018). Cancerin: a computational pipeline to infer cancer-associated ceRNA interaction networks. PLoS Comput. Biol. 14:e1006318. doi: 10.1371/journal.pcbi.1006318

PubMed Abstract | CrossRef Full Text | Google Scholar

Dweep, H., and Gretz, N. (2015). miRWalk2.0: a comprehensive atlas of microRNA-target interactions. Nat. Methods 12:697. doi: 10.1038/nmeth.3485

PubMed Abstract | CrossRef Full Text | Google Scholar

Figliuzzi, M., Marinari, E., and De Martino, A. (2013). MicroRNAs as a selective channel of communication between competing RNAs: a steady-state theory. Biophys. J. 104, 1203–1213. doi: 10.1016/j.bpj.2013.01.012

PubMed Abstract | CrossRef Full Text | Google Scholar

Fry, S. A., Robertson, C. E., Swann, R., and Dwek, M. V. (2016). Cadherin-5: a biomarker for metastatic breast cancer with optimum efficacy in oestrogen receptor-positive breast cancers with vascular invasion. Br. J. Cancer 114, 1019–1026. doi: 10.1038/bjc.2016.66

PubMed Abstract | CrossRef Full Text | Google Scholar

Gendelman, R., Xing, H., Mirzoeva, O. K., Sarde, P., Curtis, C., Feiler, H. S., et al. (2017). Bayesian network inference modeling identifies TRIB1 as a novel regulator of cell-cycle progression and survival in cancer cells. Cancer Res. 77, 1575–1585. doi: 10.1158/0008-5472.CAN-16-0512

PubMed Abstract | CrossRef Full Text | Google Scholar

Gunderson, T., and Ho, Y. Y. (2014). An efficient algorithm to explore liquid association on a genome-wide scale. BMC Bioinform. 15:371. doi: 10.1186/s12859-014-0371-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Hanahan, D., and Weinberg, R. A. (2011). Hallmarks of cancer: the next generation. Cell 144, 646–674. doi: 10.1016/j.cell.2011.02.013

PubMed Abstract | CrossRef Full Text | Google Scholar

Hu, J., Li, X., Guo, X., Guo, Q., Xiang, C., Zhang, Z., et al. (2017). The CCR2 3'UTR functions as a competing endogenous RNA to inhibit breast cancer metastasis. J. Cell Sci. 130, 3399–3413. doi: 10.1242/jcs.202127

PubMed Abstract | CrossRef Full Text | Google Scholar

Hubert, L., and Schultz, J. (1976). Quadratic assignment as a general data analysis strategy. Brit. J. Math. Stat. Psy. 29, 190–241. doi: 10.1111/j.2044-8317.1976.tb00714.x

CrossRef Full Text | Google Scholar

Jiang, Y., Pan, Y., Rhea, P. R., Tan, L., Gagea, M., Cohen, L., et al. (2016). A sucrose-enriched diet promotes tumorigenesis in mammary gland in part through the 12-lipoxygenase pathway. Cancer Res. 76, 24–29. doi: 10.1158/0008-5472.CAN-14-3432

PubMed Abstract | CrossRef Full Text | Google Scholar

Johnsson, P., Ackley, A., Vidarsdottir, L., Lui, W. O., Corcoran, M., Grandér, D., et al. (2013). A pseudogene long-noncoding-RNA network regulates PTEN transcription and translation in human cells. Nat. Struct. Mol. Biol. 20, 440–446. doi: 10.1038/nsmb.2516

PubMed Abstract | CrossRef Full Text | Google Scholar

Kakolyris, S., Fox, S. B., Koukourakis, M., Giatromanolaki, A., Brown, N., Leek, R. D., et al. (2000). Relationship of vascular maturation in breast cancer blood vessels to vascular density and metastasis, assessed by expression of a novel basement membrane component, LH39. Br. J. Cancer 82, 844–851. doi: 10.1054/bjoc.1999.1010

PubMed Abstract | CrossRef Full Text | Google Scholar

Karreth, F. A., and Pandolfi, P. P. (2013). ceRNA Cross-talk in cancer: when ce-bling rivalries go awry. Cancer Discov. 3, 1113–1121. doi: 10.1158/2159-8290.CD-13-0202

PubMed Abstract | CrossRef Full Text | Google Scholar

Kim, D. H., and Wirtz, D. (2013). Focal adhesion size uniquely predicts cell migration. FASEB J. 27, 1351–1361. doi: 10.1096/fj.12-220160

PubMed Abstract | CrossRef Full Text | Google Scholar

Kim, J. W., Kim, S. T., Turner, A. R., Young, T., Smith, S., Liu, W., et al. (2012). Identification of new differentially methylated genes that have potential functional consequences in prostate cancer. PLoS ONE 7:e48455. doi: 10.1371/journal.pone.0048455

PubMed Abstract | CrossRef Full Text | Google Scholar

Lawson, J. S., Salmons, B., and Glenn, W. K. (2018). Oncogenic viruses and breast cancer: mouse mammary tumor virus (MMTV), bovine leukemia virus (BLV), human papilloma virus (HPV), and epstein–barr virus (EBV). Front. Oncol. 8:1. doi: 10.3389/fonc.2018.00001

PubMed Abstract | CrossRef Full Text | Google Scholar

Le, T. D., Zhang, J., Liu, L., and Li, J. (2017). Computational methods for identifying miRNA sponge interactions. Brief. Bioinform. 18, 577–590. doi: 10.1093/bib/bbw042

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, K. C. (2002). Genome-wide coexpression dynamics: theory and application. Proc. Natl. Acad. Sci. U.S.A. 99, 16875–16880. doi: 10.1073/pnas.252466999

PubMed Abstract | CrossRef Full Text | Google Scholar

Li, K. C., Palotie, A., Yuan, S., Bronnikov, D., Chen, D., Wei, X., et al. (2007). Finding disease candidate genes by liquid association. Genome Biol. 8:R205. doi: 10.1186/gb-2007-8-10-r205

PubMed Abstract | CrossRef Full Text | Google Scholar

List, M., Dehghani Amirabad, A., Kostka, D., and Schulz, M. H. (2019). Large-scale inference of competing endogenous RNA networks with sparse partial correlation. Bioinformatics 35, i596–i604. doi: 10.1093/bioinformatics/btz314

PubMed Abstract | CrossRef Full Text | Google Scholar

Liu, J., Wang, C., Gao, J., and Han, J. (2013). “Multi-view clustering via joint nonnegative matrix factorization,” in Proceedings of the 2013 SIAM International Conference on Data Mining: SIAM (Austin, TX), 252–260. doi: 10.1137/1.9781611972832.28

CrossRef Full Text | Google Scholar

Liu, Y., Sun, J., and Zhao, M. (2017). ONGene: a literature-based database for human oncogenes. J. Genet. Genomics 44, 119–121. doi: 10.1016/j.jgg.2016.12.004

PubMed Abstract | CrossRef Full Text | Google Scholar

Maragkakis, M., Vergoulis, T., Alexiou, P., Reczko, M., Plomaritou, K., Gousis, M., et al. (2011). DIANA-microT web server upgrade supports fly and worm miRNA target prediction and bibliographic miRNA to disease association. Nucleic Acids Res. 39, W145–W148. doi: 10.1093/nar/gkr294

PubMed Abstract | CrossRef Full Text | Google Scholar

Martirosyan, A., Del Giudice, M., Bena, C. E., Pagnani, A., Bosia, C., and De Martino, A. (2019). Kinetic modelling of competition and depletion of shared miRNAs by competing endogenous RNAs. Methods Mol. Biol. 1912, 367–409. doi: 10.1007/978-1-4939-8982-9_15

PubMed Abstract | CrossRef Full Text | Google Scholar

McClain, J. O., and Rao, V. R. (1975). Clustisz: a program to test for the quality of clustering of a set of objects. J. Mark. Res. 12, 456–460.

Google Scholar

Mermel, C. H., Schumacher, S. E., Hill, B., Meyerson, M. L., Beroukhim, R., and Getz, G. (2011). GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 12:R41. doi: 10.1186/gb-2011-12-4-r41

PubMed Abstract | CrossRef Full Text | Google Scholar

Milligan, G. W. (1981). A monte carlo study of thirty internal criterion measures for cluster analysis. Psychometrika 46, 187–199. doi: 10.1007/BF02293899

CrossRef Full Text | Google Scholar

Mullokandov, G., Baccarini, A., Ruzo, A., Jayaprakash, A. D., Tung, N., Israelow, B., et al. (2012). High-throughput assessment of microRNA activity and function using microRNA sensor and decoy libraries. Nat. Methods 9, 840–846. doi: 10.1038/nmeth.2078

PubMed Abstract | CrossRef Full Text | Google Scholar

Okegawa, T., Pong, R. C., Li, Y., and Hsieh, J. T. (2004). The role of cell adhesion molecule in cancer progression and its application in cancer therapy. Acta Biochim. Pol. 51, 445–457. doi: 10.18388/abp.2004_3583

PubMed Abstract | CrossRef Full Text | Google Scholar

Otto, T., and Sicinski, P. (2017). Cell cycle proteins as promising targets in cancer therapy. Nat. Rev. Cancer 17, 93–115. doi: 10.1038/nrc.2016.138

PubMed Abstract | CrossRef Full Text | Google Scholar

Paci, P., Colombo, T., and Farina, L. (2014). Computational analysis identifies a sponge interaction network between long non-coding RNAs and messenger RNAs in human breast cancer. BMC Syst. Biol. 8:83. doi: 10.1186/1752-0509-8-83

PubMed Abstract | CrossRef Full Text | Google Scholar

Pian, C., Zhang, G., Tu, T., Ma, X., and Li, F. (2018). LncCeRBase: a database of experimentally validated human competing endogenous long non-coding RNAs. Database 2018:bay061. doi: 10.1093/database/bay061

PubMed Abstract | CrossRef Full Text | Google Scholar

Poliseno, L., Haimovic, A., Christos, P. J., Vega y Saenz de Miera, E. C., Shapiro, R., Pavlick, A., et al. (2011). Deletion of PTENP1 pseudogene in human melanoma. J. Invest. Dermatol. 131, 2497–2500. doi: 10.1038/jid.2011.232

PubMed Abstract | CrossRef Full Text | Google Scholar

Poliseno, L., and Pandolfi, P. P. (2015). PTEN ceRNA networks in human cancer. Methods 77–78, 41–50. doi: 10.1016/j.ymeth.2015.01.013

CrossRef Full Text | Google Scholar

Poliseno, L., Salmena, L., Zhang, J., Carver, B., Haveman, W. J., and Pandolfi, P. P. (2010). A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature 465, 1033–1038. doi: 10.1038/nature09144

PubMed Abstract | CrossRef Full Text | Google Scholar

Qi, X., Zhang, D. H., Wu, N., Xiao, J. H., Wang, X., and Ma, W. (2015). ceRNA in cancer: possible functions and clinical implications. J. Med. Genet. 52, 710–718. doi: 10.1136/jmedgenet-2015-103334

PubMed Abstract | CrossRef Full Text | Google Scholar

Reimand, J., Kull, M., Peterson, H., Hansen, J., and Vilo, J. (2007). g:Profiler–a web-based toolset for functional profiling of gene lists from large-scale experiments. Nucleic Acids Res. 35, W193–W200. doi: 10.1093/nar/gkm226

CrossRef Full Text | Google Scholar

Ridley, A. J., Schwartz, M. A., Burridge, K., Firtel, R. A., Ginsberg, M. H., Borisy, G., et al. (2003). Cell migration: integrating signals from front to back. Science 302, 1704–1709. doi: 10.1126/science.1092053

PubMed Abstract | CrossRef Full Text | Google Scholar

Ritchie, M. E., Phipson, B., Wu, D., Hu, Y., Law, C. W., Shi, W., et al. (2015). Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43:e47. doi: 10.1093/nar/gkv007

PubMed Abstract | CrossRef Full Text | Google Scholar

Robinson, M. D., McCarthy, D. J., and Smyth, G. K. (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140. doi: 10.1093/bioinformatics/btp616

PubMed Abstract | CrossRef Full Text | Google Scholar

Rousseeuw, P. (1987). Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65. doi: 10.1016/0377-0427(87)90125-7

CrossRef Full Text | Google Scholar

Salmena, L., Poliseno, L., Tay, Y., Kats, L., and Pandolfi, P. P. (2011). A ceRNA hypothesis: the rosetta stone of a hidden RNA language? Cell 146, 353–358. doi: 10.1016/j.cell.2011.07.014

PubMed Abstract | CrossRef Full Text | Google Scholar

Schmitt, A. M., and Chang, H. Y. (2016). Long noncoding RNAs in cancer pathways. Cancer Cell 29, 452–463. doi: 10.1016/j.ccell.2016.03.010

PubMed Abstract | CrossRef Full Text | Google Scholar

Stein, C. M. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Stat. 9, 1135–1151. doi: 10.1214/aos/1176345632

CrossRef Full Text | Google Scholar

Strehl, A., and Ghosh, J. (2003). Cluster ensembles—a knowledge reuse framework for combining multiple partitions. J. Mach. Learn. Res. 3, 583–617. doi: 10.1162/153244303321897735

CrossRef Full Text | Google Scholar

Sumazin, P., Yang, X., Chiu, H. S., Chung, W. J., Iyer, A., Llobet-Navas, D., et al. (2011). An extensive microRNA-mediated network of RNA-RNA interactions regulates established oncogenic pathways in glioblastoma. Cell 147, 370–381. doi: 10.1016/j.cell.2011.09.041

PubMed Abstract | CrossRef Full Text | Google Scholar

Tay, Y., Rinn, J., and Pandolfi, P. P. (2014). The multilayered complexity of ceRNA crosstalk and competition. Nature 505, 344–352. doi: 10.1038/nature12986

PubMed Abstract | CrossRef Full Text | Google Scholar

Thomson, D. W., and Dinger, M. E. (2016). Endogenous microRNA sponges: evidence and controversy. Nat. Rev. Genet. 17, 272–283. doi: 10.1038/nrg.2016.20

PubMed Abstract | CrossRef Full Text | Google Scholar

Tomita, Y., Dorward, H., Yool, A. J., Smith, E., Townsend, A. R., Price, T. J., et al. (2017). Role of aquaporin 1 signalling in cancer development and progression. Int. J. Mol. Sci. 18:99. doi: 10.3390/ijms18020299

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, B., Liu, M., Song, Y., Li, C., Zhang, S., and Ma, L. (2019). KLF2 inhibits the migration and invasion of prostate cancer cells by downregulating MMP2. Am. J. Mens. Health 13:1557988318816907. doi: 10.1177/1557988318816907

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, H. G., Cao, B., Zhang, L. X., Song, N., Li, H., Zhao, W. Z., et al. (2017). KLF2 inhibits cell growth via regulating HIF-1alpha/Notch-1 signal pathway in human colorectal cancer HCT116 cells. Oncol. Rep. 38, 584–590. doi: 10.3892/or.2017.5708

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, J., Liu, X., Wu, H., Ni, P., Gu, Z., Qiao, Y., et al. (2010). CREB up-regulates long non-coding RNA, HULC expression through interaction with microRNA-372 in liver cancer. Nucleic Acids Res. 38, 5366–5383. doi: 10.1093/nar/gkq285

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, P., Gao, L., Hu, Y., and Li, F. (2018). Feature related multi-view nonnegative matrix factorization for identifying conserved functional modules in multiple biological networks. BMC Bioinform. 19:394. doi: 10.1186/s12859-018-2434-5

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, P., Ning, S., Zhang, Y., Li, R., Ye, J., Zhao, Z., et al. (2015a). Identification of lncRNA-associated competing triplets reveals global patterns and prognostic markers for cancer. Nucleic Acids Res. 43, 3478–3489. doi: 10.1093/nar/gkv233

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, P., Zhi, H., Zhang, Y., Liu, Y., Zhang, J., Gao, Y., et al. (2015b). miRSponge: a manually curated database for experimentally supported miRNA sponges and ceRNAs. Database. 2015:bav098. doi: 10.1093/database/bav098

PubMed Abstract | CrossRef Full Text | Google Scholar

Wang, X., and El Naqa, I. M. (2008). Prediction of both conserved and nonconserved microRNA targets in animals. Bioinformatics 24, 325–332. doi: 10.1093/bioinformatics/btm595

PubMed Abstract | CrossRef Full Text | Google Scholar

Wei, L., Yuan, Y., Hu, T., Li, S., Cheng, T., Lei, J., et al. (2019). Regulation by competition: a hidden layer of gene regulatory network. J. Quant. Biol. 7, 110–121. doi: 10.1007/s40484-018-0162-5

CrossRef Full Text | Google Scholar

Weinstein, J. N., Collisson, E. A., Mills, G. B., Shaw, K. R. M., Ozenberger, B. A., Ellrott, K., et al. (2013). The cancer genome atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120. doi: 10.1038/ng.2764

PubMed Abstract | CrossRef Full Text | Google Scholar

Yang, J., Li, T., Gao, C., Lv, X., Liu, K., Song, H., et al. (2014). FOXO1 3'UTR functions as a ceRNA in repressing the metastases of breast cancer cells via regulating miRNA activity. FEBS Lett. 588, 3218–3224. doi: 10.1016/j.febslet.2014.07.003

PubMed Abstract | CrossRef Full Text | Google Scholar

Yang, J., Qin, Y., Zhang, T., Wang, F., Peng, L., Zhu, L., et al. (2018). Identification of human age-associated gene co-expressions in functional modules using liquid association. Oncotarget 9, 1063–1074. doi: 10.18632/oncotarget.23148

PubMed Abstract | CrossRef Full Text | Google Scholar

Yates, L. A., Norbury, C. J., and Gilbert, R. J. (2013). The long and short of microRNA. Cell 153, 516–519. doi: 10.1016/j.cell.2013.04.003

PubMed Abstract | CrossRef Full Text | Google Scholar

Yu, G., Yao, W., Gumireddy, K., Li, A., Wang, J., Xiao, W., et al. (2014). Pseudogene PTENP1 functions as a competing endogenous RNA to suppress clear-cell renal cell carcinoma progression. Mol. Cancer Ther. 13, 3086–3097. doi: 10.1158/1535-7163.MCT-14-0245

PubMed Abstract | CrossRef Full Text | Google Scholar

Yu, T. (2018). A new dynamic correlation algorithm reveals novel functional aspects in single cell and bulk RNA-seq data. PLoS Comput. Biol. 14:e1006391. doi: 10.1371/journal.pcbi.1006391

PubMed Abstract | CrossRef Full Text | Google Scholar

Yuan, Y., Liu, B., Xie, P., Zhang, M. Q., Li, Y., Xie, Z., et al. (2015). Model-guided quantitative analysis of microRNA-mediated regulation on competing endogenous RNAs using a synthetic gene circuit. Proc. Natl. Acad. Sci. U.S.A. 112, 3158–3163. doi: 10.1073/pnas.1413896112

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, J., Liu, L., Li, J., and Le, T. D. (2018). LncmiRSRN: identification and analysis of long non-coding RNA related miRNA sponge regulatory network in human cancer. Bioinformatics 34, 4232–4240. doi: 10.1093/bioinformatics/bty525

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhang, W., Levi, L., Banerjee, P., Jain, M., and Noy, N. (2015). Kruppel-like factor 2 suppresses mammary carcinoma growth by regulating retinoic acid signaling. Oncotarget 6, 35830–35842. doi: 10.18632/oncotarget.5767

PubMed Abstract | CrossRef Full Text | Google Scholar

Zheng, L., Zhang, Z., Zhang, S., Guo, Q., Zhang, F., Gao, L., et al. (2018). RNA binding protein RNPC1 inhibits breast cancer cell metastasis via activating STARD13-correlated ceRNA network. Mol. Pharm. 15, 2123–2132. doi: 10.1021/acs.molpharmaceut.7b01123

PubMed Abstract | CrossRef Full Text | Google Scholar

Zheng, Q., Bao, C., Guo, W., Li, S., Chen, J., Chen, B., et al. (2016). Circular RNA profiling reveals an abundant circHIPK3 that regulates cell growth by sponging multiple miRNAs. Nat. Commun. 7:11215. doi: 10.1038/ncomms11215

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhong, Y., Du, Y., Yang, X., Mo, Y., Fan, C., Xiong, F., et al. (2018). Circular RNAs function as ceRNAs to regulate and control human cancer progression. Mol. Cancer. 17:79. doi: 10.1186/s12943-018-0827-8

PubMed Abstract | CrossRef Full Text | Google Scholar

Zhou, X., Liu, J., and Wang, W. (2014). Construction and investigation of breast-cancer-specific ceRNA network based on the mRNA and miRNA expression data. IET Syst. Biol. 8, 96–103. doi: 10.1049/iet-syb.2013.0025

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: ceRNA, microRNA, correlation, liquid association, modules

Citation: Wen X, Gao L and Hu Y (2020) LAceModule: Identification of Competing Endogenous RNA Modules by Integrating Dynamic Correlation. Front. Genet. 11:235. doi: 10.3389/fgene.2020.00235

Received: 11 December 2019; Accepted: 27 February 2020;
Published: 18 March 2020.

Edited by:

Jie Sun, Wenzhou Medical University, China

Reviewed by:

Quan Zou, University of Electronic Science and Technology of China, China
Fengbiao Mao, University of Michigan, United States

Copyright © 2020 Wen, Gao and Hu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Lin Gao, lgao@mail.xidian.edu.cn

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.