Abstract
Basal and luminal subtypes of muscle-invasive bladder cancer (MIBC) have distinct molecular profiles and heterogeneous clinical behaviors. The interactions between mRNAs and lncRNAs, which might be regulated by miRNAs, have crucial roles in many cancers. However, the miRNA-dependent crosstalk between lncRNA and mRNA in specific MIBC subtypes remains unclear. In this study, we first classified MIBC into two conservative subtypes using miRNA, mRNA and lncRNA expression data derived from The Cancer Genome Atlas. We then investigated subtype-related biological pathways and evaluated the subtype classification performance using Decision Trees, Random Forest and eXtreme Gradient Boosting (XGBoost). Finally, we explored potential miRNA-mediated lncRNA-mRNA crosstalks based on co-expression analysis. Our results show that: (1) the luminal subtype is primarily characterized by upregulation of metabolism-related pathways, while the basal subtype is predominantly characterized by upregulation of epithelial-mesenchymal transition, metastasis, and immune system process-related pathways; (2) the XGBoost prediction model is consistently robust for classification of the molecular subtypes of MIBC across four datasets (area under the ROC curve > 0.9); (3) the expression levels of the molecules in the miR-200c and miR-141-mediated lncRNA-mRNA crosstalks differ considerably between the two subtypes and are closely related to the prognosis of MIBC. The miR-200c and miR-141-dependent mRNA-lncRNA crosstalks might be of great significance in tumorigenesis and tumor progression and may serve as novel prognostic predictors and classification markers of MIBC subtypes.
Introduction
Urothelial bladder cancer (UBC) is one of the most common malignant tumors of the urinary system. UBC can generally be classified into non-muscle-invasive bladder cancer (NMIBC) and muscle-invasive bladder cancer (MIBC), according to whether the cancer cells are confined to the lamina propria or invade the muscularis propria. Many studies have reported that, according to shared RNA expression patterns or specific genomic alterations, MIBC can be further classified into two major subtypes, namely basal and luminal, which are strikingly similar to the molecular subtypes first described in breast cancer. The basal subtype has drawn much attention because it is associated with a more aggressive phenotype and has a higher risk of distant metastasis than the luminal subtype. One reason for the difference is that the two subtypes develop through etiologically different pathways. Pathways involved in EMT and immune-associated pathways are upregulated in the basal subtype. The molecular biomarkers and pathways involved in MIBC subtypes are the key to understanding subtype heterogeneity and identifying subtype-specific biomarkers that can be used to better manage MIBC patients.
MicroRNAs (miRNAs) represent one of the most exciting areas of modern medical and biological sciences, as they modulate an immense and complex regulatory network of gene expression in a broad spectrum of developmental and cellular processes, such as cell proliferation, metabolism, apoptosis, and viral infection. miRNAs not only have a well-established inhibitory effect on gene expression but also promote gene expression in some cases, and long non-coding RNAs (lncRNAs) exhibit facilitative or suppressive effects on the gene regulatory network during tumor development. Furthermore, aberration or perturbation in miRNA-mediated mRNA and lncRNA expression levels is significantly correlated with serious clinical consequences, including diseases of diverse origins and malignancy.
Regarding molecular drivers of cancer development, oncogenic mutations and downstream signaling pathways in the pre-cancerous or cancerous cell have been thought to play a crucial role in cancer formation and progression. In addition, recent studies have shown that metabolic reprogramming plays a much more important role in cancer development than previously thought. It is possible that many of the genomic mutations detected in cancer provide a selective advantage for the cancer cell in the stressful tumor microenvironment by reprogramming cell metabolic processes. Whatever the primary cause of cancer development, it is clear that both oncogenic signaling and reprogrammed metabolism involve numerous genes working in a concerted manner in a complex network. A gene regulatory network-based view can, therefore, provide deeper insight into cancer development.
The aim of this study is to identify subtype-specific dysregulated miRNA-mediated mRNA-lncRNA interactions and discover new critical subtype-related genes in MIBC.
Materials and Methods
Data Acquisition and Pre-processing
The MIBC RNA-Seq (FPKM) and clinical data were obtained from The Cancer Genome Atlas (TCGA) public data portal1, and miRNA-Seq (RPM) data was downloaded from the Broad GDAC Firehose2. The gene expression datasets of 403 tumor samples and 19 adjacent normal tissue samples contain 19181 mRNAs, 14376 lncRNAs, and 2588 mature miRNAs. The microarray datasets (GSE32894, GSE13507, and GSE31684) derived from Gene Expression Omnibus (GEO) were used to evaluate the performance of classifiers and verify the prognostic use of marker genes3.
Clustering Analysis and Gene Set Enrichment Analysis
Consensus clustering (CC) is a method that provides quantitative evidence for determining the number and membership of possible clusters within a dataset, such as RNA-Seq or microarray data. For CC analysis, the RPKM gene expression data were pre-processed to select the most highly expressed and variable genes across samples. We first removed the 25% of genes with the lowest arithmetic mean expression across samples. The median absolute deviation (MAD) was then used to select the most highly expressed and variable 3,000 mRNAs, 300 miRNAs, and 3,000 lncRNAs. CC, as implemented in the R package “ConsensusClusterPlus”, was performed on these 3,000 mRNAs, 300 miRNAs, and 3,000 lncRNAs across 403 tumor samples, using the following key parameters: reps = 50, innerLinkage = complete, clusterAlg = hc, k = 6, and distance = pearson.
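The filtering step described above can be sketched in a few lines. The snippet below is a minimal Python illustration, not the authors' R code: the function names `mad` and `select_variable_genes`, the toy gene names, and the toy values are all hypothetical, and the only elements taken from the text are the 25% low-mean cut and the ranking by MAD.

```python
from statistics import mean, median

def mad(values):
    """Median absolute deviation of a list of numbers."""
    m = median(values)
    return median(abs(v - m) for v in values)

def select_variable_genes(expr, low_mean_fraction=0.25, top_n=3000):
    """expr maps a gene name to its expression values across samples."""
    # Rank genes by arithmetic mean and drop the lowest fraction.
    by_mean = sorted(expr, key=lambda g: mean(expr[g]))
    n_drop = int(len(by_mean) * low_mean_fraction)
    kept = by_mean[n_drop:]
    # Among the remaining genes, keep the top_n with the largest MAD.
    by_mad = sorted(kept, key=lambda g: mad(expr[g]), reverse=True)
    return by_mad[:top_n]

# Hypothetical toy matrix: 4 genes measured in 4 samples.
toy = {
    "g1": [0.1, 0.1, 0.2, 0.1],   # lowest mean, removed by the mean filter
    "g2": [5.0, 5.1, 5.0, 5.1],   # expressed but nearly constant
    "g3": [2.0, 9.0, 1.0, 12.0],  # expressed and highly variable
    "g4": [3.0, 6.0, 2.0, 7.0],   # expressed and moderately variable
}
selected = select_variable_genes(toy, low_mean_fraction=0.25, top_n=2)
print(selected)
```

With real data, `top_n` would be set to 3,000 for mRNAs and lncRNAs and to 300 for miRNAs, as stated in the text.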
Cluster of cluster (CoC) analysis is a method of integrating the primary clustering results into final cluster assignments. Each sample is represented as a binary vector of length $\sum_{i=1}^{t} K_i$ (where $t$ is the number of datasets and $K_i$ is the number of clusters for dataset $i$), on which the subsequent clustering analysis is performed. We first conducted the CoC analysis on the clustering results of the mRNA, miRNA, and lncRNA datasets to obtain a binary dataset. CC was then performed once more on the binary dataset to generate the final clusters. The number of final clusters (K) was estimated by commonly used methods, including ASW, CPCC, the relative change in area under the cumulative density function [Δ(K)], and PAC (Şenbabaoglu et al., 2014).
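To make the binary encoding concrete, the following Python sketch builds the CoC matrix from per-dataset cluster labels. It is an illustration under stated assumptions: the function name `coc_binary_matrix` and the toy labels are hypothetical, and labels are assumed to be integers from 1 to K_i for each dataset.

```python
def coc_binary_matrix(assignments, n_clusters):
    """Encode per-dataset cluster labels as one binary vector per sample.

    assignments: one list of labels (1..K_i) per dataset, same sample order.
    n_clusters: the number of clusters K_i for each dataset.
    """
    n_samples = len(assignments[0])
    rows = []
    for s in range(n_samples):
        row = []
        for labels, k in zip(assignments, n_clusters):
            # One indicator per cluster of this dataset; exactly one is 1.
            row.extend(1 if labels[s] == c else 0 for c in range(1, k + 1))
        rows.append(row)
    return rows

# Hypothetical labels for 4 samples from mRNA, miRNA and lncRNA clustering.
mrna = [1, 1, 2, 3]
mirna = [1, 2, 2, 2]
lncrna = [2, 2, 1, 1]
matrix = coc_binary_matrix([mrna, mirna, lncrna], n_clusters=[3, 2, 2])
for row in matrix:
    print(row)
```

Each row has length 3 + 2 + 2 = 7 and contains exactly one 1 per dataset, which is the vector that the second round of consensus clustering operates on.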
In order to explore subtype-associated biological processes, GSEA was conducted using three gene set collections (GO-BP, KEGG, and Hallmark gene sets). The following parameters were used for GSEA: Number of permutations = 1000, Permutation type = gene_set, Enrichment statistic = weighted, Metric for ranking genes = Signal2Noise.
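The Signal2Noise ranking metric is the difference of the class means divided by the sum of the class standard deviations. The Python sketch below shows only this basic formula; it assumes the sample standard deviation and omits the lower bound that the GSEA software places on the standard deviation. The gene names and values are hypothetical, with class A standing for the luminal group and class B for the basal group.

```python
from statistics import mean, stdev

def signal2noise(group_a, group_b):
    """(mean_A - mean_B) / (sd_A + sd_B) for one gene."""
    return (mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))

def rank_genes(expr_a, expr_b):
    """Return gene names ordered from most A-upregulated to most B-upregulated."""
    scores = {g: signal2noise(expr_a[g], expr_b[g]) for g in expr_a}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical expression values for three genes in two classes.
luminal = {"geneX": [8, 9, 10], "geneY": [5, 6, 7], "geneZ": [1, 2, 3]}
basal = {"geneX": [2, 3, 4], "geneY": [5, 6, 7], "geneZ": [7, 9, 11]}
ranking = rank_genes(luminal, basal)
print(ranking)
```

Under this sign convention, gene sets concentrated at the top of the ranking receive a positive enrichment score (upregulated in class A) and those at the bottom a negative one (upregulated in class B).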
Differentially Expressed Genes and Machine Learning
The R package “Ballgown” was used to identify DEGs between tumor and normal samples. An F-test was used in “Ballgown”, and DEGs were defined as those with an FDR-adjusted p-value < 0.05 (Benjamini–Hochberg method) and |log2 fold change| > 0.57.
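The two DEG thresholds can be illustrated with a generic Benjamini–Hochberg adjustment followed by a fold-change filter. This Python sketch is not the Ballgown implementation; the function names, gene names, p-values, and fold changes are hypothetical, and only the cut-offs (adjusted p < 0.05 and |log2 fold change| > 0.57) come from the text.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def call_degs(genes, pvalues, log2fc, fdr=0.05, min_abs_log2fc=0.57):
    """Keep genes with adjusted p < fdr and |log2 fold change| above the cut-off."""
    adj = benjamini_hochberg(pvalues)
    return [g for g, q, fc in zip(genes, adj, log2fc)
            if q < fdr and abs(fc) > min_abs_log2fc]

# Hypothetical test results for four genes.
genes = ["geneA", "geneB", "geneC", "geneD"]
pvals = [0.001, 0.04, 0.03, 0.20]
lfc = [1.2, 0.9, -0.3, 2.0]
adj = benjamini_hochberg(pvals)
degs = call_degs(genes, pvals, lfc)
print(adj)
print(degs)
```

In this toy case only geneA passes: geneB has a raw p-value below 0.05 but an adjusted value of about 0.053, which shows why the correction matters when thousands of genes are tested.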
Three tree-based machine learning methods, namely DTs, RF, and eXtreme Gradient Boosting (XGBoost or XG), were applied to the 3000 mRNAs, 300 miRNAs, and 3000 lncRNAs for MIBC subtype classification. The area under the ROC curve (AUC) was used to estimate the performance of the classification methods. For each classification method, MIBC samples were randomly divided into training (60%) and testing (40%) sets. We ran RF with different values of the parameters ntree and mtry, and used 10-fold cross-validation to obtain the mean accuracy. XGBoost was run with the following parameters: gamma = 1, min_child_weight = 1, max_depth = 14, nrounds = 2000. To optimize the XGBoost parameter iter (number of iterations), we obtained the 10-fold cross-validation performance for each iter value and selected the value that gave the best performance. For DTs, the following parameters were used: minCases = 20 and CF = 0.25. Moreover, the well-performing classifiers in this study were trained on the TCGA-derived RNA expression data and tested on GSE32894 to further evaluate their performance. All machine learning methods were implemented using R packages, including “C5.0”, “randomForest”, and “XGBoost”.
The overlap between the feature genes obtained by the well-performing classifiers and the DEGs was referred to as DEFGs. GO enrichment analysis, available in the R package “clusterProfiler”, was performed on the DEFGs to identify their enriched GO terms. Multiple-testing correction was done using the Benjamini–Hochberg method, and an adjusted p-value < 0.05 was considered statistically significant.
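The evaluation protocol (a random 60/40 split and AUC as the performance measure) can be sketched independently of any particular classifier. The Python snippet below is a minimal illustration that does not train DTs, RF, or XGBoost; the function names, the seed, and the toy scores are hypothetical, and AUC is computed as the fraction of positive-negative pairs that the score ranks correctly.

```python
import random

def random_split(n_samples, train_fraction=0.6, seed=0):
    """Randomly split sample indices into training and testing sets."""
    rng = random.Random(seed)
    idx = list(range(n_samples))
    rng.shuffle(idx)
    cut = round(n_samples * train_fraction)
    return sorted(idx[:cut]), sorted(idx[cut:])

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 403 tumor samples, as in the TCGA cohort described above.
train_idx, test_idx = random_split(403)
print(len(train_idx), len(test_idx))

# Hypothetical classifier scores for six test samples (1 = basal, 0 = luminal).
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(y_true, scores)
print(round(auc, 3))
```

In practice the scores would be the class probabilities predicted by the fitted model on the held-out 40% of samples.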
Construction of a Subtype-Related mRNA-miRNA-lncRNA Network
Pairwise Pearson’s correlation analysis was carried out on the DEFGs. The lncRNA-miRNA, miRNA-mRNA, and lncRNA-mRNA pairs with |r| ≥ 0.4 and p-value < 0.05 were considered to be co-expressed gene pairs. If both members of a co-expressed lncRNA-mRNA pair are also co-expressed with the same miRNA, the pair is defined as a miRNA-dependent lncRNA-mRNA co-expressed interaction. A miRNA-dependent lncRNA-mRNA network was built using Cytoscape (version 3.5.1). miRWalk2.0 integrates six widely used databases (miRWalk, miRanda, miRDB, miRNAMap, RNA22, and Targetscan) and supplies one of the largest available collections of predicted and experimentally verified miRNA-target interactions. Our inferred co-expressed interactions, including mRNA-miRNA and lncRNA-miRNA interactions, were compared with those derived from miRWalk2.0. An mRNA is considered a true target of a miRNA if their interaction occurs in at least four databases, and an lncRNA is considered a true target of a miRNA if their interaction is supported by at least one database among miRWalk, miRanda, and Targetscan.
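The definition of a miRNA-dependent lncRNA-mRNA interaction can be expressed as a simple triple check. The Python sketch below is an illustration only: the function names, gene names, and expression values are hypothetical, and for brevity it applies only the |r| ≥ 0.4 threshold and leaves out the p-value < 0.05 filter described in the text.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mirna_dependent_pairs(lncrnas, mrnas, mirnas, min_abs_r=0.4):
    """Return (miRNA, lncRNA, mRNA) triplets in which all three pairs are co-expressed."""
    def coexpressed(x, y):
        return abs(pearson(x, y)) >= min_abs_r

    triplets = []
    for l_name, l_expr in lncrnas.items():
        for m_name, m_expr in mrnas.items():
            if not coexpressed(l_expr, m_expr):
                continue  # the lncRNA-mRNA pair itself must be co-expressed
            for mi_name, mi_expr in mirnas.items():
                if coexpressed(mi_expr, l_expr) and coexpressed(mi_expr, m_expr):
                    triplets.append((mi_name, l_name, m_name))
    return triplets

# Hypothetical expression profiles across five samples.
lncrnas = {"lncA": [1, 2, 3, 4, 5]}
mrnas = {"mrnaA": [2, 4, 5, 8, 10], "mrnaB": [5, 1, 4, 1, 5]}
mirnas = {"mirX": [10, 8, 6, 4, 2]}
triplets = mirna_dependent_pairs(lncrnas, mrnas, mirnas)
print(triplets)
```

Here lncA and mrnaA rise together while mirX falls, so the triplet is retained with a negative miRNA correlation; mrnaB is uncorrelated with lncA and is dropped at the first check.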
Survival Analysis
We further assessed whether the genes in the inferred interactions are correlated with the overall survival of MIBC patients. Patient samples were divided into high- and low-expression groups based on the mean expression level of each gene. Survival analysis was performed with the R package “survival” using the Kaplan–Meier (K–M) method. A log-rank test was used to compare survival times between the two groups, and p < 0.05 was considered statistically significant.
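The grouping and comparison steps can be sketched as a mean split followed by a two-group log-rank test. This Python snippet is a minimal illustration rather than the R “survival” implementation: the function names and the toy cohort are hypothetical, K–M curve estimation is not shown, and the p-value uses the closed form of the chi-square survival function with one degree of freedom.

```python
from math import erfc, sqrt

def split_by_mean(expression):
    """Label each sample 1 (high) or 0 (low) relative to the mean expression."""
    cutoff = sum(expression) / len(expression)
    return [1 if v > cutoff else 0 for v in expression]

def logrank_test(times, events, groups):
    """Two-group log-rank test; events: 1 = death observed, 0 = censored."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e == 1}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n = len(at_risk)
        n1 = sum(at_risk)  # samples at risk in the high-expression group
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p_value

# Hypothetical cohort: high expressers die early, low expressers later.
expression = [9.0, 8.5, 9.5, 8.0, 2.0, 1.5, 2.5, 1.0]
times = [2, 3, 4, 5, 8, 9, 10, 12]
events = [1, 1, 1, 1, 1, 1, 0, 1]
groups = split_by_mean(expression)
chi2, p_value = logrank_test(times, events, groups)
print(groups, round(chi2, 2), p_value < 0.05)
```

A gene showing this pattern would be reported as associated with worse prognosis at high expression, since the test rejects equal survival at the 0.05 level.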
Results
Clustering Analysis and GSEA
We first performed CC on the mRNA, miRNA, and lncRNA expression datasets to obtain the clustering results. Applying the CoC analysis to the CC clustering outcomes yielded a binary dataset, referred to as the CoC dataset. CC was then performed again on the CoC dataset for different values of K, and ASW, CPCC, ΔK, and PAC were used to evaluate the optimal K (Supplementary Figure S1). For the CoC dataset, ASW suggests an optimal K of 6, whereas CPCC, ΔK, and PAC all indicate an optimal K of 2. Because three of the four criteria agree on K = 2, we chose K = 2 as the solution, dividing the MIBC samples into two subtypes, namely subtype-1 and subtype-2. The hierarchically clustered heatmap for K = 2 on the CoC dataset is shown in Figure 1A. Survival curves for the two subtypes were plotted using the K–M method. The 5-year overall survival rate is 55% for subtype-1 and 30% for subtype-2, indicating that the two subtypes differ considerably in clinical prognosis (Figure 1B, p < 0.01). The heatmap depicting basal biomarkers, luminal biomarkers, and clinical indicators for the two subtypes is shown in Figure 1C. Subtype-1 is characterized by high expression of luminal markers such as CYP2J2, ERBB2, and KRT18, while subtype-2 is characterized by high expression of basal markers such as CD44, CDH3, and KRT1. Pearson’s chi-squared test was used to compare clinical indicators between the two subtypes. Histology, stage, grade, and status differ significantly between the two subtypes, and the difference in gender approaches significance (Supplementary Table S1). In terms of K–M curves, biomarkers, and clinical indicators, subtype-1 and subtype-2 resemble the luminal and basal subtypes, respectively, and were therefore redefined as the luminal and basal subtypes.
FIGURE 1
Gene set enrichment analysis was performed for the basal and luminal subtypes, and the results are shown in Tables 1 and 2. Upregulated pathways in the luminal subtype are mainly involved in metabolism (e.g., oxidative phosphorylation, cytochrome P450, and fatty acid metabolism) (Table 1), whereas upregulated pathways in the basal subtype are principally related to immune system processes (e.g., extracellular structure organization, allograft rejection, mTORC1 signaling, and TNF-a signaling via NF-kB), metastasis, and EMT (Table 2).
Table 1
| Gene set name | Size | NES | FDR q-value |
|---|---|---|---|
| GO-BP | | | |
| GO monocarboxylic acid catabolic process | 95 | 2.5151 | 0 |
| GO oxidative phosphorylation | 73 | 2.4333 | 0 |
| GO fatty acid catabolic process | 80 | 2.4284 | 0 |
| GO fatty acid beta oxidation | 51 | 2.3211 | 0 |
| GO electron transport chain | 78 | 2.3116 | 0 |
| GO organic acid catabolic process | 92 | 2.2781 | 2.26E-04 |
| GO mitochondrial respiratory chain complex assembly | 42 | 2.1788 | 0.0014 |
| GO lipid oxidation | 63 | 2.1586 | 0.0015 |
| GO mitochondrial respiratory chain complex i biogenesis | 199 | 2.1449 | 0.0018 |
| GO establishment of protein localization to endoplasmic reticulum | 70 | 2.1285 | 0.0022 |
| KEGG | | | |
| KEGG ribosome | 87 | 2.3289 | 0 |
| KEGG alpha linolenic acid metabolism | 19 | 2.0770 | 5.29E-04 |
| KEGG metabolism of xenobiotics by cytochrome p450 | 68 | 2.0424 | 5.36E-04 |
| KEGG valine leucine and isoleucine degradation | 44 | 1.9767 | 0.00137 |
| KEGG drug metabolism cytochrome p450 | 70 | 1.9727 | 0.00120 |
| KEGG oxidative phosphorylation | 116 | 1.9372 | 0.00209 |
| KEGG peroxisome | 78 | 1.9320 | 0.00214 |
| KEGG fatty acid metabolism | 42 | 1.9184 | 0.00229 |
| KEGG retinol metabolism | 63 | 1.8697 | 0.00391 |
| KEGG linoleic acid metabolism | 29 | 1.8393 | 0.00482 |
| Hallmark gene sets | | | |
| Hallmark oxidative phosphorylation | 198 | 1.5145 | 0.05830 |
| Hallmark bile acid metabolism | 112 | 1.4110 | 0.07668 |
| Hallmark peroxisome | 103 | 1.4095 | 0.05174 |
| Hallmark adipogenesis | 191 | 1.3794 | 0.05125 |
| Hallmark fatty acid metabolism | 156 | 1.2596 | 0.11892 |
Top-ranked terms of GO-BP, KEGG and Hallmark gene sets for the luminal subtype.
NES, normalized enrichment score; GO-BP, Gene Ontology Biological Process; KEGG, Kyoto Encyclopedia of Genes and Genomes. Size is the number of genes in the gene set. A positive NES means that genes over-represented in the gene set are upregulated in the luminal subtype.
Table 2
| Gene set name | Size | NES | FDR q-value |
|---|---|---|---|
| GO-BP | | | |
| GO extracellular structure organization | 297 | –2.8256 | 0 |
| GO antigen processing and presentation of exogenous peptide antigen via mhc class i | 65 | –2.7258 | 0 |
| GO antigen processing and presentation | 206 | –2.6334 | 0 |
| GO antigen processing and presentation of peptide antigen | 170 | –2.6246 | 0 |
| GO antigen processing and presentation of peptide antigen via mhc class i | 90 | –2.6134 | 0 |
| GO chondroitin sulfate biosynthetic process | 25 | –2.6008 | 0 |
| GO collagen fibril organization | 36 | –2.5958 | 0 |
| GO regulation of innate immune response | 349 | –2.5825 | 0 |
| GO positive regulation of defense response | 360 | –2.5802 | 0 |
| GO cytokine mediated signaling pathway | 440 | –2.5675 | 0 |
| KEGG | | | |
| KEGG focal adhesion | 197 | –2.6862 | 0 |
| KEGG cytokine cytokine receptor interaction | 257 | –2.5127 | 0 |
| KEGG ecm receptor interaction | 84 | –2.512 | 0 |
| KEGG proteasome | 43 | –2.4802 | 0 |
| KEGG leishmania infection | 69 | –2.4718 | 0 |
| KEGG viral myocarditis | 68 | –2.4178 | 0 |
| KEGG hematopoietic cell lineage | 85 | –2.4134 | 0 |
| KEGG regulation of actin cytoskeleton | 211 | –2.3911 | 0 |
| KEGG allograft rejection | 35 | –2.3902 | 0 |
| KEGG autoimmune thyroid disease | 50 | –2.3778 | 0 |
| Hallmark gene sets | | | |
| Hallmark epithelial-mesenchymal transition | 197 | –3.2473 | 0 |
| Hallmark inflammatory response | 197 | –3.0190 | 0 |
| Hallmark interferon gamma response | 197 | –2.9964 | 0 |
| Hallmark interferon alpha response | 94 | –2.9491 | 0 |
| Hallmark allograft rejection | 199 | –2.9010 | 0 |
| Hallmark G2M checkpoint | 194 | –2.6389 | 0 |
| Hallmark E2F targets | 196 | –2.6177 | 0 |
| Hallmark TNF-a signaling via NF-kB | 198 | –2.5512 | 0 |
| Hallmark complement | 195 | –2.5475 | 0 |
| Hallmark mTORC1 signaling | 198 | –2.441 | 0 |
Top-ranked categories of GO-BP, KEGG and Hallmark gene sets for the basal subtype.
All abbreviations are the same as in Table 1. A negative NES value indicates that genes over-represented in the gene set are upregulated in the basal subtype.
Differentially Expressed Genes and Machine Learning
The DEGs that distinguish tumor from normal samples were analyzed and visualized as volcano plots (Supplementary Figures S2A–C). In total, 208 miRNAs (148 upregulated and 60 downregulated), 2488 lncRNAs (1402 upregulated and 1086 downregulated), and 4167 mRNAs (2314 upregulated and 1853 downregulated) are differentially expressed.
We applied DTs, RF, and XGBoost to basal and luminal subtype classification based on the mRNA, miRNA, and lncRNA expression datasets, and used AUC to evaluate their performance. As shown in Figure 2A, XGBoost outperforms RF and DTs, with AUC values of 98.6%, 94.5%, and 98.7% in mRNA-, miRNA-, and lncRNA-based classification, respectively. Details of the 10-fold cross-validation procedure can be found in Supplementary Figure S3. DTs were excluded from the subsequent comparison, as they are, on average, significantly inferior to RF and XG. Using the CC method, the GSE32894 dataset, containing 28 biomarkers and 190 samples, was grouped into two subtypes in preparation for the classification task. The heatmap plots and the K–M curves for the two subtypes are shown in Supplementary Figure S4. We trained the well-performing classifiers (RF and XG) on the mRNA dataset derived from TCGA and tested them on the GSE32894 dataset. The results demonstrate that XGBoost performs better than RF (Figure 2A4).
FIGURE 2
The intersection between the DEGs and the feature genes obtained by RF and XG was defined as DEFGs, which include 57 lncRNAs, 120 miRNAs, and 278 mRNAs. The Upset plot and heatmap plots for the DEFGs are shown in Figure 2B and Supplementary Figures S2D–F. The genetic and clinical information of the DEFGs is visualized in Figure 2C. GO enrichment analysis indicated that the differentially expressed feature mRNAs are enriched in adherens junction, cell-substrate junction, cell-cell junction, cell-substrate adherens junction, and focal adhesion (Figure 2D). These GO terms have been found to play roles in tumorigenesis and tumor progression by regulating T-cell signaling, innate immunity, TGF-β signaling, and Wnt signaling through post-translational modification.
Construction of Subtype-Related mRNA-miRNA-lncRNA Network
A miRNA-dependent mRNA-lncRNA co-expression network was constructed, consisting of 90 mRNAs, 22 miRNAs, and 14 lncRNAs (Figure 3A). The miRNA-dependent mRNA-lncRNA crosstalks verified in the miRWalk database comprise four miRNA-mediated mRNA-lncRNA interactions (Figure 3B). Specifically, two co-expressed lncRNA-mRNA pairs, AC010326.3-GATA3 and AC073335.2-GATA3, are positively regulated by miR-141-3p, whereas the lncRNA-mRNA pairs MIR100HG–CLIC4 and MIR100HG–PALLD are negatively regulated by miR-200c-3p and miR-141-5p, respectively. All nine genes in the network are differentially expressed between the two subtypes (Figure 3C). Compared with the luminal subtype, the basal subtype is characterized by lower expression of six genes (miR-200c-3p, miR-141-3p, miR-141-5p, GATA3, AC010326.3, and AC073335.2) and higher expression of the other three genes (MIR100HG, PALLD, and CLIC4), suggesting that all nine genes can be used as potential markers for the two MIBC subtypes. In addition, GO analysis showed that the mRNAs in the network (CLIC4, PALLD, and GATA3) are related to the cytoskeleton.
FIGURE 3
Survival Analysis of Crosstalk-Involved Genes
The association between the expression levels of crosstalk-involved genes and MIBC prognosis was analyzed by the K–M method. Strikingly, all of them are closely related to the prognosis of MIBC. Specifically, higher expression of miR-141-5p, miR-141-3p, AC010326.3, AC073335.2, miR-200c-3p, and GATA3 predicts better prognosis, indicating that they may function as tumor suppressors (Figures 4B–F,H). In contrast, higher expression of MIR100HG, PALLD, and CLIC4 is associated with worse prognosis, suggesting that they may play an oncogenic role (Figures 4A,G,I). In addition, the association between MIBC prognosis and the expression levels of crosstalk-related mRNAs (CLIC4, PALLD, and GATA3) was validated in two independent microarray datasets (GSE13507 and GSE31684), again supporting their prognostic value in MIBC (Supplementary Figure S5).
FIGURE 4
Discussion
In this study, we investigated miRNA-dependent mRNA-lncRNA interactions in the basal and luminal subtypes of MIBC using bioinformatics approaches. On the basis of MIBC mRNA, miRNA, and lncRNA expression datasets obtained from TCGA, 403 MIBC samples were reliably classified into two intrinsic molecular types, which resemble the basal and luminal subtypes identified previously. A number of subtype-related pathways were identified through GSEA. Moreover, we compared the subtype classification performance of tree-based machine learning algorithms and found that XGBoost outperforms the other classifiers. Additionally, we carried out a gene co-expression analysis on the DEFGs and identified subtype-specific mRNA-lncRNA crosstalks, which differ considerably between the basal and luminal subtypes and are closely related to the prognosis of MIBC.
The subtype-related pathways presented in this study (Tables 1, 2) are largely consistent with those identified previously. In general, pathways involved in EMT, metastasis, and immune system processes are upregulated in the basal subtype, whereas metabolism-related pathways are upregulated in the luminal subtype. The pathways enriched in the basal and luminal subtypes provide a biological explanation for their distinctly different clinical and pathological behaviors. However, the mechanisms by which some other pathways in our results, such as valine, leucine, and isoleucine degradation, autoimmune thyroid disease, hematopoietic cell lineage, and viral myocarditis, play a role in MIBC subtypes deserve further investigation.
Machine learning methods have been broadly applied in many areas of biology, such as gene family classification, hepatotoxicity prediction, RNA methylation prediction, and cancer prediction and classification (Zou et al., 2014). As suggested in previous studies, RF is a powerful classifier for gene expression data. XGBoost has featured in many winning Kaggle competition entries and has become a popular tool among data scientists. Recently, XGBoost has been successfully applied to many classification problems, such as pan-cancer classification and prediction of RNA-protein interactions. However, to our knowledge, RF and XGBoost have not previously been compared in terms of cancer classification. In this study, we compared the performance of DTs, RF, and XGBoost in classifying the basal and luminal subtypes. Our results clearly demonstrate the advantage of XGBoost in gene expression data-based cancer classification (Figure 2A).
Previous studies investigated MIBC-associated miRNAs and their target genes without considering the genetic heterogeneity of MIBC subtypes. It is therefore important to elucidate the subtype-related molecular pathways and identify novel biomarkers for MIBC subtypes. In this study, we systematically explored MIBC subtype-related gene co-expression networks. A total of three mRNAs (GATA3, CLIC4, and PALLD), three miRNAs (miR-200c-3p, miR-141-3p, and miR-141-5p), and three lncRNAs (AC010326.3, AC073335.2, and MIR100HG) were found in miRNA-mediated mRNA-lncRNA crosstalks. Their expression differs considerably between the basal and luminal subtypes (Figure 3) and is significantly associated with the prognosis of MIBC (Figure 4). It was previously observed that miR-141-5p, miR-141-3p, miR-200c-3p, and GATA3 are the most important markers of the luminal subtype, which is consistent with our results. In addition, previous studies found that the down-regulation of miR-200c and miR-141 is associated with elevated ZEB1, and that the down-regulation of miR-200c is also coupled with the down-regulation of BMI-1 and E2F3, which play an important role in the invasion, migration, and EMT of bladder cancer.
Some other genes in the crosstalk have also been shown to be closely related to cancer. For example, AC073335.2, a highly expressed lncRNA in human glioblastoma, is involved in tumorigenesis by acting as a competing endogenous RNA of miR-940. MIR100HG was previously reported to act as a regulator of hematopoiesis and oncogenes in many cancers. In agreement with our findings, MIR100HG was reported to be down-regulated in MIBC and may serve as a significant biomarker for MIBC. As reported previously, GATA3 is a prognostic marker and inhibits cell migration and invasion in MIBC. GATA3 is also differentially expressed between the basal and luminal subtypes and can be used as a luminal-infiltrated marker. CLIC4 has a complicated role in cancer: it functions as a tumor suppressor in lung adenocarcinomas, yet it promotes the metastasis and development of colorectal cancer. Previous studies have established that the expression of CLIC4 in MIBC has a subtype-dependent pattern, and that the overexpression of CLIC4 in stroma increases cell migration and invasion and promotes epithelial-to-mesenchymal transition in multiple human cancers. PALLD SNPs were reported to be a significant predictor of prostate cancer-specific mortality. Our findings are largely consistent with previously reported results, suggesting that crosstalk-implicated genes might be of great significance in MIBC pathogenesis and post-transcriptional gene regulation.
The combination of bioinformatics and several machine learning approaches in this study has achieved reliable results regarding MIBC subtype classification, subtype-associated pathways, and network-associated markers for MIBC subtypes. The subtype-related genes can not only be used for subtype classification but also serve as good predictors of cancer prognosis. Our study can be improved in the following respects in the future: (1) the crosstalks discovered through computational analyses need to be verified by biological experiments; (2) DEFGs were defined as the overlap between DEGs and feature genes that were determined by XGBoost based on rankings that approximate Information Gain, a procedure that may miss some highly correlated genes that are also biologically important.
Conclusion
By conducting bioinformatics analyses, we identified two subtypes of MIBC and lncRNA-mRNA crosstalks mediated by miR-200c and miR-141, which are found to be significantly associated with prognosis, formation, and metastasis of bladder cancer. Our results should be informative for molecular subtype classification, prognosis and molecule-targeted therapy of bladder cancer.
Statements
Author contributions
GjL and ZC performed the computations. MB and IT contributed to data preparation and analysis. GjL and GqL wrote the manuscript. GqL and ID conceived and designed the study.
Funding
This work was supported by grants from the National Natural Science Foundation of China (31660322), Inner Mongolia Natural Science Foundation of China (2018LH03023), IIP UB RAS project (No. AAAA-A18-118020590108-7), Science Foundation for Excellent Youth Scholars of Inner Mongolia University of Science and Technology (2016YQL06), and Act 211 Government of the Russian Federation (No. 02.A03.21.0006).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2018.00422/full#supplementary-material
FIGURE S1 The graphs show the evaluation output of ASW, CPCC, ΔK, and PAC. CoC datasets, represented by the green line, were used as the criteria to infer the optimal K. (A) ASW allows us to infer the optimal K from the highest ASW. (B) The optimal K according to CPCC is the K at which the magnitude of CPCC is very close to one. (C) The optimal K according to ΔK is the K value before the ‘elbow’ or the K at which Δ(K) reaches its maximum. (D) PAC allows us to infer the optimal K from the lowest PAC.
FIGURE S2 Volcano plots for DEGs and heatmap plots for DEFGs. (A–C) Volcano plots for the 4167 mRNAs, 208 miRNAs, and 2488 lncRNAs differentially expressed between tumor and normal samples (adjusted p-value < 0.05 and |log2 fold change| > 0.57). (D–F) Heatmap plots for 278 DEFmRNAs, 120 DEFmiRNAs, and 57 DEFlncRNAs. Basal, luminal, and normal samples are represented by the red, blue, and yellow bars, respectively.
FIGURE S3 Parameter selection and performance of RF and XG on the mRNA, miRNA, and lncRNA datasets. (A) The x-axis represents the value of mtry set for the RF classifier (1, 5, 10, 15, 20, 25). The y-axis represents the corresponding AUC. (B) The x-axis represents the value of ntree set for RF (20, 400, 600, 800). The y-axis represents the corresponding out-of-bag (OOB) error rates. The colors correspond to mtry values. (C) The x-axis represents the fold number for RF. The y-axis represents the corresponding accuracy. The red color shows the mean accuracy. (D) The x-axis represents the value of iter set for XG (1, 400, 800, 1200, 1600, 2000, 2400) and the y-axis represents the corresponding accuracy.
FIGURE S4 Heatmap and K–M plots for the basal and luminal subtypes of GSE32894. (A) Heatmap depicting the expression profiles of basal (up) and luminal (down) biomarkers in GSE32894. The yellow and turquoise colors correspond to high and low relative expression, respectively. (B) K–M plot for the overall 5-year survival of the basal and luminal subtypes (basal = 52, luminal = 62, p < 0.01).
FIGURE S5 Kaplan–Meier plots for CLIC4, PALLD, and GATA3 in GSE13507 and GSE31684. (A–C) K–M survival curves showing overall survival according to high and low expression of CLIC4, PALLD, and GATA3 in GSE13507. (D–F) K–M survival curves showing overall survival according to high and low expression of CLIC4, PALLD, GATA3, and MIR100HG in GSE31684.
Abbreviations
- ASW
Average of Silhouette Width
- CC
cellular component
- CC
consensus clustering
- CDF
cumulative density function
- CoC
cluster of cluster
- CPCC
cophenetic correlation coefficient
- DEFGs
differentially expressed feature genes
- DEGs
differentially expressed genes
- DTs
decision trees
- EMT
epithelial-mesenchymal transition
- FDR
false discovery rate
- GO-BP
Gene Ontology Biological Process
- GSEA
gene set enrichment analysis
- KEGG
Kyoto Encyclopedia of Genes and Genomes
- K–M curve
Kaplan–Meier curve
- MAD
median absolute deviation
- MF
molecular function
- MIBC
muscle-invasive bladder cancer
- NES
normalized enrichment scores
- PAC
proportion of ambiguous clustering
- RF
random forest
- ROC
receiver operating characteristic curve
- XGBoost
eXtreme Gradient Boosting
Summary
Keywords
muscle-invasive bladder cancer, subtypes, miR-200c, miR-141, random forest, XGBoost
Citation
Liu G, Chen Z, Danilova IG, Bolkov MA, Tuzankina IA and Liu G (2018) Identification of miR-200c and miR141-Mediated lncRNA-mRNA Crosstalks in Muscle-Invasive Bladder Cancer Subtypes. Front. Genet. 9:422. doi: 10.3389/fgene.2018.00422
Received
15 June 2018
Accepted
10 September 2018
Published
28 September 2018
Volume
9 - 2018
Edited by
Quan Zou, Tianjin University, China
Reviewed by
Chi Zhang, Indiana University Bloomington, United States; Qing Li, University of Utah, United States; Leyi Wei, Tianjin University, China
Copyright
© 2018 Liu, Chen, Danilova, Bolkov, Tuzankina and Liu.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Guoqing Liu, gqliu1010@163.com
This article was submitted to Bioinformatics and Computational Biology, a section of the journal Frontiers in Genetics