ORIGINAL RESEARCH article
Prognostic Biomarkers on a Competitive Endogenous RNA Network Reveals Overall Survival in Triple-Negative Breast Cancer
- 1Department of Oncology, Changzheng Hospital, Naval Medical University, Shanghai, China
- 2Liver Cancer Institute, Zhongshan Hospital, Fudan University, Shanghai, China
- 3Department of Gastroenterology, Changhai Hospital, Naval Medical University, Shanghai, China
The objective of this study was to construct a competitive endogenous RNA (ceRNA) regulatory network using differentially expressed long noncoding RNAs (lncRNAs), microRNAs (miRNAs), and mRNAs in patients with triple-negative breast cancer (TNBC) and to construct a prognostic model for predicting overall survival (OS) in patients with TNBC. Differentially expressed lncRNAs, miRNAs, and mRNAs in TNBC patients from the TCGA and Metabric databases were examined. A prognostic model based on prognostic scores (PSs) was established for predicting OS in TNBC patients, and the performance of the model was assessed by a recipient that operated on a distinctive curve. A total of 874 differentially expressed RNAs (DERs) were screened, among which 6 lncRNAs, 295 miRNAs and 573 mRNAs were utilized to construct targeted and coexpression ceRNA regulatory networks. Eight differentially expressed genes (DEGs) associated with survival prognosis, DBX2, MYH7, TARDBP, POU4F1, ABCB11, LHFPL5, TRHDE and TIMP4, were identified by multivariate Cox regression and then used to establish a prognostic model. Our study shows that the ceRNA network has a critical role in maintaining the aggressiveness of TNBC and provides comprehensive molecular-level insight for predicting individual mortality hazards for TNBC patients. Our data suggest that these prognostic mRNAs from the ceRNA network are promising therapeutic targets for clinical intervention.
Breast cancer is the most frequently diagnosed cancer and the second most common cause of cancer mortality in women worldwide (1). TNBC, as an aggressive subtype, accounts for 12-18% of all breast cancers (2). Because TNBC patients lack the oestrogen receptor (ER), progesterone receptor (PR) and HER2 receptors, they are not suitable for hormone or anti-HER2 therapy. Targeted therapies cannot significantly improve the survival rate of TNBC patients, and chemotherapy is still the standard treatment. Consequently, exploring the molecular biological mechanism affecting the prognosis of patients with TNBC and identifying reliable prognostic markers are very valuable for accelerating individual therapies and improving clinical prognoses.
Over the past decade, substantial efforts have been made to classify TNBC into distinct clinical and molecular subtypes to guide treatment decisions. The characterization of genomic, proteomic, epigenomic and microenvironmental changes has expanded our understanding of TNBC. High levels of somatic mutations, frequent TP53 mutations (83%) and complex aneuploidy rearrangement (80%) have been found in TNBC patients via deep sequencing studies (3), multiregion sequencing analyses (4) and single-cell sequencing research (5), revealing extensive intratumoural heterogeneity (ITH). However, the molecular mechanism driving TNBC relapse has not been fully elucidated.
The ceRNA hypothesis involves a specific molecular biological regulatory mechanism of posttranscriptional regulation (6). Studies have explored the mechanism of ceRNA biological regulation in TNBC patients. Yuan et al. (7) proposed a ceRNA crosstalk network in triple-negative breast cancer via integrative analysis of lncRNAs and miRNAs with coding RNAs. Song et al. (8) established a ceRNA topology network with five specific RNAs: hsa-miR-133a-3p, hsa-miR-1-3p, TRIML2, TERT and PHBP4. However, due to their complex formulas, unclear results, lack of external validation and other reasons, these prediction models do not predict prognostic effects very well. A prognostic model with a simple formula, easily interpretable results and strong external repeatability are very important for optimizing individualized treatment. Therefore, the purpose of this study was to develop and validate a nomogram for determining OS prognoses in patients with TNBC based on gene expression data confirmed by the ceRNA regulatory network and to screen potential therapeutic agents among existing small-molecule inhibitors.
Expression Profile Data Screening
The breast cancer expression data (including mRNAs, lncRNAs and miRNAs) were downloaded from The Cancer Genome Atlas (TCGA) database (https://gdc-portal.nci.nih.gov/), and the Illumina HiSeq 2000 RNA platform was used for sequencing. The downloaded data was TCGA TNBC RNA-seq level 3 with a normalized log (FPKM+ 1,2) that can be used for direct analysis. After comparing the clinical information of the downloaded samples (Table S1), the following breast cancer samples were retained: patients with known expression levels of mRNAs, lncRNAs and miRNAs; patients negative for ER, PR and HER-2; patients with complete pathological staging and follow-up data; and enrolled patients with TNBC primary disease and no other malignant tumours. A total of 102 samples and 63 control samples were obtained, and these samples were used as the training group. Moreover, the RNA-seq breast cancer data were downloaded from the Molecular Taxonomy of Breast Cancer International Consortium (Metabric) database (http://molonc.bccrc.ca/) and included a total of 1904 breast cancer patients with corresponding clinical information. Based on the above screening criteria, we selected 298 TNBC samples, and this data set was used as the validation group. All of the data sets utilized in this study were from public databases. Ethical approval was not required because researchers are allowed download and analyse data from these public databases for scientific purposes.
Data Preprocessing and Screening of Differential Genes
LncRNAs, miRNAs and mRNAs detected in the TCGA samples were annotated. Then, the R language package Limma (version 3.34.0, https://bioconductor.org/packages/release/bioc/html/limma.html) (9) was used to screen for differentially expressed RNAs (DERs) between the TNBC and control (CTRL) groups. The screening threshold was an false discovery rate (FDR) < 0.05 and a |log2FC| > 0.5. The R language package pheatmap (version 1.0.8, https://cran.r-project.org/package=pheatmap) (10) was used to perform bidimensional hierarchical clustering (11, 12) analysis based on the Euclidean distances (13) of the expression levels of the selected DERs, which are displayed in the heatmap.
Construction of a Targeted Regulatory ceRNA Network
We analysed the relationships of differentially expressed (DE) lncRNAs and miRNAs from the DIANA-LncBase database (14) (https://diana.e-ce.uth.gr/lncbasev3) and retained only the pairs with opposite expression trends. The StarBase database (15) (version 2.0, http://starbase.sysu.edu.cn/) was used to search the differentially expressed genes (DEGs) correlated with DEmiRNA regulation, and only the negatively correlated miRNA and mRNA pairs were retained. Based on the above results, a ceRNA regulatory network composed of DElncRNAs, DEmiRNAs, and DEmRNAs was constructed, and the network was visualized with Cytoscape (Version 3.6.1) (https://cytoscape.org/) (16). The regulatory network of ceRNA coexpression was used to calculate the Pearson correlation coefficients (PCCs) between the expression levels of DElncRNAs vs. DEmiRNAs, DElncRNAs vs. DEmRNAs and DEmiRNAs vs DEmRNAs by using the Cor function in R3.6.1 software (http://22.214.171.124/R-help/cor.test.html), and the pairs with significant correlation (P < 0.05) were selected. Finally, The Database for Annotation, Visualization and Integrated Discovery (DAVID) (version 6.8) (17, 18) (http://metascape.org/) was used to conduct Gene Ontology (GO) biological process and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the DEGs in the ceRNA regulatory network, and P < 0.05 was selected as the threshold for determining significant enrichment.
Screening of Prognostic-Related Genes
In the TCGA training set, based on the ceRNA regulatory network and the prognostic information obtained from the samples, univariate Cox analysis was used to screen mRNAs correlated with significant overall survival (OS) with R software (survival pack, version 2.41, http://bioconductor.org/packages/survivalr/). Multivariate Cox regression analysis was used to screen for mRNAs with significant independent prognostic correlation using the survival package (version 2.41-1) (19) of R3.6.1. A log-rank P value < 0.05 was selected as the threshold for determining significant prognostic mRNAs.
Establishment of the Prognostic Risk Prediction Model
Based on multifactor Cox regression analysis of mRNAs and their expression levels in the TCGA, we constructed the following prognostic score (PS) model:
Coef mRNAs represents the prognostic coefficients of mRNAs in the multivariate Cox regression analysis, and Exp mRNAs represents the expression levels of mRNAs in the TCGA dataset.
The t-test or Mann‐Whitney U test was used for comparison, and continuous variables are expressed as the mean ± standard deviation. The χ2 test or Fisher’s exact test was used to compare the categorical variables. The prognostic factors were computed using the Cox proportional hazards model, where HR was the 95% confidence interval. OS was defined as the time from the date of diagnosis to the date of last follow-up or death and was analysed by the log-rank test. Nomograms and ROC curves were used to evaluate the predictive performance of the prognostic model. All statistical analyses were computed using SPSS version 22.0 (IBM SPSS Statistics, Chicago, IL, US), and R software (version 3.5.2) with the following packages: “Limma”, “pheatmap”, “survival”, “rmda”, “cor”, “GOplot”, “ROC”, “rms”. P < 0.05 was considered statistically significant.
Data Preprocessing and Screening for Significantly DERs
To explore the significant prognostic correlate factors for triple-negative breast cancer, we selected the patients with triple-negative breast cancer from TCGA and Metabric data sets. There were 102 TNBC patients in the training group and 298 TNBC patients in the validation group. The clinical and demographic characteristics of all patients with TNBC are summarized in Table 1. According to the platform annotation information provided in the downloaded data, 14,000 mRNAs, 1778 lncRNAs and 2222 miRNAs were annotated in the data set. A total of 874 DERs were screened, among which 6 lncRNAs, 295 miRNAs and 573 mRNAs met the screening threshold criteria (Table S2). The inspection volcano distribution map is shown in Figure 1A. The bidirectional hierarchical clustering heat map based on DER expression is presented in Figure 1B. The above DERs were identified as candidate prognosis factors.
Figure 1 Differentially expressed RNAs, including lncRNAs, miRNAs and mRNAs, from the TCGA dataset. (A) The volcano map was constructed. The blue and red dots indicate significantly downregulated and upregulated differentially expressed RNAs, respectively. The horizontal dotted line indicates an FDR < 0.05, and the two vertical dashed lines indicate a | log2FC | > 0.5. (B) Bidirectional hierarchical clustering heat map based on the levels of differentially expressed RNAs; the black and white sample bars represent TNBC and control peritumoral tissue patients, respectively.
Construction of the ceRNA Networks and Functional Analysis
To further explore the biological functions and involvement in signaling pathways of DERs, we constructed the ceRNA networks from DERs including lncRNAs, miRNAs and mRNAs. The ceRNA regulatory networks were constructed in two ways: targeted and coexpression regulation. From the binding relationships between DElncRNAs and DEmiRNAs screened in the above step based on the DIANA-LncBasev2 database, we retained only the pairs exhibiting opposite expression trends, yielding total 19 pairs (Table S3). Then, we searched the target genes regulated by DEmiRNAs by using the StarBase database and matched the significant DEGs obtained in the first step. Moreover, only pairs with negatively correlated miRNA and mRNA expression levels were retained, yielding 50 total pairs. A targeted ceRNA network composed of lncRNAs, miRNAs, and mRNAs was constructed after synthesizing and collating DElncRNAs, DEmiRNAs, and DEmiRNA-DEmRNAs (Figure 2A). Finally, enrichment annotation analyses of GO biological processes and KEGG signalling pathways associated with the mRNAs involved in the ceRNA regulatory network were carried out based on DAVID. We identified 9 significantly related GO biological processes, including neurohypophysis development, synapse assembly, cell differentiation, hypothalamus cell differentiation, regulation of cytokine biosynthetic process, neuron fate specification, positive regulation of filopodium assembly, positive regulation of transcription from RNA polymerase II promoter and transcription, and 4 KEGG signalling pathways, including the ErbB signalling pathway, FoxO signalling pathway, and signalling pathways regulating stem cell pluripotency and proteoglycans in cancer (Figure 2B).
Figure 2 Targeted ceRNA regulatory network. (A) The mutual targeting relationships among lncRNAs, miRNAs and mRNAs. The colour change from green to red indicates a change from low to high. The red and grey lines indicate the DElncRNA-DEmiRNA associations and the DEmiRNA-DEmRNA regulatory associations, respectively. (B) Bar diagram of the GO biological processes and KEGG signalling pathways significantly correlated with mRNAs in the ceRNA network. The horizontal axis shows the number of genes. The vertical axis shows the entry names. Entries closer to a red colour are more significant.
Then, we used the Cor function in R to calculate the PCCs between DElncRNAs and DEmiRNAs, DElncRNAs and DEmRNAs and DEmiRNAs and DEmRNAs and selected the significantly correlated pairs (PCC > 0.4, P < 0.05). Finally, we obtained 69 pairs between DElncRNAs and DEmiRNAs, 158 pairs between DElncRNAs and DEmRNAs and 369 pairs between DEmiRNAs and DEmRNAs (Table S4). The three coexpression relationships were integrated to construct a comprehensive coexpression ceRNA network (Figure 3A). Next, a total of 21 significantly related GO biological processes and 6 KEGG signalling pathways were identified by DAVID based on enrichment annotation analysis of mRNAs in the coexpression regulatory network (Figure 3B). The top five GO terms related to miRNAs in the coexpression ceRNA network were mainly enriched in proteolysis, regulation of apoptotic processes, responses to lipopolysaccharide, chemical synaptic transmission and cell-cell signalling. The KEGG pathways related to these mRNAs were mainly enriched in ABC transporters, the PPAR signalling pathway, proximal tubule bicarbonate reclamation, neuroactive ligand-receptor interaction, adrenergic signalling in cardiomyocytes and glycolysis/gluconeogenesis.
Figure 3 Coexpression ceRNA regulatory network. (A) The coexpression relationships among lncRNAs, miRNAs and mRNAs. The colour change from green to red indicates a change from low to high. (B) Bar diagram of the GO biological processes and KEGG signalling pathways significantly correlated with mRNAs in the ceRNA network. The horizontal axis depicts the number of genes. The vertical axis shows the entry names. Entries closer to a red colour are more significant.
Screening for DEGs Associated With Survival Prognosis
According to 102 TNBC samples in the TCGA training set, a total of 144 mRNAs were identified based on the mRNAs contained in the two ceRNA regulatory networks. Then, 12 DEGs that were significantly associated with survival prognosis were identified by univariate Cox regression analysis (Table S5). Furthermore, multivariate Cox regression analysis was used to identify 8 DEGs associated with survival prognosis, including DBX2, MYH7, TARDBP, POU4F1, ABCB11, LHFPL5, TRHDE and TIMP4 (Table 2). Further, we found that ABCB11 involved in ABC transporters pathways and MYH7 involved in Adrenergic signaling in cardiomyocytes pathways. In GO, MYH7 participated in cardiac muscle contraction, POU4F1 participated in synapse assembly and neuron fate specification, TRHDE participated in regulation of blood pressure, proteolysis and cell-cell signaling, and TIMP4 participated in response to lipopolysaccharide. Then, we calculated the PS scores of the 8 DEGs and divided each gene into a high PS score group (PS score higher than or equal to the median PS score) and a low PS score group (PS score lower than the median PS score) according to the median PS score. Survival curves and log-rank tests were used to analyse the prognosis of each gene in patients (Figures 4A–H). The results showed that the PS scores of all 8 DEGs were significantly associated with survival prognosis. Therefore, a nomogram based on the prognostic factors that combined the 8 DEGs correlated with overall survival is presented in Figure 4I, indicating that the 8 DEGs was able to be an accurate predictor of survival in patients with TNBC.
Figure 4 Establishment of a prognostic model for TNBC patients. (A–H) Survival curves of patients with TNBC, divided into low-risk and high-risk groups, carrying known prognostic genes. (I) Nomogram for the overall survival of TNBC patients.
Evaluation of Effectiveness and Comparison of the Prognostic Models
Based on the prognostic coefficients of the 8 DEGs from the multivariate Cox regression analysis and their expression levels in the TCGA training set obtained in the previous step, we constructed a prognostic score (PS) model and divided all samples from the TGCA training set into a high risk group (PS score equal to or higher than the PS median value) and a low risk group (PS score lower than the PS median value). Moreover, the corresponding mRNA expression levels were extracted from the Metabric validation dataset. Then, we calculated the PS score of each sample and divided them into a high-risk group and a low-risk group. The survival curve and log-rank test showed a predictability of the PS model groups for actual prognostic information for the disease (Figures 5A, B and Table S6). The survival rate of the low-risk group was significantly higher than that of the high-risk group in the TGCA training set (P < 0.01, HR = 10.06) and the Metabric validation set (P = 0.02, HR = 1.437). However, we test this PS model in non-TNBC cases and found that it was not associated with non-TNBC cases prognosis and did not work for non-TNBC (Figures 5C, D). Then, we applied ROC curves to assess the precision of the prognostic model. Similar to the nomogram, ROC curve analysis at 1, 3 and 5 years revealed area under the curve (AUC) values of 0.868, 0.911 and 0.860 in the TGCA training set and 0.748, 0.759 and 0.723 in the Metabric validation set, respectively (Figures 5E, F). In addition, the calibration curve showed that the nomogram-predicted and actual probability of survival at 3 and 5 years were in good agreement (Figures 5G, H). In brief, on the basis of PS model from 8 DEGs, we develop a predictive model for clinical application, which is accurate and effective for TNBC patients.
Figure 5 Validation of the prognostic model. (A, B) Kaplan-Meier curve based on the PS prediction model and prognostic correlation in TNBC groups. (C, D) Kaplan-Meier curve based on the PS prediction model and prognostic correlation in non-TNBC groups. (E, F) The ROC curve of the PS forecast model, with the black, red and blue curves representing the 1-year, 3-year and 5-year curves, respectively. (G, H) Three- and 5-year calibration curves based on the PS forecast models.
Correlation of the Prognostic Model With Clinical Factors
To further analysis the accuracy and application of the PS model compared with other clinical prognosis factors, multivariate Cox regression analyses were firstly carried out to evaluate the survival of TNBC patients using the prognostic model. The multivariate Cox regression analyses showed that pathological stage and prognostic model were independent prognostic factors in the TCGA training dataset. In the Metabric validation dataset, multivariate Cox regression analyses identified the patient’s age, pathologic stage and prognostic model as independent prognostic factors for overall survival (Table 3). Therefore, we performed decision curve analysis (DCA) of a single clinical factor, the PS model and clinical factors in combination with the PS model in the TCGA data set to compare the net effects of each variable on survival prognosis. As shown in Figure 6A, the prognostic model (red line) yielded a higher net profit than age (green line) and the pathological stage (blue line). DCA implied that the prognostic model combined with clinical factors was more beneficial that both a single clinical factor and the PS model by itself for forecasting the survival rate. Further subgroup analysis showed that the pathological stage was a significant factor influencing the prognosis of TNBC patients. As illustrated in Figure 6B, the survival rate in the high-risk group was obviously worse than that in the low-risk group, suggesting that the prognostic model was trustworthy and exhibited a steady predictive performance at distinct pathological subgroup stages. We performed correlation analyses of the 8 DEGs and the prognostic signature with TNBC clinical factors. The outcomes implied that the prognostic signature was considerably correlated with recurrence and the pathological stage (Figure 6C and Table S7).
Figure 6 Effectiveness and comparison of the prognostic model. (A) Decision curve analysis. The green, blue, red and purple colours represent the decision curves for the patient’s age, pathological stage, PS model and combined model, respectively. The grey and black curves show the two extremes. (B) Kaplan-Meier curves were analysed by pathological stage stratification in the TCGA training set and Metabric validation set. (C) Heat map depicting the significance of correlations between clinical characteristics and 8 factors.
Construction of a Network for Small-Molecule Drugs and Characteristic Factors
To explore the underlying treatment value of the PS model from 8 DEGs, we constructed the network of small-molecule drugs targeted 8 DEGs and predicted the possible binding sites. All the relationship data for genes and small-molecule drugs were downloaded from the CTD database, and 12 pairs were obtained from the 8 DEGs selected for construction of the PS model (Figure 7A and Table S8). Further, the docking possibilities between the target proteins and small molecules were assessed using AutoDock software (20). The results revealed 9 pairs from 3 DEGs and 8 small molecules (Table 4). Then, we analysed the local binding sites through PLIP online tools (https://projects.biotec.tu-dresden.de/plip-web/plip/index) and drew 3D structure simulations using PyMOL (version 2.4) (Figures 7B–J). According to the above analysis, the combination of fasudil and MYH7 was the best based on the affinity value (-8.2). Moreover, our results also provide a reference for developing clinical drugs according to characteristic factors.
Figure 7 Correlation of the PS model characteristic features with small-molecule inhibitors. (A) Network of the associations of characteristic factors with small-molecule drugs. The circles and diamonds represent genes and drug molecules, respectively. (B–J) Molecular docking global diagram. Fasudil binding to MYH7 (B), trichostatin A binding to MYH7 (C), pirinixic acid binding to ABCB11 (D), heliotrine binding to ABCB11 (E), boldine binding to ABCB11 (F), enilconazole binding to ABCB11 (G), nimesulide binding to TARDBP (H), pirinixic acid binding to TARDBP (I) and trichostatin A binding to TARDBP (J) are shown.
Noncoding RNAs (ncRNAs), which account for the majority (98%) of the transcriptome, are defined as gene transcripts with important biological functions (21). Among them, lncRNAs are one of the most important challenges biologists face today and represent potential new biomarkers and drug targets. The hypothesis of Salmena et al. (22) regarding ceRNA networks of multiple molecules involved in posttranscriptional regulation has received extensive attention in recent years. Their research showed that lncRNAs can act as ceRNAs to inhibit the function of miRNAs and communicate with mRNAs by carrying one or more miRNA response elements (MREs). However, the molecular biological mechanisms and effective regulatory networks affecting the progression and prognosis of TNBC remain unclear. It is important and helpful to optimize individualized treatments by exploring the molecular biological regulatory network and prognostic biomarkers of TNBC.
In this work, we used bioinformatic analyses to establish a new ceRNA regulatory network in TNBC that was based on 6 lncRNAs, 295 miRNAs, and 573 mRNAs. The ceRNA regulatory network was constructed in two ways: targeted regulation and coexpression regulation. A total of 144 mRNA nodes were found in the network. Among them, 12 mRNAs were significantly related to survival prognosis as determined by univariate Cox regression analysis, and an 8-mRNA prognostic nomogram was finally obtained via further multivariate Cox regression analysis. The 8-mRNA prognostic nomogram was helpful for identifying TNBC patients with a low survival probability. The OS rate in the high-risk group was significantly worse than that in the low-risk group in both the model and validation groups.
Many studies have explored prognostic biomarkers and possible molecular regulatory pathways in breast cancer, but studies on ncRNA-related cancers are commonly concentrated on miRNAs, lncRNAs and circRNAs (23–25). A corresponding lncRNA-miRNA-mRNA prognostic model of OS in TNBC patients has not been constructed. Jiang et al. (26) developed an integrated mRNA-lncRNA signature that effectively classifies TNBC patients into groups at low and high risk of disease recurrence, but the model is susceptible to the inherent biases of the study format and to the biases of the individuals performing the diagnoses and laboratory experiments. Therefore, we identified prognostic mRNA biomarkers and established a prognostic model based on a TCGA data set, while external verification was carried out using an independent external verification data set from the Metabric database. In this study, 8 potential biomarkers were found to be closely related to the OS of TNBC patients. Through the ceRNA regulatory network, potential lncRNA-miRNA-mRNA regulatory pathways are shown, which may help to clarify the potential biological mechanisms and regulatory pathways related to the OS of patients with TNBC. GO and KEGG pathway enrichments were further assessed to determine the potential molecular mechanism underlying the OS of TNBC patients.
Changes in TIMP4 expression have been reported to contribute to the development of breast cancer (27). In fact, TIMP4 is involved in several processes, including cell invasion and migration, cell proliferation and apoptosis, and angiogenesis (28). It can regulate the proteolytic activity of MMPS and is associated with ECM remodelling, EMT progression, and cancer initiation and progression by directly affecting cellular adhesion (29, 30). Different studies have associated high serum levels of MMPS and TIMPS with poor prognosis (31), and these factors were specifically identified as predictors of adverse outcomes in patients with breast cancer (32, 33). POU4F1 is a member of the POU domain family of transcription factors and plays a key role in regulating cancers. Studies have shown that POU4F1 expression is dramatically increased in breast cancer cells. POU4F1 deletion substantially downregulated the MEK1/2 and ERK1/2 signalling pathways in cancer cells (34, 35). TARDBP was originally considered to be an RNA/DNA binding protein and a regulator of HIV-1 gene expression. Increasing evidence shows that TARDBP may be involved in cell division, apoptosis and microRNA (miRNA) biogenesis (36). It is considered a powerful prognostic indicator of survival in breast cancer (37). In addition, ABCB11 mutations are prevalent in the DNA of patients with primary breast cancers and are considered to be associated with tumour prognosis (38). The results of all of these studies are compatible with the biomarker predictions in our study.
There are a few reports on the effects of other factors, such as DBX2, MYH7, TRHDE and LHFPL5, on TNBC. However, these factors are mainly studied in the context of other cancers. DBX2, a hypermethylated gene in the ctDNA of HCC patients, was identified as a potential biomarker and shows great promise for liquid biopsy applications in the future (39). It is significantly upregulated in HCC tissues and plays significant roles in the proliferation and metastasis of HCC cells by activating the Shh pathway (40). MHY7 mutations may play an important role in EBV-associated intrahepatic cholangiocarcinoma (41). TRHDE is reported to be a DNA methylation marker of oral precancer progression (42). Overexpression of the long noncoding RNA TRHDE-AS1 was shown to inhibit the progression of lung cancer via the miRNA-103/KLF4 axis (43). LHFPL5 is a member of the lipoma HMGIC fusion partner (LHFP) gene family and can cause deafness in humans. It is proposed to function in hair bundle morphogenesis. Studies have indicated that these biomarkers play important roles in the occurrence and development of tumours. In addition, our study found that some drugs could target these prognostic biomarkers by AutoDock software, including MYH7 targeted by fasudil and trichostatin A, ABCB11 targeted by boldine, pirinixic acid, enilconazole and heliotrine, and TARDBP targeted by trichostatin A, pirinixic acid and nimesulide. Previous studies showed that fasudil, as a ROCK inhibitor, could control tissue mechanics via regulation of Capzb in liver homeostasis (44). Trichostatin A is A histone deacetylase inhibitor with anti-breast cancer activity (45). Besides, Bodine has hepatoprotective, cell-protective, antipyretic and anti-inflammatory effects by antagonizing 5-HT3 receptors, and also against glioma cell lines (46, 47). Pirinixic acid inhibit 5- lipoxygenase activity and reduce vascular remodelling in the cardiovascular system (48). Some researches also found that enilconazole inhibited metastatic colorectal cancer through inhibition of TGF-β (49). What is more, nimesulide inhibited lung tumor growth through selective cyclocxygenase-2 (50). Therefore, it is necessary to perform experimental studies to elucidate the relevant pathogenesis and biological regulatory pathways in TNBC. Further studies need to be performed.
The current research has the following advantages. First, the prognostic nomograms provide a convenient way to estimate mortality at different time points. Second, a free network calculator of TNBC patient OS was developed and maintained to facilitate the assessment of individual mortality. Finally, as a noninvasive predictive tool, predictive nomograms provide an alternative for TNBC patients who cannot tolerate or do not want to undergo surgery.
Additionally, this study does have some limitations. First, the current study used gene expression data from the gene chip detection platform and lacks basic experimental verification. Second, the research data were generated on testing platforms, thereby affecting the repeatability of the results in different populations. Third, the model and validation data sets did not contain detailed research information, such as the drug treatment regimens and other postoperative treatments, which may affect the treatment efficacy and clinical prognosis.
Briefly, our research revealed potential molecular biological regulatory pathways and prognostic biomarkers through a ceRNA regulatory network. In particular, our prognostic model based on mRNAs in the ceRNA network showed a substantial ability to improve the prediction of OS in TNBC patients.
Data Availability Statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.
Ethical approval was not required because researchers are allowed to download and analyse data from these databases for scientific purposes.
FQ, Y-SZ, PL and WQ designed the study, analysed the data and wrote the manuscript. FQ and JL collected the follow-up information and performed the clinical data analysis. FQ, JL, and WQ performed the bioinformatics analysis. All authors contributed to the article and approved the submitted version.
This study was supported by the National Natural Science Foundation of China (grant nos. 81772590 and 81572395).
Conflict of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2021.681946/full#supplementary-material
Supplementary Table 1 | Clinical information of the samples obtained from the TCGA and Metabric databases.
Supplementary Table 2 | DERs obtained from meeting the threshold conditions.
Supplementary Table 3 | Connection relationship of targeted ceRNA regulatory network.
Supplementary Table 4 | Connection relationship of coexpression ceRNA regulatory network.
Supplementary Table 5 | mRNAs in ceRNA network associated with survival prognosis.
Supplementary Table 6 | PS score and grouping in training data set and verification data set.
Supplementary Table 7 | Correlation coefficient and p value between clinical characteristics and 8 factors.
Supplementary Table 8 | Correlation of the PS model characteristic features with small-molecule inhibitors from CTD database.
TCGA, The Cancer Genome Atlas; GEO, Gene Expression Omnibus; OS, overall survival; ROC-curve, Receiver operating characteristic curve; TNBC, triple-negative breast cancer; lncRNAs, long noncoding RNAs; DERs, differentially expressed RNAs; ceRNA, competitive endogenous RNA.
1. Garrido-Castro AC, Lin NU, Polyak K. Insights Into Molecular Classifications of Triple-Negative Breast Cancer: Improving Patient Selection for Treatment. Cancer Discov (2019) 9(2):176–98. doi: 10.1158/2159-8290.Cd-18-1177
3. Balko JM, Giltnane JM, Wang K, Schwarz LJ, Young CD, Cook RS, et al. Molecular Profiling of the Residual Disease of Triple-Negative Breast Cancers After Neoadjuvant Chemotherapy Identifies Actionable Therapeutic Targets. Cancer Discov (2014) 4(2):232–45. doi: 10.1158/2159-8290.Cd-13-0286
4. Yates LR, Gerstung M, Knappskog S, Desmedt C, Gundem G, Van Loo P, et al. Subclonal Diversification of Primary Breast Cancer Revealed by Multiregion Sequencing. Nat Med (2015) 21(7):751–9. doi: 10.1038/nm.3886
7. Yuan N, Zhang G, Bie F, Ma M, Ma Y, Jiang X, et al. Integrative Analysis of lncRNAs and miRNAs With Coding RNAs Associated With ceRNA Crosstalk Network in Triple Negative Breast Cancer. Oncol Targets Ther (2017) 10:5883–97. doi: 10.2147/ott.S149308
8. Song X, Zhang C, Liu Z, Liu Q, He K, Yu Z. Characterization of ceRNA Network to Reveal Potential Prognostic Biomarkers in Triple-Negative Breast Cancer. PeerJ (2019) 7:e7522. doi: 10.7717/peerj.7522
9. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. Limma Powers Differential Expression Analyses for RNA-sequencing and Microarray Studies. Nucleic Acids Res (2015) 43(7):e47. doi: 10.1093/nar/gkv007
10. Wang L, Cao C, Ma Q, Zeng Q, Wang H, Cheng Z, et al. RNA-Seq Analyses of Multiple Meristems of Soybean: Novel and Alternative Transcripts, Evolutionary and Functional Implications. BMC Plant Biol (2014) 14:169. doi: 10.1186/1471-2229-14-169
11. Serrao M, Chini G, Bergantino M, Sarnari D, Casali C, Conte C, et al. Identification of Specific Gait Patterns in Patients With Cerebellar Ataxia, Spastic Paraplegia, and Parkinson’s Disease: A non-Hierarchical Cluster Analysis. Hum Mov Sci (2018) 57:267–79. doi: 10.1016/j.humov.2017.09.005
13. Abbas A, Yadav V, Smith E, Ramjas E, Rutter SB, Benavidez C, et al. Computer Vision-Based Assessment of Motor Functioning in Schizophrenia: Use of Smartphones for Remote Measurement of Schizophrenia Symptomatology. Digit Biomark (2021) 5(1):29–36. doi: 10.1159/000512383
14. Paraskevopoulou MD, Vlachos IS, Karagkouni D, Georgakilas G, Kanellos I, Vergoulis T, et al. Diana-LncBase v2: Indexing microRNA Targets on non-Coding Transcripts. Nucleic Acids Res (2016) 44(D1):D231–8. doi: 10.1093/nar/gkv1270
15. Li JH, Liu S, Zhou H, Qu LH, Yang JH. starBase v2.0: Decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA Interaction Networks From Large-Scale CLIP-Seq Data. Nucleic Acids Res (2014) 42(Database issue):D92–7. doi: 10.1093/nar/gkt1248
16. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res (2003) 13(11):2498–504. doi: 10.1101/gr.1239303
18. Huang da W, Sherman BT, Lempicki RA. Bioinformatics Enrichment Tools: Paths Toward the Comprehensive Functional Analysis of Large Gene Lists. Nucleic Acids Res (2009) 37(1):1–13. doi: 10.1093/nar/gkn923
19. Wang P, Wang Y, Hang B, Zou X, Mao JH. A Novel Gene Expression-Based Prognostic Scoring System to Predict Survival in Gastric Cancer. Oncotarget (2016) 7(34):55343–51. doi: 10.18632/oncotarget.10533
20. Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, et al. AutoDock4 and AutoDockTools4: Automated Docking With Selective Receptor Flexibility. J Comput Chem (2009) 30(16):2785–91. doi: 10.1002/jcc.21256
24. Chen B, Wei W, Huang X, Xie X, Kong Y, Dai D, et al. circEPSTI1 as a Prognostic Marker and Mediator of Triple-Negative Breast Cancer Progression. Theranostics (2018) 8(14):4003–15. doi: 10.7150/thno.24106
26. Jiang YZ, Liu YR, Xu XE, Jin X, Hu X, Yu KD, et al. Transcriptome Analysis of Triple-Negative Breast Cancer Reveals an Integrated Mrna-Lncrna Signature With Predictive and Prognostic Value. Cancer Res (2016) 76(8):2105–14. doi: 10.1158/0008-5472.Can-15-3284
28. Glazewska EK, Niczyporuk M, Lawicki S, Szmitkowski M, Zajkowska M, Donejko M, et al. ROC Analysis of Selected Matrix Metalloproteinases (MMPs) and Tissue Inhibitors of Metalloproteinases (TIMPs) in Psoriatic Patients. Postepy Dermatol Alergol (2018) 35(2):167–73. doi: 10.5114/ada.2018.75238
30. Nistico P, Bissell MJ, Radisky DC. Epithelial-Mesenchymal Transition: General Principles and Pathological Relevance With Special Emphasis on the Role of Matrix Metalloproteinases. Cold Spring Harb Perspect Biol (2012) 4(2):a011908. doi: 10.1101/cshperspect.a011908
31. Brew K, Nagase H. The Tissue Inhibitors of Metalloproteinases (Timps): An Ancient Family With Structural and Functional Diversity. Biochim Biophys Acta (2010) 1803(1):55–71. doi: 10.1016/j.bbamcr.2010.01.003
33. Curtis C, Shah SP, Chin SF, Turashvili G, Rueda OM, Dunning MJ, et al. The Genomic and Transcriptomic Architecture of 2,000 Breast Tumours Reveals Novel Subgroups. Nature (2012) 486(7403):346–52. doi: 10.1038/nature10983
34. Liu L, Yue Q, Ma J, Liu Y, Zhao T, Guo W, et al. POU4F1 Promotes the Resistance of Melanoma to BRAF Inhibitors Through MEK/ERK Pathway Activation and MITF Up-Regulation. Cell Death Dis (2020) 11(6):451. doi: 10.1038/s41419-020-2662-2
35. Wu D, Jia HY, Wei N, Li SJ. POU4F1 Confers Trastuzumab Resistance in HER2-positive Breast Cancer Through Regulating ERK1/2 Signaling Pathway. Biochem Biophys Res Commun (2020) 533(3):533–9. doi: 10.1016/j.bbrc.2020.09.003
37. Kim PY, Tan O, Liu B, Trahair T, Liu T, Haber M, et al. High TDP43 Expression is Required for TRIM16-induced Inhibition of Cancer Cell Growth and Correlated With Good Prognosis of Neuroblastoma and Breast Cancer Patients. Cancer Lett (2016) 374(2):315–23. doi: 10.1016/j.canlet.2016.02.021
38. Shah SP, Morin RD, Khattra J, Prentice L, Pugh T, Burleigh A, et al. Mutational Evolution in a Lobular Breast Tumour Profiled at Single Nucleotide Resolution. Nature (2009) 461(7265):809–13. doi: 10.1038/nature08489
39. Wu X, Li J, Gassa A, Buchner D, Alakus H, Dong Q, et al. Circulating Tumor DNA as an Emerging Liquid Biopsy Biomarker for Early Diagnosis and Therapeutic Monitoring in Hepatocellular Carcinoma. Int J Biol Sci (2020) 16(9):1551–62. doi: 10.7150/ijbs.44024
40. Fitzmaurice C, Abate D, Abbasi N, Abbastabar H, Abd-Allah F, Abdel-Rahman O, et al. Global, Regional, and National Cancer Incidence, Mortality, Years of Life Lost, Years Lived With Disability, and Disability-Adjusted Life-Years for 29 Cancer Groups, 1990 to 2017: A Systematic Analysis for the Global Burden of Disease Study. JAMA Oncol (2019) 5(12):1749–68. doi: 10.1001/jamaoncol.2019.2996
41. Huang YH, Zhang CZ, Huang QS, Yeong J, Wang F, Yang X, et al. Clinicopathologic Features, Tumor Immune Microenvironment and Genomic Landscape of Epstein-Barr Virus-Associated Intrahepatic Cholangiocarcinoma. J Hepatol (2020) 74(4):838–49. doi: 10.1016/j.jhep.2020.10.037
42. Shridhar K, Walia GK, Aggarwal A, Gulati S, Geetha AV, Prabhakaran D, et al. DNA Methylation Markers for Oral Pre-Cancer Progression: A Critical Review. Oral Oncol (2016) 53:1–9. doi: 10.1016/j.oraloncology.2015.11.012
43. Zhuan B, Lu Y, Chen Q, Zhao X, Li P, Yuan Q, et al. Overexpression of the Long Noncoding RNA Trhde-AS1 Inhibits the Progression of Lung Cancer Via the miRNA-103/KLF4 Axis. J Cell Biochem (2019) 120(10):17616–24. doi: 10.1002/jcb.29029
44. Pocaterra A, Santinon G, Romani P, Brian I, Dimitracopoulos A, Ghisleni A, et al. F-Actin Dynamics Regulates Mammalian Organ Growth and Cell Fate Maintenance. J Hepatol (2019) 71(1):130–42. doi: 10.1016/j.jhep.2019.02.022
45. Vigushin DM, Ali S, Pace PE, Mirsaidi N, Ito K, Adcock I, et al. Trichostatin A is a Histone Deacetylase Inhibitor With Potent Antitumor Activity Against Breast Cancer In Vivo. Clin Cancer Res (2001) 7(4):971–6.
46. Gerhardt D, Horn AP, Gaelzer MM, Frozza RL, Delgado-Cañedo A, Pelegrini AL, et al. Boldine: A Potential New Antiproliferative Drug Against Glioma Cell Lines. Invest New Drugs (2009) 27(6):517–25. doi: 10.1007/s10637-008-9203-7
47. Kazemi Noureini S, Kheirabadi M, Masoumi F, Khosrogerdi F, Zarei Y, Suárez-Rozas C, et al. Telomerase Inhibition by a New Synthetic Derivative of the Aporphine Alkaloid Boldine. Int J Mol Sci (2018) 19(4):1239. doi: 10.3390/ijms19041239
48. Revermann M, Mieth A, Popescu L, Paulke A, Wurglics M, Pellowska M, et al. A Pirinixic Acid Derivative (LP105) Inhibits Murine 5-Lipoxygenase Activity and Attenuates Vascular Remodelling in a Murine Model of Aortic Aneurysm. Br J Pharmacol (2011) 163(8):1721–32. doi: 10.1111/j.1476-5381.2011.01321.x
49. van Noort V, Schölch S, Iskar M, Zeller G, Ostertag K, Schweitzer C, et al. Novel Drug Candidates for the Treatment of Metastatic Colorectal Cancer Through Global Inverse Gene-Expression Profiling. Cancer Res (2014) 74(20):5690–9. doi: 10.1158/0008-5472.Can-13-3540
50. Shaik MS, Chatterjee A, Singh M. Effect of a Selective Cyclooxygenase-2 Inhibitor, Nimesulide, on the Growth of Lung Tumors and Their Expression of Cyclooxygenase-2 and Peroxisome Proliferator- Activated Receptor-Gamma. Clin Cancer Res (2004) 10(4):1521–9. doi: 10.1158/1078-0432.ccr-0902-03
Keywords: competitive endogenous RNA, triple-negative breast cancer, prognostic model, differentially expressed genes, nomogram
Citation: Qin W, Qi F, Li J, Li P and Zang Y-S (2021) Prognostic Biomarkers on a Competitive Endogenous RNA Network Reveals Overall Survival in Triple-Negative Breast Cancer. Front. Oncol. 11:681946. doi: 10.3389/fonc.2021.681946
Received: 17 March 2021; Accepted: 20 May 2021;
Published: 11 June 2021.
Edited by:Wenwen Zhang, Nanjing Medical University, China
Reviewed by:Lorenzo Sempere, Michigan State University, United States
Feifei Pu, Huazhong University of Science and Technology, China
Copyright © 2021 Qin, Qi, Li, Li and Zang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
†These authors have contributed equally to this work