Original Research ARTICLE
Identification of Breast Cancer subtype specific MicroRNAs using Survival Analysis to find their role in transcriptomic regulation
- 1Centre of New Technologies, University of Warsaw, Poland
- 2College of Inter-Faculty Individual Studies in Mathematics and Natural Sciences, University of Warsaw, Poland
- 3Department of Computer Science and Engineering, National Institute of Technical Teachers' Training and Research, India
- 4Department of Computer Science & Engineering, Jadavpur University, India
- 5University of Warsaw, Poland
- 6Faculty of Mathematics and Information Science, Warsaw University of Technology, Poland
MicroRNAs (miRNAs) play a significant role in the development of breast cancer, and their expression profiles differ between breast cancer subtypes.
Our goal is therefore to integrate high-throughput miRNA expression data obtained by Next Generation Sequencing with clinical data in a survival analysis, in order to identify breast cancer subtype-specific miRNAs and to analyze their associated genes and transcription factors.
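As an illustration of the kind of survival analysis referred to here, the following is a minimal sketch of the Kaplan–Meier product-limit estimator, which is listed among the keywords of this article. It is not the authors' pipeline: the function name `kaplan_meier` and the toy follow-up data are assumptions made purely for illustration.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time of each patient
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    # Only times at which at least one event was observed change the curve.
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Observed events exactly at time t.
        observed = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1.0 - observed / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy cohort: five patients, two of them censored.
toy_durations = [5, 8, 8, 12, 15]
toy_events = [1, 1, 0, 1, 0]
km_curve = kaplan_meier(toy_durations, toy_events)
```

In a study of this kind, such a curve would typically be computed separately for patients with high and low expression of a given miRNA, and the two groups would then be compared.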
We select the top 100 miRNAs for each of the four subtypes based on hazard ratio and p-value, and then identify 44 miRNAs that are related to all four subtypes, which we call 4-star miRNAs.
In addition, we identify 12, 14, 9 and 15 subtype-specific miRNAs, which we call 1-star miRNAs.
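The 4-star and 1-star definitions can be read as simple set operations on the per-subtype top lists. The sketch below assumes that a 4-star miRNA is one that appears in the top list of every subtype and that a 1-star miRNA appears in the top list of exactly one subtype. The function name `star_classes`, the miRNA identifiers and all subtype labels other than HER2-Enriched are hypothetical placeholders, not data from the study.

```python
from collections import Counter


def star_classes(top_by_subtype):
    """Split miRNAs into those shared by all subtypes and those unique to one.

    top_by_subtype: dict mapping subtype name -> list of selected miRNA ids
    Returns (four_star, one_star), where one_star maps subtype -> set of ids.
    """
    # Count in how many subtype lists each miRNA occurs (once per subtype).
    counts = Counter(
        mirna for mirnas in top_by_subtype.values() for mirna in set(mirnas)
    )
    n_subtypes = len(top_by_subtype)
    four_star = {m for m, c in counts.items() if c == n_subtypes}
    one_star = {
        subtype: {m for m in set(mirnas) if counts[m] == 1}
        for subtype, mirnas in top_by_subtype.items()
    }
    return four_star, one_star


# Hypothetical toy input; real lists would hold 100 miRNAs per subtype.
toy_top = {
    "HER2-Enriched": ["miR-a", "miR-b", "miR-c"],
    "subtype_2": ["miR-a", "miR-b", "miR-d"],
    "subtype_3": ["miR-a", "miR-e"],
    "subtype_4": ["miR-a", "miR-b", "miR-f"],
}
four_star, one_star = star_classes(toy_top)
```

In this toy input, `miR-b` occurs in three of the four lists and therefore falls into neither class, which is why the shared and subtype-specific sets need not add up to the full top lists.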
The resulting miRNAs are validated using machine learning methods that differentiate tumor cases from controls (for 4-star miRNAs) and subtypes from one another (for 1-star miRNAs).
The 4-star miRNAs yield 95% average accuracy, while the 1-star miRNAs achieve 81% accuracy for the HER2-Enriched subtype.
Differences in miRNA expression between cancer stages are also analyzed, and a subset of 8 miRNAs is found whose expression is increased in stage II relative to stage I, including hsa-miR-10b-5p, which contributes to breast cancer metastasis.
Subsequently, we construct regulatory networks to identify the interactions among the miRNAs, their target genes, and the transcription factors (TFs) that target those miRNAs. In this way, key regulatory circuits are identified in which genes such as TP53, ESR1, BRCA1, MYC and others, which are known to be important genetic factors in breast cancer, produce transcription factors that target the same genes and also interact with the selected miRNAs.
To provide further biological validation, Protein-Protein Interaction (PPI) networks are constructed, and KEGG pathway and GO enrichment analyses are performed. Many of the enriched pathways are breast cancer-related, such as the PI3K-Akt and p53 signaling pathways, and contain proteins such as TP53 that are also present in the regulatory networks. Moreover, we find that the genes are enriched in GO terms associated with breast cancer. Our results provide a detailed analysis of the selected miRNAs and their regulatory networks.
Keywords: breast cancer, Kaplan–Meier estimator, miRNA-seq, Nelson–Aalen estimator, protein-protein interaction, regulatory circuit, survival analysis
Received: 20 Dec 2018;
Accepted: 30 Sep 2019.
Copyright: © 2019 Denkiewicz, Saha, Rakshit, Prasad Sarkar and Plewczynski. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
Dr. Indrajit Saha, Department of Computer Science and Engineering, National Institute of Technical Teachers' Training and Research, Kolkata, 700106, West Bengal, India, email@example.com
Prof. Dariusz Plewczynski, University of Warsaw, Warsaw, 00-927, Masovian, Poland, firstname.lastname@example.org