
Original Research Article. Provisionally accepted; the full text will be published soon.

Front. Genet. | doi: 10.3389/fgene.2019.00695

Identification of potential crucial genes and key pathways in breast cancer using bioinformatic analysis

 Jun L. Deng1, Yun H. Xu1, Guo Wang1* and Yuan S. Zhu2*
  • 1Xiangya Hospital, Central South University, China
  • 2Weill Cornell Medicine, Cornell University, United States

Background: The molecular mechanisms of tumorigenesis in breast cancer are not yet fully understood. There is an urgent need to identify genes associated with breast cancer development and prognosis and to elucidate the underlying molecular mechanisms. In the present study, we aimed to identify potential pathogenic and prognostic differentially expressed genes (DEGs) in breast adenocarcinoma through bioinformatic analysis of public datasets.
Methods: Four datasets (GSE21422, GSE29431, GSE42568 and GSE61304) from the Gene Expression Omnibus (GEO) and the TCGA dataset were used for the bioinformatic analysis. DEGs were identified using the limma package in R. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were conducted with FunRich. The protein-protein interaction (PPI) network of the DEGs was built with the STRING (Search Tool for the Retrieval of Interacting Genes) database, visualized in Cytoscape and further analyzed with Molecular Complex Detection (MCODE). UALCAN and the Kaplan-Meier plotter (KM plotter) were used to analyze the expression levels and prognostic value of the hub genes. The expression levels of the hub genes were also validated in clinical samples from breast cancer patients. In addition, a gene-drug interaction network was constructed using the Comparative Toxicogenomics Database (CTD).
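As a hedged illustration of the survival analysis mentioned in the Methods, the sketch below computes a Kaplan-Meier (product-limit) survival estimate in plain Python. It is not the code behind the KM plotter web tool; the function name `kaplan_meier` and the toy follow-up data are assumptions made for illustration and are not taken from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up durations (any consistent unit)
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    at_risk = len(data)   # subjects still under observation
    survival = 1.0        # running product-limit estimate
    curve = []

    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0       # subjects leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy data (NOT from the study): five patients, two censored
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
curve = kaplan_meier(toy_times, toy_events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.2f}")
```

In a prognostic comparison, one such curve would be estimated for each expression group (for example, high versus low expression of a gene) and the groups compared with a log-rank test, which this sketch does not implement.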
Results: In total, 203 up-regulated and 118 down-regulated DEGs were identified. The mitotic cell cycle and the epithelial-to-mesenchymal transition pathway were the major enriched pathways for the up-regulated and down-regulated genes, respectively. The PPI network was constructed with 314 nodes and 1810 interactions, and two significant modules were selected. The most significantly enriched pathway in module 1 was the mitotic cell cycle. Moreover, six hub genes (CDK1, CCNA2, TOP2A, CCNB1, KIF11 and MELK) were selected for further analysis owing to their high degree of connectivity and were validated in clinical samples; all six were correlated with worse overall survival (OS) in breast cancer.
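Selecting hub genes by degree of connectivity can be sketched as follows. This is a minimal illustration, assuming the PPI network is available as a list of undirected edges; the helper names `node_degrees` and `top_hubs` and the placeholder gene labels are hypothetical and do not reproduce the STRING, Cytoscape or MCODE workflow used in the study.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count distinct interaction partners per node in an undirected network."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


def top_hubs(edges, k):
    """Return the k nodes with the highest degree as (gene, degree) pairs."""
    degrees = node_degrees(edges)
    # Sort by degree (descending), then by name so ties are deterministic
    ranked = sorted(degrees.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical toy network (NOT the network from the study)
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
    ("GENE_B", "GENE_A"),  # duplicate edge in reverse order, counted once
]
hubs = top_hubs(toy_edges, 2)
print(hubs)
```

Storing neighbours as sets means duplicate or reversed edges do not inflate a node's degree, which matters when an edge list is exported from an interaction database.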
Conclusion: These results indicate that the mitotic cell cycle and the epithelial-to-mesenchymal transition pathway could be potential pathways accounting for the progression of breast cancer, and that CDK1, CCNA2, TOP2A, CCNB1, KIF11 and MELK may be potential crucial genes. Furthermore, these genes could serve as new prognostic biomarkers and potential new targets for drug development in breast cancer.

Keywords: breast cancer, GEO, TCGA, differentially expressed genes, bioinformatics, survival, biomarker

Received: 11 Apr 2019; Accepted: 02 Jul 2019.

Edited by:

Marco Pellegrini, Institute of Computer Science and Telematics (IIT), Italy

Reviewed by:

Hao Zhang, Jilin University, China
Paola Ferrari, Pisana University Hospital, Italy  

Copyright: © 2019 Deng, Xu, Wang and Zhu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Prof. Guo Wang, Xiangya Hospital, Central South University, Changsha, China, 207082@csu.edu.cn
Prof. Yuan S. Zhu, Weill Cornell Medicine, Cornell University, White Plains, 10065, New York, United States, yuz2002@med.cornell.edu