
Original Research Article. Provisionally accepted; the full text will be published soon.

Front. Genet. | doi: 10.3389/fgene.2019.00769

Inferring Latent Disease-lncRNA Associations by Faster Matrix Completion on a Heterogeneous Network

Wen Li1,2, Shulin Wang2*, JunLin Xu2, Guo Mao2, Geng Tian3 and Jialiang Yang3*
  • 1Other, China
  • 2Hunan University, China
  • 3Geneis (Beijing) Co. Ltd, China

Long non-coding RNAs (lncRNAs) have been shown to play a crucial role in a variety of fundamental biological processes related to complex human diseases. Predicting latent disease-lncRNA associations helps to explain the pathogenesis of complex human diseases at the lncRNA level, and contributes to the detection of disease biomarkers and to the diagnosis, treatment, prognosis, and prevention of disease. Nevertheless, accurately identifying latent disease-lncRNA associations remains a challenging and urgent task. Discovering latent links through biological experiments is time-consuming and costly, which necessitates the development of computational prediction models. In this study, the prediction problem is recast as matrix completion in a recommender-system framework, in which the unknown entries of a rating matrix are filled in. A novel method named faster randomized matrix completion for latent disease-lncRNA association prediction (FRMCLDA) is proposed, based on an improved randomized partial SVD on a heterogeneous bilayer network. First, related data sources and experimentally validated information on diseases and lncRNAs are integrated to construct a heterogeneous bilayer network. Next, the network is formalized as a comprehensive adjacency matrix comprising the lncRNA similarity matrix, the disease similarity matrix, and the disease-lncRNA association matrix, in which the uncertain disease-lncRNA associations are treated as blank entries. Then, a matrix approximating the original adjacency matrix is constructed, whose predicted scores fill in the blank entries. Constructing this approximate matrix can be posed as a nuclear norm minimization problem. Finally, a faster singular value thresholding (SVT) algorithm, using a randomized partial SVD combined with a new subspace reuse technique, is applied to complete the adjacency matrix.
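The pipeline described above (assemble a block adjacency matrix, then complete it by singular value thresholding) can be illustrated with a minimal sketch. This is not the authors' implementation: a plain full SVD from NumPy stands in for the randomized partial SVD with subspace reuse, the block layout of the adjacency matrix is an assumption, and the function names, parameter values (`tau`, `delta`), and toy similarity values are all hypothetical.

```python
import numpy as np

def build_adjacency(lnc_sim, dis_sim, assoc):
    """Assemble [[lnc_sim, assoc.T], [assoc, dis_sim]].

    assoc has shape (n_diseases, n_lncRNAs). This block layout is an
    assumption made for illustration only.
    """
    return np.block([[lnc_sim, assoc.T], [assoc, dis_sim]])

def svt_complete(M, mask, tau, delta=1.0, n_iter=200, tol=1e-4):
    """Basic singular value thresholding for matrix completion.

    M    : matrix holding the observed values (blank entries may be 0)
    mask : 1 where an entry is observed, 0 where it is blank
    Uses a full SVD per iteration, not a randomized partial SVD.
    """
    M = np.asarray(M, dtype=float)
    mask = np.asarray(mask, dtype=float)
    Y = np.zeros_like(M)
    X = np.zeros_like(M)
    norm_obs = np.linalg.norm(mask * M)
    for _ in range(n_iter):
        # Soft-threshold the singular values of the auxiliary matrix Y.
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        X = (U * np.maximum(s - tau, 0.0)) @ Vt
        # Measure the mismatch on the observed entries only.
        residual = mask * (M - X)
        if np.linalg.norm(residual) <= tol * norm_obs:
            break
        Y += delta * residual
    return X

# Toy data (made-up values): 3 lncRNAs, 2 diseases.
lnc_sim = np.array([[1.0, 0.6, 0.2], [0.6, 1.0, 0.3], [0.2, 0.3, 1.0]])
dis_sim = np.array([[1.0, 0.5], [0.5, 1.0]])
assoc = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])

M = build_adjacency(lnc_sim, dis_sim, assoc)
# Similarity blocks are fully observed; in the association blocks only
# the known associations (ones) count as observed.
mask = build_adjacency(np.ones_like(lnc_sim), np.ones_like(dis_sim), assoc)
completed = svt_complete(M, mask, tau=0.5)
predicted_scores = completed[3:, :3]  # disease-by-lncRNA block
```

In this sketch the predicted scores for the blank disease-lncRNA pairs are read from the lower-left block of the completed matrix. The full SVD in every iteration is the cost that a randomized partial SVD is meant to reduce.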
Leave-one-out cross-validation (LOOCV) and 5-fold cross-validation (5-fold CV) experiments on three different benchmark databases demonstrate the effectiveness and adaptability of FRMCLDA in inferring latent disease-lncRNA associations, and in inferring lncRNAs associated with novel diseases that have no prior interaction information. In addition, case studies show that FRMCLDA can effectively predict latent lncRNAs associated with three widespread malignancies: prostate cancer, colon cancer, and gastric cancer.
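As a rough illustration of how a leave-one-out evaluation of this kind is commonly set up, the sketch below hides each known association in turn, re-scores the matrix, and records the rank of the hidden pair among all candidate pairs. The ranking protocol, the function name `loocv_ranks`, and the `predict` callable are assumptions for illustration; the abstract does not specify the exact procedure.

```python
import numpy as np

def loocv_ranks(assoc, predict):
    """Rank each held-out known association among the candidate pairs.

    assoc   : binary disease-by-lncRNA matrix (1 = known association)
    predict : hypothetical callable mapping a training matrix to a
              score matrix of the same shape
    Returns one rank per known association; rank 1 is best.
    """
    assoc = np.asarray(assoc)
    ranks = []
    for i, j in np.argwhere(assoc == 1):
        train = assoc.copy()
        train[i, j] = 0                # hide one known association
        scores = np.asarray(predict(train), dtype=float)
        candidates = train == 0        # unknown pairs plus the hidden pair
        better = np.sum(scores[candidates] > scores[i, j])
        ranks.append(1 + int(better))
    return np.array(ranks)
```

Any completion routine can be passed in as `predict`; ranks close to 1 across the held-out pairs indicate that known associations are being recovered, and an ROC curve can be derived from these ranks.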

Keywords: Heterogeneous bilayer network, Association prediction, Matrix completion, Faster SVT, Randomized partial SVD, Similarity measurements

Received: 21 May 2019; Accepted: 19 Jul 2019.

Edited by:

Peilin Jia, University of Texas Health Science Center, United States

Reviewed by:

Xin Zhou, Stanford University, United States
Lu Zhang, Hong Kong Baptist University, Hong Kong  

Copyright: © 2019 Li, Wang, Xu, Mao, Tian and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Prof. Shulin Wang, Hunan University, Changsha, 410082, Hunan Province, China
Dr. Jialiang Yang, Geneis (Beijing) Co. Ltd, Beijing, 100006, China