Original Research ARTICLE
McImpute: Matrix completion based imputation for single cell RNA-seq data
- ¹Indraprastha Institute of Information Technology Delhi, India
Motivation: Single cell RNA sequencing has proved revolutionary for its ability to zoom into complex biological systems. Genome-wide expression analysis at single cell resolution provides a window into the dynamics of cellular phenotypes. This facilitates characterization of transcriptional heterogeneity in normal and diseased tissues under various conditions. It also sheds light on the development or emergence of specific cell populations and phenotypes. However, owing to the paucity of input RNA, typical single cell RNA sequencing data feature a high number of dropout events, in which transcripts fail to get amplified.
Results: We introduce mcImpute, a low-rank matrix completion based technique to impute dropouts in single cell expression data. On a number of real datasets, mcImpute yields significant improvements in separation of true zeros from dropouts, cell clustering, differential expression analysis, cell type separability, performance of dimensionality reduction techniques for cell visualization, and gene distribution.
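As a rough illustration of what low-rank matrix completion means here, the sketch below fills unobserved entries of a matrix by repeatedly soft-thresholding its singular values (a generic soft-impute style iteration). It is not the mcImpute implementation. The function name `soft_impute`, the parameters `lam`, `n_iter` and `tol`, and the synthetic rank-2 data are all assumptions made for this example, as is the explicit observation mask (in an expression matrix, zero entries would typically be the candidates for imputation).

```python
import numpy as np

def soft_impute(X, mask, lam=0.5, n_iter=200, tol=1e-6):
    """Hypothetical helper: low-rank estimate of X from its observed entries.

    X    : 2-D array; values where mask is False are ignored.
    mask : boolean array, True where an entry is treated as observed.
    lam  : soft-threshold applied to the singular values (assumed value).
    """
    Z = np.where(mask, X, 0.0)  # initial estimate: zero-fill unobserved entries
    for _ in range(n_iter):
        # Keep observed values, use the current estimate for the rest.
        filled = np.where(mask, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Shrinking singular values toward zero encourages a low-rank solution.
        s = np.maximum(s - lam, 0.0)
        Z_new = (U * s) @ Vt
        # Stop once the estimate changes very little (Frobenius norm).
        if np.linalg.norm(Z_new - Z) <= tol * max(np.linalg.norm(Z), 1.0):
            Z = Z_new
            break
        Z = Z_new
    return Z

# Synthetic, non-negative rank-2 "expression" matrix (cells x genes); made-up data.
rng = np.random.default_rng(0)
truth = np.abs(rng.normal(size=(30, 2))) @ np.abs(rng.normal(size=(2, 20)))
mask = rng.random(truth.shape) > 0.3       # roughly 30% of entries hidden
observed = np.where(mask, truth, 0.0)      # hidden entries appear as zeros

imputed = soft_impute(observed, mask)
err_zero = np.linalg.norm(truth[~mask])               # error if zeros are left as-is
err_imputed = np.linalg.norm((imputed - truth)[~mask])
print(f"error on hidden entries: zero-fill {err_zero:.2f}, imputed {err_imputed:.2f}")
```

The threshold `lam` trades off fidelity to the observed entries against the rank of the estimate; in practice it would need tuning for real data.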
Availability and Implementation: https://github.com/aanchalMongia/McImpute_scRNAseq
Keywords: Bioinformatics, RNA, Matrix completion, Imputation, ScRNA-seq
Received: 08 Aug 2018;
Accepted: 10 Jan 2019.
Edited by: Indrajit Saha, Department of Computer Science and Engineering, National Institute of Technical Teachers' Training and Research, India
Reviewed by: Yuriy L. Orlov, Institute of Cytology and Genetics, Russian Academy of Sciences, Russia
Sumit K. Bag, National Botanical Research Institute (CSIR), India
Kumardeep Chaudhary, Icahn School of Medicine at Mount Sinai, United States
Shaoli Das, National Institutes of Health (NIH), United States
Copyright: © 2019 Mongia, Sengupta and Majumdar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Dr. Debarka Sengupta, Indraprastha Institute of Information Technology Delhi, Delhi, India, firstname.lastname@example.org