McImpute: Matrix completion based imputation for single cell RNA-seq data

 Aanchal Mongia¹, Debarka Sengupta¹* and Angshul Majumdar¹
  • ¹Indraprastha Institute of Information Technology Delhi, India

Motivation: Single cell RNA sequencing has proved revolutionary for its potential to zoom into complex biological systems. Genome-wide expression analysis at single cell resolution provides a window into the dynamics of cellular phenotypes. This facilitates characterization of transcriptional heterogeneity in normal and diseased tissues under various conditions. It also sheds light on the development or emergence of specific cell populations and phenotypes. However, owing to the paucity of input RNA, typical single cell RNA sequencing data feature a high number of dropout events, in which transcripts fail to get amplified.

Results: We introduce mcImpute, a low-rank matrix completion based technique to impute dropouts in single cell expression data. On a number of real datasets, application of mcImpute yields significant improvements in separation of true zeros from dropouts, cell clustering, differential expression analysis, cell type separability, performance of dimensionality reduction techniques for cell visualization, and gene distribution.

Keywords: Bioinformatics, RNA, Matrix completion, Imputation, ScRNA-seq

