AUTHOR=Zou Bin, Zhang Tongda, Zhou Ruilong, Jiang Xiaosen, Yang Huanming, Jin Xin, Bai Yong
TITLE=deepMNN: Deep Learning-Based Single-Cell RNA Sequencing Data Batch Correction Using Mutual Nearest Neighbors
JOURNAL=Frontiers in Genetics
VOLUME=Volume 12 - 2021
YEAR=2021
URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2021.708981
DOI=10.3389/fgene.2021.708981
ISSN=1664-8021
ABSTRACT=It is well recognized that batch effect in single-cell RNA sequencing (scRNA-seq) data remains a major challenge when integrating different datasets. Here, we proposed deepMNN, a novel deep learning-based method to correct batch effect in scRNA-seq data. We first searched for mutual nearest neighbor (MNN) pairs across different batches in a principal component analysis (PCA) subspace. Subsequently, a residual network was constructed by stacking two residual blocks and applied for batch correction. We devised a batch loss that minimized the distance between MNN pairs in the PCA subspace, which allowed multiple scRNA-seq datasets with batch effects to be integrated in one step. Experimental results showed that deepMNN successfully aligned datasets under four scenarios: identical cell types, non-identical cell types, multiple batches, and large datasets. We compared the batch correction performance of deepMNN with that of Scanorama, a well-known batch correction method. Results demonstrated that deepMNN outperformed Scanorama on the quantitative metrics of batch and cell entropies and average silhouette width (ASW) F1 score. Furthermore, deepMNN ran much faster on large datasets, which could make it a new choice for large-scale single-cell gene expression data analysis.
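
The abstract describes two steps that a small example can make concrete: finding MNN pairs between batches in a low-dimensional subspace, and scoring them with a loss based on the distance between paired cells. The following is a minimal sketch of that idea, not the authors' implementation. It assumes Euclidean distance, a neighborhood size `k`, and a loss defined as the mean pair distance; none of these are specified in the abstract. The function names (`mnn_pairs`, `batch_loss`) and the toy 2-D coordinates standing in for PCA embeddings are hypothetical. The residual network that deepMNN trains to reduce this loss is not shown.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_indices(queries, references, k):
    """For each query point, return the set of indices of its k nearest references."""
    neighbors = []
    for q in queries:
        order = sorted(range(len(references)),
                       key=lambda j: euclidean(q, references[j]))
        neighbors.append(set(order[:k]))
    return neighbors


def mnn_pairs(batch_a, batch_b, k=1):
    """Return (i, j) pairs where a[i] and b[j] are in each other's k nearest neighbors."""
    a_to_b = knn_indices(batch_a, batch_b, k)
    b_to_a = knn_indices(batch_b, batch_a, k)
    return [(i, j)
            for i in range(len(batch_a))
            for j in sorted(a_to_b[i])
            if i in b_to_a[j]]


def batch_loss(batch_a, batch_b, pairs):
    """Mean distance between paired cells (assumed form of the loss, for illustration)."""
    if not pairs:
        return 0.0
    return sum(euclidean(batch_a[i], batch_b[j]) for i, j in pairs) / len(pairs)


# Toy embeddings: batch_b is batch_a's first two cells shifted by (0, 1),
# mimicking a simple additive batch effect. The third cell in batch_a has
# no counterpart in batch_b (a non-identical cell type).
batch_a = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
batch_b = [(0.0, 1.0), (5.0, 6.0)]

pairs = mnn_pairs(batch_a, batch_b, k=1)
loss_before = batch_loss(batch_a, batch_b, pairs)

# Removing the shift by hand drives the loss on the same pairs to zero.
corrected_b = [(x, y - 1.0) for x, y in batch_b]
loss_after = batch_loss(batch_a, corrected_b, pairs)

print(pairs, loss_before, loss_after)
```

In this toy case the unmatched third cell of `batch_a` forms no MNN pair, so it does not contribute to the loss. That illustrates why a mutual-neighbor criterion can tolerate cell types present in only one batch. The brute-force search here is quadratic in the number of cells; a real pipeline would likely use an approximate or tree-based nearest-neighbor index instead.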