AUTHOR=Yuan Musu, Chen Liang, Deng Minghua TITLE=Clustering CITE-seq data with a canonical correlation-based deep learning method JOURNAL=Frontiers in Genetics VOLUME=Volume 13 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2022.977968 DOI=10.3389/fgene.2022.977968 ISSN=1664-8021 ABSTRACT=Single-cell multi-omics sequencing techniques have developed rapidly in the past few years. Among these techniques, single-cell Cellular Indexing of Transcriptomes and Epitopes (CITE-seq) allows simultaneous quantification of gene expression and surface proteins. Clustering CITE-seq data has great potential to provide a more comprehensive and in-depth view of cell states and interactions. However, CITE-seq data inherits the properties of scRNA-seq data, being noisy, high-dimensional, and highly sparse. Moreover, the representations of RNA and surface protein are sometimes weakly correlated and contribute differently to the clustering objective. To overcome these obstacles and find a combined representation well suited for clustering, we proposed scCTClust for clustering analysis of multi-omics data, especially CITE-seq data. Two omics-specific neural networks are introduced to extract cluster information from the omics data. A deep canonical correlation method is adopted to find the maximally correlated representations of the two omics. A novel decentralized clustering method is applied to the linear combination of the latent representations of the two omics. The fusion weights, which account for the contribution of each omics to clustering, are adaptively updated during training. Extensive experiments on both simulated and real CITE-seq datasets demonstrated the power of scCTClust. We also applied scCTClust to transcriptome-epigenome data to illustrate its potential to generalize.
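The abstract names two ingredients that a small numerical example may help make concrete: canonical correlation between two latent representations, and a weighted linear combination of those representations. The sketch below is not the scCTClust implementation. It uses plain NumPy on fixed matrices rather than trained omics-specific networks, and the names `canonical_correlations`, `fuse`, `H_rna`, `H_protein`, the ridge term `reg`, and the softmax parameterization of the fusion weights are all assumptions made for illustration.

```python
import numpy as np


def inv_sqrt(S):
    # Inverse square root of a symmetric positive-definite matrix.
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T


def canonical_correlations(H1, H2, reg=1e-4):
    # H1: (n_cells, d1) and H2: (n_cells, d2) latent representations.
    # Returns the canonical correlations in descending order.
    n = H1.shape[0]
    H1c = H1 - H1.mean(axis=0)
    H2c = H2 - H2.mean(axis=0)
    # Regularized covariance estimates (reg is an assumed ridge term).
    S11 = H1c.T @ H1c / (n - 1) + reg * np.eye(H1.shape[1])
    S22 = H2c.T @ H2c / (n - 1) + reg * np.eye(H2.shape[1])
    S12 = H1c.T @ H2c / (n - 1)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    T = inv_sqrt(S11) @ S12 @ inv_sqrt(S22)
    return np.linalg.svd(T, compute_uv=False)


def fuse(H1, H2, logits):
    # Linear combination of two latent representations of equal shape.
    # The weights are a softmax over two logits (an assumption here),
    # so they stay positive and sum to one.
    w = np.exp(logits - np.max(logits))
    w = w / w.sum()
    return w[0] * H1 + w[1] * H2, w


# Toy data: two noisy views of the same 2-dimensional signal.
rng = np.random.default_rng(0)
shared = rng.normal(size=(500, 2))
H_rna = shared + 0.1 * rng.normal(size=(500, 2))
mix = np.array([[1.0, 0.5], [-0.5, 1.0]])
H_protein = shared @ mix + 0.1 * rng.normal(size=(500, 2))

corrs = canonical_correlations(H_rna, H_protein)
fused, weights = fuse(H_rna, H_protein, np.array([0.0, 0.0]))
print("canonical correlations:", corrs)
print("fusion weights:", weights)
```

In a deep variant, the sum of the canonical correlations would typically serve as a training objective for the two networks, and the two logits would be learned jointly with the clustering loss so that the more informative omics receives the larger weight.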