AUTHOR=Liu Qiaoming, Liang Yingjian, Wang Dong, Li Jie TITLE=LFSC: A linear fast semi-supervised clustering algorithm that integrates reference-bulk and single-cell transcriptomes JOURNAL=Frontiers in Genetics VOLUME=Volume 13 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2022.1068075 DOI=10.3389/fgene.2022.1068075 ISSN=1664-8021 ABSTRACT=Identifying cell types in complex tissues is an important step toward understanding the cellular heterogeneity of disease. We present LFSC, a linear fast semi-supervised clustering algorithm that uses reference samples generated from bulk RNA-seq data to identify cell types from single-cell transcriptomes. An anchor graph is constructed to capture the relationship between reference samples and cells. By imposing a connectivity constraint on the learned graph, LFSC preserves the underlying cluster structure. Moreover, the overall complexity of LFSC is linear in the size of the data, which greatly improves its effectiveness and efficiency. Applying LFSC to real scRNA-seq datasets, we found that it outperforms existing baseline methods in clustering accuracy and robustness. Applications to infiltrating T cells in liver cancer demonstrate that LFSC can successfully find new cell types, discover differentially expressed genes, and explore new cancer-associated biomarkers.
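
The abstract mentions an anchor graph linking reference samples to cells and a cost that grows linearly with the number of cells. The sketch below illustrates that general idea only; it is not the LFSC implementation. It assumes cells and reference samples are rows of NumPy arrays in the same gene space, connects each cell to its k nearest reference samples with Gaussian-kernel weights, and omits the connectivity constraint the abstract describes. The names `build_anchor_graph` and `assign_labels`, the kernel, and the bandwidth choice are all hypothetical.

```python
import numpy as np


def build_anchor_graph(cells, anchors, k=2):
    """Return an (n_cells, n_anchors) row-stochastic matrix Z.

    Each cell is linked to its k nearest anchors (reference samples).
    Assumption: Gaussian-kernel weights; the paper's graph is learned differently.
    """
    n, m = cells.shape[0], anchors.shape[0]
    k = min(k, m)
    # Squared Euclidean distances, cost O(n * m * d): linear in the number of cells.
    d2 = ((cells[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]      # indices of the k closest anchors
    rows = np.arange(n)[:, None]
    dk = d2[rows, nearest]                       # distances to those anchors
    # Per-cell bandwidth; the small constant avoids division by zero.
    sigma = dk.mean(axis=1, keepdims=True) + 1e-12
    w = np.exp(-dk / sigma)
    w /= w.sum(axis=1, keepdims=True)            # normalise each row to sum to 1
    Z = np.zeros((n, m))
    Z[rows, nearest] = w
    return Z


def assign_labels(Z, anchor_labels):
    """Give each cell the label whose anchors carry the most total weight."""
    anchor_labels = np.asarray(anchor_labels)
    labels = np.unique(anchor_labels)
    scores = np.stack(
        [Z[:, anchor_labels == lab].sum(axis=1) for lab in labels], axis=1
    )
    return labels[scores.argmax(axis=1)]
```

Because the number of reference samples is fixed and small relative to the number of cells, building `Z` scales linearly with the number of cells, which is the property the abstract highlights.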