AUTHOR=Zhu Yuan , Zhang De-Xin , Zhang Xiao-Fei , Yi Ming , Ou-Yang Le , Wu Mengyun TITLE=EC-PGMGR: Ensemble Clustering Based on Probability Graphical Model With Graph Regularization for Single-Cell RNA-seq Data JOURNAL=Frontiers in Genetics VOLUME=Volume 11 - 2020 YEAR=2020 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2020.572242 DOI=10.3389/fgene.2020.572242 ISSN=1664-8021 ABSTRACT=Advances in technology have made it convenient to obtain large amounts of single-cell RNA sequencing (scRNA-seq) data. Recently, computational methods have been proposed to identify or define cellular phenotypes from these data sets, which reflect the heterogeneity of cells. Because the key technical problem is data clustering, many approaches, including individual and integrative clustering methods, have been proposed for further analysis. However, the results of many individual clustering methods depend on initial parameter settings such as the cluster number and the distance metric. Integrative or ensemble clustering methods, which combine two or more individual clustering methods, aim to obtain more accurate results, but their results are in turn influenced by the results of the basis clusterings (here, each clustering algorithm within an ensemble clustering algorithm is considered a basis clustering method). It is therefore a challenge to design a method that can robustly and effectively integrate different kinds of methods for cell clustering with scRNA-seq data. We propose EC-PGMGR, an Ensemble Clustering algorithm based on a Probability Graphical Model with Graph Regularization. On the one hand, we use parameter control in the Probability Graphical Model (PGM) to determine the cluster number automatically, without prior knowledge. On the other hand, we add a regularization term to reduce the limitations imposed by the basis clustering results.
Experiments are carried out on six data sets, with the number of single cells ranging from 822 to 3605. The results show that EC-PGMGR performs better than four alternative individual clustering methods and two ensemble methods in terms of Adjusted Rand Index (ARI), robustness, and effectiveness. The results are also biologically interpretable, which could be useful for the wider application of clustering.
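
The abstract reports performance in terms of the Adjusted Rand Index (ARI). For readers unfamiliar with the metric, the following is a minimal sketch of the standard pair-counting ARI, which compares a predicted clustering against reference labels. It is not taken from the paper; the function name `adjusted_rand_index` and the toy label lists are illustrative assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting Adjusted Rand Index between two labelings.

    Illustrative helper (hypothetical name), not the authors' code.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare; treat as perfect agreement

    # Contingency counts and the marginal cluster sizes.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of sample pairs that fall in the same cell / row / column.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # chance-level agreement
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_cells - expected) / (max_index - expected)


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    reference = [0, 0, 1, 2]
    predicted = [0, 0, 1, 1]
    print(round(adjusted_rand_index(reference, predicted), 4))
```

An ARI of 1 indicates identical partitions up to relabeling, values near 0 indicate chance-level agreement, and negative values indicate agreement worse than chance.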