AUTHOR=Liu Yinbo, Shen Yingying, Wang Hong, Zhang Yong, Zhu Xiaolei
TITLE=m5Cpred-XS: A New Method for Predicting RNA m5C Sites Based on XGBoost and SHAP
JOURNAL=Frontiers in Genetics
VOLUME=Volume 13 - 2022
YEAR=2022
URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2022.853258
DOI=10.3389/fgene.2022.853258
ISSN=1664-8021
ABSTRACT=As one of the most important post-transcriptional modifications of RNA, 5-cytosine-methylation (m5C) has been reported to be closely related to many chemical reactions and biological functions in cells. Recently, several computational methods have been proposed for identifying m5C sites. However, their accuracy and efficiency are still not satisfactory. In this study, we propose a new method, m5Cpred-XS, for predicting m5C sites in H. sapiens, M. musculus and A. thaliana. First, the SHAP method was used to select the optimal feature subset from seven different kinds of sequence-based features. Second, different machine learning algorithms were used to train the models. The results of 5-fold cross-validation indicated that the model based on XGBoost achieved the highest prediction accuracy. Finally, a comparison with other state-of-the-art models indicated that m5Cpred-XS outperformed them. Moreover, we deployed the model on a web server that can be accessed at http://m5cpred-xs.zhulab.org.cn/, and m5Cpred-XS is expected to be a useful tool for studying m5C sites.
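
The SHAP-based feature selection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how SHAP values were turned into a feature subset, so this assumes the common approach of ranking features by mean absolute SHAP value and keeping the top k. The function names, feature names, and toy numbers below are hypothetical. In practice the per-sample SHAP values would come from an explainer run on a trained model (for example, a tree explainer applied to an XGBoost classifier) rather than being typed in by hand.

```python
from statistics import mean


def rank_features_by_mean_abs_shap(shap_values, feature_names):
    """Return (name, score) pairs sorted by mean absolute SHAP value, descending.

    shap_values: list of rows, one per sample, each with one value per feature.
    """
    scores = [
        mean(abs(row[j]) for row in shap_values)
        for j in range(len(feature_names))
    ]
    return sorted(zip(feature_names, scores), key=lambda p: p[1], reverse=True)


def select_top_k(shap_values, feature_names, k):
    """Keep the names of the k features with the largest mean |SHAP| score."""
    ranked = rank_features_by_mean_abs_shap(shap_values, feature_names)
    return [name for name, _ in ranked[:k]]


# Hypothetical toy SHAP matrix: 3 samples x 3 features (not data from the paper).
shap_values = [
    [0.30, -0.02, 0.10],
    [-0.25, 0.01, 0.15],
    [0.35, -0.03, -0.05],
]
feature_names = ["feat_a", "feat_b", "feat_c"]

selected = select_top_k(shap_values, feature_names, k=2)
print(selected)
```

The absolute value matters here: a feature that pushes predictions strongly in either direction is informative, so signed averages would understate its importance. The choice of k is a tuning decision; one option is to compare cross-validated accuracy across several subset sizes.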