AUTHOR=Zhang Guishan, Dai Zhiming, Dai Xianhua
TITLE=A Novel Hybrid CNN-SVR for CRISPR/Cas9 Guide RNA Activity Prediction
JOURNAL=Frontiers in Genetics
VOLUME=Volume 10 - 2019
YEAR=2020
URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2019.01303
DOI=10.3389/fgene.2019.01303
ISSN=1664-8021
ABSTRACT=Accurate prediction of guide RNA (gRNA) on-target efficacy is critical for effective application of the CRISPR/Cas9 system. Although several machine learning-based and convolutional neural network (CNN)-based methods have been proposed, prediction accuracy remains to be improved. Here, we propose a novel hybrid system that combines CNNs with support vector regression (SVR) to predict gRNA on-target efficacy. This CNN-SVR system is composed of two major components: a merged CNN as the front end for extracting gRNA features and an SVR as the back end for regression and prediction of gRNA cleavage efficiency. Specifically, we trained the merged CNN model from scratch on a benchmark dataset for model selection and pre-training. Subsequently, we used a two-step feature optimization strategy based on the average area under the ROC curve to extract the most important features. Using the learned representative features, we trained the SVR model for gRNA on-target activity prediction. In addition, we developed a transfer learning strategy in which our framework is trained on the benchmark dataset and then applied to small-sample, cell line-specific datasets. We demonstrate that CNN-SVR can effectively exploit feature interactions from feed-forward directions to learn deeper features of gRNAs and their corresponding epigenetic features. Numerical experiments on commonly used datasets show that our CNN-SVR system outperforms available state-of-the-art methods in terms of prediction accuracy, generalization, and robustness. Source code is available at https://github.com/Peppags/CNN-SVR.
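
The two-stage design described in the abstract (a convolutional front end that turns a gRNA sequence into a feature vector, followed by an SVR back end that regresses activity on those features) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available in their repository. Everything specific here is an assumption made for illustration: the 23-nt sequence length, the random untrained filters standing in for a trained merged CNN, the synthetic GC-content target standing in for measured cleavage efficiency, the SVR hyperparameters, and the helper names `one_hot` and `conv_features`. Epigenetic features, the two-step feature optimization, and transfer learning are omitted.

```python
import numpy as np
from sklearn.svm import SVR

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA sequence as a (len(seq), 4) one-hot matrix."""
    idx = [BASES.index(b) for b in seq.upper()]
    out = np.zeros((len(seq), 4), dtype=np.float32)
    out[np.arange(len(seq)), idx] = 1.0
    return out


def conv_features(x, kernels):
    """Valid 1D convolution, ReLU, then global max pooling.

    x: (L, 4) one-hot sequence; kernels: (n_filters, width, 4).
    Returns one pooled activation per filter, shape (n_filters,).
    """
    width = kernels.shape[1]
    # Sliding windows over the sequence: (L - width + 1, width, 4)
    windows = np.stack([x[i:i + width] for i in range(x.shape[0] - width + 1)])
    acts = np.einsum("pwc,fwc->pf", windows, kernels)
    return np.maximum(acts, 0.0).max(axis=0)


rng = np.random.default_rng(0)

# Hypothetical data: 200 random 23-nt sequences (length is an assumption).
seqs = ["".join(row) for row in rng.choice(list(BASES), size=(200, 23))]
# Synthetic target (GC fraction), NOT real gRNA activity measurements.
y = np.array([(s.count("G") + s.count("C")) / len(s) for s in seqs])

# Random filters stand in for the convolutional layers of a trained CNN.
kernels = rng.normal(size=(16, 5, 4))
features = np.stack([conv_features(one_hot(s), kernels) for s in seqs])

# Back end: SVR trained on the extracted features (hyperparameters are guesses).
svr = SVR(kernel="rbf", C=1.0, epsilon=0.05)
svr.fit(features[:150], y[:150])
preds = svr.predict(features[150:])
```

In a real pipeline, the filters would come from a CNN trained on the benchmark dataset, and the features passed to the SVR would be the activations of a chosen hidden layer rather than pooled outputs of random kernels.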