AUTHOR=Wang Rulan , Wang Zhuo , Li Zhongyan , Lee Tzong-Yi TITLE=Residue–Residue Contact Can Be a Potential Feature for the Prediction of Lysine Crotonylation Sites JOURNAL=Frontiers in Genetics VOLUME=Volume 12 - 2021 YEAR=2022 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2021.788467 DOI=10.3389/fgene.2021.788467 ISSN=1664-8021 ABSTRACT=Lysine crotonylation (Kcr) is involved in many activities in the human body. Various technologies have been developed for Kcr prediction. Existing methods typically adopt sequence-based features, which consider only the composition of linearly neighbouring amino acids. However, a modified Kcr site is surrounded not only by its linear neighbours but also by residues that are spatially close to the target site. In this paper, we use residue-residue contact as a new feature for Kcr prediction, so that the features encode both the linearly surrounding residues and those spatially near the target site. The spatially surrounding residues are used as a feature-encoding scheme for the first time, giving two features named residue-residue composition (RRC) and residue-residue pair composition (RRPC), which were used in supervised learning classification for Kcr prediction. In the results, RRC achieved its best performance at an accuracy of 0.77 and an area under the curve (AUC) of 0.78, and RRPC at an accuracy of 0.74 and an AUC of 0.80. To show that the spatial feature is as significant as the sequence-based features, feature selection was carried out on those sequence-based features together with RRPC. In addition, different radii for the surrounding amino acid composition were compared in terms of performance. 
On evaluation, the RRC and RRPC features performed competitively with the other features, and in some cases scored around 0.20 higher in accuracy or 0.3 higher in AUC than sequence-based features.
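
The abstract does not spell out how RRC and RRPC are computed, so the following is only a minimal Python sketch of one plausible reading: collect the residues lying within a chosen radius of the target lysine, then take the fraction of each amino acid (RRC) and of each unordered residue pair (RRPC) among them. The function names, the use of one coordinate per residue, the Euclidean cut-off, and the pair definition are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter
from itertools import combinations, combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, alphabetical order

def spatial_neighbours(residues, target_index, radius):
    """Return one-letter codes of residues within `radius` of the target.

    `residues` is a list of (one_letter_code, (x, y, z)) tuples; using a
    single coordinate per residue (e.g. C-alpha) is an assumption here.
    The target residue itself is excluded.
    """
    _, target_xyz = residues[target_index]
    return [
        aa
        for i, (aa, xyz) in enumerate(residues)
        if i != target_index and math.dist(xyz, target_xyz) <= radius
    ]

def rrc(neighbours):
    """Hypothetical RRC: 20-dim fraction of each amino acid among neighbours."""
    counts = Counter(neighbours)
    total = len(neighbours) or 1  # avoid division by zero for isolated sites
    return [counts[aa] / total for aa in AMINO_ACIDS]

def rrpc(neighbours):
    """Hypothetical RRPC: 210-dim fraction of each unordered residue pair."""
    pairs = Counter(tuple(sorted(p)) for p in combinations(neighbours, 2))
    total = sum(pairs.values()) or 1
    keys = combinations_with_replacement(AMINO_ACIDS, 2)
    return [pairs[k] / total for k in keys]

# Toy structure (made-up coordinates): a lysine at the origin and four others.
residues = [
    ("K", (0.0, 0.0, 0.0)),
    ("A", (3.0, 0.0, 0.0)),
    ("G", (0.0, 4.0, 0.0)),
    ("A", (0.0, 0.0, 5.0)),
    ("L", (20.0, 0.0, 0.0)),
]
neighbours = spatial_neighbours(residues, target_index=0, radius=6.0)
rrc_vector = rrc(neighbours)
rrpc_vector = rrpc(neighbours)
```

Varying `radius` in a sketch like this mirrors the comparison of different surrounding radii mentioned in the abstract; the resulting vectors could then be passed to any supervised classifier.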