Original Research ARTICLE
iDNA6mA-Rice: a computational tool for detecting N6-methyladenine sites in rice
- 1 University of Electronic Science and Technology of China, China
- 2 Chengdu University of Traditional Chinese Medicine, China
DNA N6-methyladenine (6mA) is a prevalent DNA modification that is involved in a variety of biological processes. Accurate genome-wide identification of 6mA sites is invaluable for better understanding its biological functions. Because experimental methods for detecting 6mA in eukaryotic genomes are labor-intensive and expensive, computational methods that identify 6mA genome-wide are urgently needed, especially for plants. With this in mind, the current study was devoted to constructing a machine learning-based method to predict 6mA sites in the rice genome. We first used mono-nucleotide binary encoding to represent the positive and negative samples. The Random Forest algorithm was then used to classify the encoded samples and thereby identify 6mA sites in the rice genome. Five-fold cross-validation showed that the proposed method achieved an area under the receiver operating characteristic curve (AUC) of 0.964 with an overall accuracy of 0.917. Furthermore, an independent dataset test was performed to evaluate the generalization ability of the method. It yielded an AUC of 0.981, suggesting that the proposed method predicts 6mA sites in rice well. To make the method convenient to use, we established a web-server called iDNA6mA-Rice, which is freely accessible at http://lin-group.cn/server/iDNA6mA-Rice.
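As an illustration of the mono-nucleotide binary encoding mentioned above, the following minimal Python sketch maps each nucleotide to a four-bit indicator vector and concatenates the vectors along the sequence. The column order (A, C, G, T), the function name `encode_sequence`, and the example sequence are assumptions made for illustration; they are not taken from the iDNA6mA-Rice implementation.

```python
# Assumed mapping: one indicator bit per nucleotide, in the order A, C, G, T.
# The order is an illustrative choice, not confirmed by the article.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def encode_sequence(seq):
    """Return a flat list of 4 * len(seq) binary features for a DNA sequence."""
    features = []
    for base in seq.upper():
        if base not in ONE_HOT:
            # Ambiguous bases such as N are rejected in this sketch.
            raise ValueError(f"unsupported nucleotide: {base!r}")
        features.extend(ONE_HOT[base])
    return features


# Hypothetical three-base example; real samples would be longer windows
# centered on the candidate adenine.
example = encode_sequence("GAT")
```

Feature vectors of this kind, one per positive or negative sample, could then be passed to a Random Forest classifier and evaluated by five-fold cross-validation as described above.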
Keywords: N6-methyladenine, mono-nucleotide binary encoding, random forest, web-server, cross-validation
Received: 13 Jun 2019;
Accepted: 26 Jul 2019.
Copyright: © 2019 Hao, Dao, Guan, Zhang, Tan, Zhang, Chen and Lin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: Prof. Hao Lin, University of Electronic Science and Technology of China, Chengdu, 610054, Sichuan Province, China, firstname.lastname@example.org