Original Research ARTICLE
Improved circRNA identification by combining prediction algorithms
- 1 MBG, Aarhus University, Denmark
Non-coding RNA is an interesting class of gene regulators with diverse functionalities. One large subgroup of non-coding RNAs is the recently discovered class of circular RNAs (circRNAs). CircRNAs are conserved and expressed in a tissue- and development-specific manner, although for the vast majority, the functional relevance remains unclear. To identify and quantify circRNA expression, several bioinformatic pipelines have been developed to assess the catalogue of circRNAs in any given total RNA sequencing dataset. We recently compared five different algorithms for circRNA detection; here, this analysis is extended to 11 algorithms. The sensitivity and specificity of each algorithm are evaluated by comparing the number of circRNAs discovered and their respective sensitivity to RNase R digestion. Moreover, the ability to predict de novo circRNAs, i.e. circRNAs not derived from annotated splice sites, is determined, as is the effect of eliminating low-quality and adaptor-containing reads prior to circRNA prediction. Finally, and most importantly, all possible pair-wise combinations of algorithms are tested and guidelines for algorithm complementarity are provided. In conclusion, the algorithms mostly agree on highly expressed circRNAs; however, in many cases, algorithm-specific false positives with high read counts are predicted, a problem that is resolved by using the shared output from two (or more) algorithms.
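The idea of keeping only the shared output of two algorithms can be illustrated with a minimal sketch. It is not the pipeline used in the study: the algorithm names, coordinates, and the `pairwise_shared` helper are hypothetical, and the sketch assumes that each circRNA can be identified by its backsplice junction (chromosome, start, end, strand) and that all tools report coordinates in the same convention, which in practice may require normalisation first.

```python
from itertools import combinations

# Hypothetical toy predictions: each circRNA is keyed by its backsplice
# junction (chromosome, start, end, strand). Names and coordinates are made up.
calls = {
    "algo_A": {("chr1", 100, 500, "+"), ("chr2", 300, 900, "-"), ("chr3", 10, 80, "+")},
    "algo_B": {("chr1", 100, 500, "+"), ("chr2", 300, 900, "-"), ("chr4", 5, 50, "-")},
    "algo_C": {("chr1", 100, 500, "+"), ("chr5", 700, 990, "+")},
}


def pairwise_shared(calls):
    """Return {(name1, name2): junctions predicted by both} for every pair."""
    return {
        (a, b): calls[a] & calls[b]
        for a, b in combinations(sorted(calls), 2)
    }


shared = pairwise_shared(calls)
for pair, junctions in shared.items():
    # Junctions reported by only one algorithm of the pair are dropped here.
    print(pair, len(junctions))
```

In this toy example, a junction reported by a single algorithm never appears in any pair-wise intersection, which is the sense in which combining two outputs filters algorithm-specific predictions.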
Keywords: non-coding RNA, circular RNA, gene prediction, bioinformatics, combining algorithms
Received: 17 Nov 2017;
Accepted: 12 Feb 2018.
Edited by: Argyris Papantonis, Universität zu Köln, Germany
Reviewed by: Robert Lyle, University of Oslo, Norway
Christoforos Nikolaou, University of Crete, Greece
Copyright: © 2018 Hansen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
* Correspondence: PhD. Thomas B. Hansen, Aarhus University, MBG, C.F. Moellers Alle 3, build 1130, Aarhus, 8000, Denmark, firstname.lastname@example.org