
Technology and Code Article. Provisionally accepted; the full text will be published soon.

Front. Genet. | doi: 10.3389/fgene.2019.00876

Graphical workflow system for modification calling by machine learning of reverse transcription signatures

Lukas Schmidt¹, Stephan Werner¹, Thomas Kemmer², Stefan Niebler², Lilia Ayadi³, Patrick Johe¹, Virginie Marchand³, Tanja Schirmeister¹, Yuri Motorin³, Andreas Hildebrandt²*, Bertil Schmidt²* and Mark Helm¹*
  • ¹Institute of Pharmacy and Biochemistry - Therapeutic Life Sciences, Faculty of Chemistry, Pharmacy and Geosciences, Johannes Gutenberg University Mainz, Germany
  • ²Institute of Computer Science, Faculty of Physics, Mathematics and Computer Science, Johannes Gutenberg University Mainz, Germany
  • ³Next-Generation Sequencing Core Facility UMS2008 IBSLor CNRS-UL-INSERM, Biopôle, University of Lorraine, Vandœuvre-lès-Nancy, France; UMR7365 Ingénierie Moléculaire et Physiopathologie Articulaire (IMOPA), France

Modification mapping from cDNA data has become an important approach in epitranscriptomics. So-called reverse transcription signatures in cDNA contain information on the position and nature of their causative RNA modifications. Data mining of, e.g., Illumina-based high-throughput sequencing data is therefore rapidly growing in importance, yet the field still lacks effective tools. Here we present a versatile, user-friendly graphical workflow system for modification calling based on machine learning. The workflow commences with a principal module for trimming, mapping and post-processing. The latter includes a quantification of mismatch and arrest rates with single-nucleotide resolution across the mapped transcriptome. Further downstream modules include tools for visualization, machine learning, and modification calling. The machine learning module provides quality assessment parameters to gauge the suitability of the initial dataset for effective machine learning and modification calling. This output is useful for improving the experimental parameters for library preparation and sequencing. In summary, automating the bioinformatics workflow allows a faster turnaround of the optimization cycles in modification calling.
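To make the per-position quantification mentioned in the abstract concrete, the following is a minimal sketch in Python, not the workflow's actual implementation. It assumes one common way of defining the two rates: the mismatch rate as the fraction of covering reads whose base differs from the reference, and the arrest rate as the fraction of reads reaching the 3' neighbour that terminate there instead of reading through. The names `PositionCounts`, `mismatch_rate`, `arrest_rate` and `rt_signature` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PositionCounts:
    """Hypothetical per-position read counts taken from a mapped library."""
    coverage: int    # reads covering this position
    mismatches: int  # covering reads whose base differs from the reference
    arrests: int     # reads ending directly 3' of this position (assumed definition)


def mismatch_rate(c: PositionCounts) -> float:
    """Fraction of covering reads that carry a non-reference base."""
    return c.mismatches / c.coverage if c.coverage else 0.0


def arrest_rate(c: PositionCounts) -> float:
    """Fraction of reads reaching the 3' neighbour that do not read through."""
    reached = c.coverage + c.arrests
    return c.arrests / reached if reached else 0.0


def rt_signature(counts: List[PositionCounts]) -> List[Tuple[float, float]]:
    """Return (mismatch rate, arrest rate) for every position of a transcript."""
    return [(mismatch_rate(c), arrest_rate(c)) for c in counts]


if __name__ == "__main__":
    # Illustrative counts only, not data from the article.
    example = [PositionCounts(100, 25, 0), PositionCounts(60, 6, 40)]
    for position, (mm, arr) in enumerate(rt_signature(example), start=1):
        print(f"position {position}: mismatch={mm:.2f} arrest={arr:.2f}")
```

Positions without coverage are reported with both rates set to zero here; a real pipeline would more likely flag or filter such positions before machine learning.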

Keywords: RT-signature, Watson-Crick face, Galaxy Platform, RNA modifications, machine learning, m1A

Received: 17 Jun 2019; Accepted: 21 Aug 2019.

Copyright: © 2019 Schmidt, Werner, Kemmer, Niebler, Ayadi, Johe, Marchand, Schirmeister, Motorin, Hildebrandt, Schmidt and Helm. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence:
Prof. Andreas Hildebrandt, Institute of Computer Science, Faculty of Physics, Mathematics and Computer Science, Johannes Gutenberg University Mainz, Mainz, 55128, Rhineland-Palatinate, Germany, Andreas.Hildebrandt@uni-mainz.de
Prof. Bertil Schmidt, Institute of Computer Science, Faculty of Physics, Mathematics and Computer Science, Johannes Gutenberg University Mainz, Mainz, 55128, Rhineland-Palatinate, Germany, bertil.schmidt@uni-mainz.de
Prof. Mark Helm, Institute of Pharmacy and Biochemistry - Therapeutic Life Sciences, Faculty of Chemistry, Pharmacy and Geosciences, Johannes Gutenberg University Mainz, Mainz, 55128, Rhineland-Palatinate, Germany, mhelm@uni-mainz.de