Graphical Workflow System for Modification Calling by Machine Learning of Reverse Transcription Signatures

Lukas Schmidt, Stephan Werner, Thomas Kemmer, Stefan Niebler, Marco Kristen, Lilia Ayadi, Patrick Johe, Virginie Marchand, Tanja Schirmeister, Yuri Motorin, Andreas Hildebrandt, Bertil Schmidt, Mark Helm

subject

0301 basic medicine; 03 medical and health sciences; 0302 clinical medicine; 030104 developmental biology; 030220 oncology & carcinogenesis; lcsh:QH426-470; lcsh:Genetics; Genetics; Genetics (clinical); Molecular Medicine; Epitranscriptomics; RNA modifications; m1A; RT signature; Watson–Crick face; Machine learning; Artificial intelligence; Computer science; Galaxy platform; Workflow; Automation; Visualization; Trimming; Technology and Code; Downstream (software development); Field (computer science); Principal (computer security); ComputingMilieux_MISCELLANEOUS; ComputingMethodologies_PATTERNRECOGNITION; [SDV.BBM.BM] Life Sciences [q-bio]/Biochemistry Molecular Biology/Molecular biology; [SDV.BBM.GTP] Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]

description

Modification mapping from cDNA data has become a tremendously important approach in epitranscriptomics. So-called reverse transcription signatures in cDNA contain information on the position and nature of their causative RNA modifications. Data mining of high-throughput sequencing data (e.g., Illumina-based) is therefore rapidly growing in importance, yet the field still lacks effective tools. Here we present a versatile, user-friendly graphical workflow system for modification calling based on machine learning. The workflow commences with a principal module for trimming, mapping, and postprocessing. The postprocessing step includes a quantification of mismatch and arrest rates at single-nucleotide resolution across the mapped transcriptome. Further downstream modules include tools for visualization, machine learning, and modification calling. The machine-learning module provides quality assessment parameters to gauge the suitability of the initial dataset for effective machine learning and modification calling. This output is useful for improving the experimental parameters for library preparation and sequencing. In summary, the automation of the bioinformatics workflow allows a faster turnaround of the optimization cycles in modification calling.
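
To make the per-position quantification in the postprocessing step concrete, the following is a minimal sketch rather than the workflow's actual code. It assumes that the mismatch rate at a position is the fraction of covering reads whose base differs from the reference, and that the arrest rate is the fraction of reads covering the neighbouring downstream position that terminate there instead of extending across the position of interest. The function names and the input layout (`base_counts`, `ends_next`, `coverage_next`) are hypothetical and chosen for illustration only.

```python
def mismatch_rate(ref_base, base_counts):
    """Fraction of reads at one position whose base differs from the reference.

    base_counts maps each observed base to its read count (hypothetical layout).
    """
    coverage = sum(base_counts.values())
    if coverage == 0:
        return 0.0  # no reads, so no evidence of mismatches
    mismatches = coverage - base_counts.get(ref_base, 0)
    return mismatches / coverage


def arrest_rate(ends_next, coverage_next):
    """Assumed definition: share of reads covering the adjacent downstream
    position that stop there and do not extend across the position of interest.
    """
    if coverage_next == 0:
        return 0.0
    return ends_next / coverage_next


def rt_signature(ref_base, base_counts, ends_next, coverage_next):
    """Collect both rates into one feature record for a single position."""
    return {
        "mismatch_rate": mismatch_rate(ref_base, base_counts),
        "arrest_rate": arrest_rate(ends_next, coverage_next),
    }


if __name__ == "__main__":
    # Toy counts, not real data: 100 reads over a reference A, 40 of them non-A.
    print(rt_signature("A", {"A": 60, "T": 30, "G": 10}, ends_next=25, coverage_next=100))
```

Records of this kind, computed for every transcript position, are the sort of feature vectors that a downstream classifier could be trained on; which features the published workflow actually uses is not specified in this record.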

year: 2019-09-25

DOI: 10.3389/fgene.2019.00876
https://hal.univ-lorraine.fr/hal-02317684