
AUTHOR

Thomas Kemmer

0000-0003-1180-746X


Machine learning of reverse transcription signatures of variegated polymerases allows mapping and discrimination of methylated purines in limited tra…

2020

Reverse transcription (RT) of RNA templates containing RNA modifications leads to synthesis of cDNA containing information on the modification in the form of misincorporation, arrest, or nucleotide skipping events. A compilation of such events from multiple cDNAs represents an RT-signature that is typical for a given modification, but, as we show here, depends also on the reverse transcriptase enzyme. A comparison of 13 different enzymes revealed a range of RT-signatures, with individual enzymes exhibiting average arrest rates between 20 and 75%, as well as average misincorporation rates between 30 and 75% in the read-through cDNA. Using RT-signatures from individual enzymes to trai…

Adenosine; AcademicSubjects/SCI00010; Machine learning; [SDV.BBM.BM] Life Sciences [q-bio]/Biochemistry Molecular Biology/Molecular biology; Methylation; 03 medical and health sciences; 0302 clinical medicine; Complementary DNA; [SDV.BBM.GTP] Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; Genetics; Molecular Biology; Polymerase; 030304 developmental biology; 0303 health sciences; Oligoribonucleotides; Guanosine; RNA-Directed DNA Polymerase; RNA; Reverse Transcription; Reverse transcriptase; Enzyme; Transfer RNA; Artificial intelligence; Transcriptome; 030217 neurology & neurosurgery

Graphical Workflow System for Modification Calling by Machine Learning of Reverse Transcription Signatures

2019

Modification mapping from cDNA data has become a tremendously important approach in epitranscriptomics. So-called reverse transcription signatures in cDNA contain information on the position and nature of their causative RNA modifications. Data mining of, e.g. Illumina-based high-throughput sequencing data, is therefore fast growing in importance, and the field is still lacking effective tools. Here we present a versatile user-friendly graphical workflow system for modification calling based on machine learning. The workflow commences with a principal module for trimming, mapping, and postprocessing. The latter includes a quantification of mismatch and arrest rates with single-nucleotide re…

0301 basic medicine; lcsh:QH426-470; Downstream (software development); Computer science; RT signature; Machine learning; [SDV.BBM.BM] Life Sciences [q-bio]/Biochemistry Molecular Biology/Molecular biology; Field (computer science); m1A; 03 medical and health sciences; RNA modifications; 0302 clinical medicine; Epitranscriptomics; [SDV.BBM.GTP] Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; Genetics; Technology and Code; Galaxy platform; Genetics (clinical); ComputingMilieux_MISCELLANEOUS; Principal (computer security); Automation; Watson–Crick face; Visualization; lcsh:Genetics; ComputingMethodologies_PATTERNRECOGNITION; 030104 developmental biology; Workflow; 030220 oncology & carcinogenesis; Molecular Medicine; Trimming; Artificial intelligence

NOseq: amplicon sequencing evaluation method for RNA m6A sites after chemical deamination

2020

Methods for the detection of m6A by RNA-Seq technologies are increasingly sought after. We here present NOseq, a method to detect m6A residues in defined amplicons by virtue of their resistance to chemical deamination, effected by nitrous acid. Partial deamination in NOseq affects all exocyclic amino groups present in nucleobases and thus also changes sequence information. The method uses a mapping algorithm specifically adapted to the sequence degeneration caused by deamination events. Thus, m6A sites with partial modification levels of ∼50% were detected in defined amplicons, and this threshold can be lowered to ∼10% by combination with m6A immunoprecipitation. NOseq faithfully d…

Adenosine; Sequence analysis; AcademicSubjects/SCI00010; Bisulfite sequencing; Deamination; Adenosine/analogs & derivatives; Adenosine/analysis; Algorithms; Animals; Chromatography Liquid; Drosophila melanogaster/genetics; HEK293 Cells; HeLa Cells; High-Throughput Nucleotide Sequencing/methods; Humans; RNA/chemistry; RNA Long Noncoding/chemistry; RNA Messenger/chemistry; RNA Ribosomal 18S/chemistry; Sequence Alignment; Sequence Analysis RNA/methods; Tandem Mass Spectrometry; Computational biology; Biology; 010402 general chemistry; [SDV.BBM.BM] Life Sciences [q-bio]/Biochemistry Molecular Biology/Molecular biology; 01 natural sciences; Transcriptome; 03 medical and health sciences; Narese/13; [SDV.BBM.GTP] Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; Genetics; RNA Ribosomal 18S; RNA Messenger; ComputingMilieux_MISCELLANEOUS; 030304 developmental biology; 0303 health sciences; Sequence Analysis RNA; RNA; High-Throughput Nucleotide Sequencing; Amplicon; Ribosomal RNA; 0104 chemical sciences; Drosophila melanogaster; Methods Online; RNA Long Noncoding

CorCast: A Distributed Architecture for Bayesian Epidemic Nowcasting and its Application to District-Level SARS-CoV-2 Infection Numbers in Germany

2021

Timely information on current infection numbers during an epidemic is of crucial importance for decision makers in politics, medicine, and businesses. As information about local infection risk can guide public policy as well as individual behavior, such as the wearing of personal protective equipment or voluntary social distancing, statistical models providing such insights should be transparent and reproducible as well as accurate. Fulfilling these requirements is drastically complicated by the large amounts of data generated during exponential growth of infection numbers, and by the complexity of common inference pipelines. Here, we present CorCast – a stable and scalable distributed arch…

Estimation; Nowcasting; Computer science; Pandemic; Bayesian probability; Inference; Public policy; Statistical model; Data science; Personal protective equipment

Locality-sensitive hashing enables signal classification in high-throughput mass spectrometry raw data at scale

2021

Mass spectrometry is an important experimental technique in the field of proteomics. However, analysis of certain mass spectrometry data faces a combination of two challenges: First, even a single experiment produces a large amount of multi-dimensional raw data and, second, signals of interest are not single peaks but patterns of peaks that span along the different dimensions. The rapidly growing amount of mass spectrometry data increases the demand for scalable solutions. Existing approaches for signal detection are usually not well suited for processing large amounts of data in parallel or rely on strong assumptions concerning the signals' properties. In this study, it is shown that locali…

Computer science; Scalability; Hash function; Pattern recognition; Detection theory; Artificial intelligence; Mass spectrometry; Raw data; Thresholding; Synthetic data; Locality-sensitive hashing

NESSie.jl – Efficient and intuitive finite element and boundary element methods for nonlocal protein electrostatics in the Julia language

2018

The development of scientific software can be generally characterized by an initial phase of rapid prototyping and the subsequent transition to computationally efficient production code. Unfortunately, most programming languages are not well-suited for both tasks at the same time, commonly resulting in a considerable extension of the development time. The cross-platform and open-source Julia language aims at closing the gap between prototype and production code by providing a usability comparable to Python or MATLAB alongside high-performance capabilities known from C and C++ in a single programming language. In this paper, we present efficient protein electrostatics computations a…

0301 basic medicine; Rapid prototyping; General Computer Science; Computer science; Computation; Usability; Python (programming language); Finite element method; Theoretical Computer Science; NESSIE; Computational science; 03 medical and health sciences; 030104 developmental biology; Modeling and Simulation; MATLAB; Boundary element method; Journal of Computational Science