Search results for "Theoretical Computer Science"

showing 10 items of 1151 documents

Discovering Differential Equations from Earth Observation Data

2020

Modeling and understanding the Earth system is a constant and challenging scientific endeavour. When a clear mechanistic model is unavailable, complex or uncertain, learning from data can be an alternative. While machine learning has provided excellent methods for detection and retrieval, understanding the governing equations of the system from observational data seems an elusive problem. In this paper we introduce sparse regression to uncover a set of governing equations in the form of a system of ordinary differential equations (ODEs). The presented method is used to explicitly describe variable relations by identifying the most expressive and simplest ODEs explaining data to model releva…

0301 basic medicine; Earth observation; Theoretical computer science; Computer science; Differential equation; Ode; 020206 networking & telecommunications; 02 engineering and technology; Data modeling; 03 medical and health sciences; 030104 developmental biology; Ordinary differential equation; 0202 electrical engineering electronic engineering information engineering; Constant (mathematics); Variable (mathematics)
IGARSS 2020 - 2020 IEEE International Geoscience and Remote Sensing Symposium
researchProduct

The colored longest common prefix array computed via sequential scans

2018

Due to the increased availability of large datasets of biological sequences, tools for sequence comparison now rely on efficient alignment-free approaches to a greater extent. Most alignment-free approaches require the computation of statistics of the sequences in the dataset. Such computations become impractical in internal memory when very large collections of long sequences are considered. In this paper, we present a new conceptual data structure, the colored longest common prefix array (cLCP), that makes it possible to efficiently tackle several problems with an alignment-free approach. In fact, we show that such a data structure can be computed via sequential scans in semi-exter…

0301 basic medicine; FOS: Computer and information sciences; Alignment-free methods; Burrows–Wheeler transform; Computer science; Computation; Average common substring; 0206 medical engineering; Matching statistics; Scale (descriptive set theory); 02 engineering and technology; Theoretical Computer Science; 03 medical and health sciences; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Burrows-wheeler transform; String (computer science); Computer Science (all); LCP array; Matching statistic; Data structure; Substring; 030104 developmental biology; Pairwise comparison; Longest common prefix; Algorithm; 020602 bioinformatics; Alignment-free method
researchProduct

Alignment-free sequence comparison using absent words

2018

Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realised by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as q-gram distance, are usually computed in time linear with respect to the length of the sequences. In this paper, we focus on the complementary idea: how two sequences can be efficiently compared based on information that does not occur in the sequences. A word is an absent word of some sequence if it does not oc…

0301 basic medicine; FOS: Computer and information sciences; Formal Languages and Automata Theory (cs.FL); Computer Science - Formal Languages and Automata Theory; Sequence alignment; Information System; 0102 computer and information sciences; Circular word; Absent words; 01 natural sciences; Upper and lower bounds; Sequence comparison; Theoretical Computer Science; Combinatorics; 03 medical and health sciences; Computer Science - Data Structures and Algorithms; Data Structures and Algorithms (cs.DS); Absent word; Circular words; Mathematics; Sequence; Settore INF/01 - Informatica; Process (computing); q-gram; Computer Science Applications; 1707 Computer Vision and Pattern Recognition; q-grams; Composition (combinatorics); 030104 developmental biology; Computational Theory and Mathematics; Forbidden words; 010201 computation theory & mathematics; Focus (optics); Forbidden word; Word (computer architecture); Information Systems; Integer (computer science)
researchProduct

Measuring the clustering effect of BWT via RLE

2017

The Burrows–Wheeler Transform (BWT) is a reversible transformation that underlies several text compressors and many other tools used in Bioinformatics and Computational Biology. The BWT is not actually a compressor, but a transformation that performs a context-dependent permutation of the letters of the input text and often creates runs of equal letters (clusters) longer than those in the original text, usually referred to as the “clustering effect” of BWT. In particular, from a combinatorial point of view, great attention has been given to the case in which the BWT produces the fewest clusters (cf. [5], [16], [21], [23]). In this paper we are concerned about t…

0301 basic medicine; General Computer Science; Permutation; Computer Science (all); Binary number; 0102 computer and information sciences; Quantitative Biology::Genomics; 01 natural sciences; Upper and lower bounds; Theoretical Computer Science; Combinatorics; 03 medical and health sciences; 030104 developmental biology; Transformation (function); BWT; 010201 computation theory & mathematics; Run-length encoding; Computer Science::Data Structures and Algorithms; Cluster analysis; Primitive root modulo n; Word (computer architecture); Mathematics
researchProduct

Linear-time sequence comparison using minimal absent words & applications

2016

Sequence comparison is a prerequisite to virtually all comparative genomic analyses. It is often realized by sequence alignment techniques, which are computationally expensive. This has led to increased research into alignment-free techniques, which are based on measures referring to the composition of sequences in terms of their constituent patterns. These measures, such as q-gram distance, are usually computed in time linear with respect to the length of the sequences. In this article, we focus on the complementary idea: how two sequences can be efficiently compared based on information that does not occur in the sequences. A word is an absent word of some sequence if it does not occur in…

0301 basic medicine; Latin Americans; Computer Science (all); Library science; 0102 computer and information sciences; Circular word; Algorithms on string; 01 natural sciences; Alignment-free comparison; Sequence comparison; Theoretical Computer Science; 03 medical and health sciences; 030104 developmental biology; 010201 computation theory & mathematics; Informatics; Political science; Absent word; Forbidden word
researchProduct

Centrality in Complex Networks with Overlapping Community Structure

2019

Identifying influential spreaders in networks is essential in order to prevent epidemic spreading or to accelerate information diffusion. Several centrality measures take advantage of various network topological properties to quantify the notion of influence. However, the vast majority of works ignore the network's community structure, although it is one of the main features of many real-world networks. In a recent study, we show that the centrality of a node in a network with non-overlapping communities depends on two features: its local influence on the nodes belonging to its community, and its global influence on the nodes belonging to the other communities. Using global and local co…

0301 basic medicine; Multidisciplinary; Theoretical computer science; Social network; business.industry; Computer science; Science; Q; R; Community structure; [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]; Complex network; Applied mathematics; Article; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Node (computer science); Medicine; business; Epidemic model; Centrality; 030217 neurology & neurosurgery
Scientific Reports
researchProduct

NESSie.jl – Efficient and intuitive finite element and boundary element methods for nonlocal protein electrostatics in the Julia language

2018

The development of scientific software can generally be characterized by an initial phase of rapid prototyping and a subsequent transition to computationally efficient production code. Unfortunately, most programming languages are not well-suited for both tasks at the same time, which commonly extends the development time considerably. The cross-platform and open-source Julia language aims at closing the gap between prototype and production code by providing a usability comparable to Python or MATLAB alongside high-performance capabilities known from C and C++ in a single programming language. In this paper, we present efficient protein electrostatics computations a…

0301 basic medicine; Rapid prototyping; General Computer Science; business.industry; Computer science; Computation; Usability; Python (programming language); Finite element method; Theoretical Computer Science; NESSIE; Computational science; 03 medical and health sciences; 030104 developmental biology; Modeling and Simulation; business; MATLAB; Boundary element method; computer; computer.programming_language
Journal of Computational Science
researchProduct

parSRA: A framework for the parallel execution of short read aligners on compute clusters

2018

The growth of next-generation sequencing datasets poses a challenge to the alignment of reads to reference genomes in terms of both accuracy and speed. In this work we present parSRA, a parallel framework to accelerate the execution of existing short read aligners on distributed-memory systems. parSRA can be used to parallelize a variety of short read alignment tools installed in the system without any modification to their source code. We show that our framework provides good scalability on a compute cluster for accelerating the popular BWA-MEM and Bowtie2 aligners. On average, it is able to accelerate sequence alignments on 16 64-core nodes (in total, 1024 cores) with speedup of 10.48 …

0301 basic medicine; Source code; Speedup; General Computer Science; Computer science; media_common.quotation_subject; Parallel computing; Supercomputer; Theoretical Computer Science; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; 030220 oncology & carcinogenesis; Modeling and Simulation; Computer cluster; Scalability; Fuse (electrical); Node (circuits); Partitioned global address space; media_common
Journal of Computational Science
researchProduct

CUDA-enabled hierarchical ward clustering of protein structures based on the nearest neighbour chain algorithm

2015

Clustering of molecular systems according to their three-dimensional structure is an important step in many bioinformatics workflows. In applications such as docking or structure prediction, many algorithms initially generate large numbers of candidate poses (or decoys), which are then clustered to allow for subsequent computationally expensive evaluations of reasonable representatives. Since the number of such candidates can easily range from thousands to millions, performing the clustering on standard central processing units (CPUs) is highly time consuming. In this paper, we analyse and evaluate different approaches to parallelize the nearest neighbour chain algorithm to perform hierarc…

0301 basic medicine; Speedup; Computer science; Correlation clustering; Parallel computing; Theoretical Computer Science; 03 medical and health sciences; CUDA; 030104 developmental biology; Hardware and Architecture; Cluster analysis; Algorithm; Software; Ward's method
The International Journal of High Performance Computing Applications
researchProduct

Simulation-based estimation of branching models for LTR retrotransposons

2017

Motivation: LTR retrotransposons are mobile elements that are able, like retroviruses, to copy and move inside eukaryotic genomes. In the present work, we propose a branching model for studying the propagation of LTR retrotransposons in these genomes. This model allows us to take into account both the positions and the degradation level of LTR retrotransposon copies. In our model, the duplication rate is also allowed to vary with the degradation level. Results: Various functions have been implemented in order to simulate their spread, and visualization tools are proposed. Based on these simulation tools, we have developed a first method to evaluate the parameters of this propagation …

0301 basic medicine; Statistics and Probability; Source code; Theoretical computer science; Retroelements; media_common.quotation_subject; Retrotransposon; [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE]; Biology; Biochemistry; Genome; Chromosomes; Branching (linguistics); [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing; 03 medical and health sciences; [INFO.INFO-CR]Computer Science [cs]/Cryptography and Security [cs.CR]; Software; Animals; Computer Simulation; Molecular Biology; ComputingMilieux_MISCELLANEOUS; media_common; computer.programming_language; Genetics; Models, Genetic; business.industry; [SDV.BID.EVO]Life Sciences [q-bio]/Biodiversity/Populations and Evolution [q-bio.PE]; Python (programming language); [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; [INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation; Computer Science Applications; Visualization; Computational Mathematics; 030104 developmental biology; Drosophila melanogaster; Computational Theory and Mathematics; [INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA]; Programming Languages; [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET]; Mobile genetic elements; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; business; computer
researchProduct