Search results for "Theoretical Computer Science"

showing 10 items of 1151 documents

An effective extension of the applicability of alignment-free biological sequence comparison algorithms with Hadoop

2016

Alignment-free methods are one of the mainstays of biological sequence comparison, i.e., the assessment of how similar two biological sequences are to each other, a fundamental and routine task in computational biology and bioinformatics. They have gained popularity since, even on standard desktop machines, they are faster than methods based on alignments. However, with the advent of Next-Generation Sequencing Technologies, datasets whose size, i.e., number of sequences and their total length, is a challenge to the execution of alignment-free methods on those standard machines are quite common. Here, we propose the first paradigm for the computation of k-mer-based alignment-free methods for…

0301 basic medicine; Theoretical computer science; 030102 biochemistry & molecular biology; Settore INF/01 - Informatica; Computer science; Computation; Extension (predicate logic); Information System; Hash table; Distributed computing; Task (project management); 03 medical and health sciences; 030104 developmental biology; Alignment-free sequence comparison and analysis; Hadoop; Hardware and Architecture; MapReduce; Software; Information Systems; Sequence comparison
researchProduct

Parallel and Space-Efficient Construction of Burrows-Wheeler Transform and Suffix Array for Big Genome Data

2016

Next-generation sequencing technologies have led to the sequencing of more and more genomes, propelling related research into the era of big data. In this paper, we present ParaBWT, a parallelized Burrows-Wheeler transform (BWT) and suffix array construction algorithm for big genome data. In ParaBWT, we have investigated a progressive construction approach to constructing the BWT of single genome sequences in linear space complexity, but with a small constant factor. This approach has been further parallelized using multi-threading based on a master-slave coprocessing model. After gaining the BWT, the suffix array is constructed in a memory-efficient manner. The performance of ParaBWT has b…

0301 basic medicine; Theoretical computer science; Burrows–Wheeler transform; Computer science; Genomics; Data_CODINGANDINFORMATIONTHEORY; Parallel computing; Genome; law.invention; 03 medical and health sciences; law; Genetics; Humans; Ensembl; Multi-core processor; Applied Mathematics; Linear space; Suffix array; Chromosome Mapping; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; 030104 developmental biology; Algorithms; Biotechnology; Reference genome; IEEE/ACM Transactions on Computational Biology and Bioinformatics
researchProduct

QuBiLS-MAS, open source multi-platform software for atom- and bond-based topological (2D) and chiral (2.5D) algebraic molecular descriptors computati…

2017

Background: In previous reports, Marrero-Ponce et al. proposed algebraic formalisms for characterizing topological (2D) and chiral (2.5D) molecular features through atom- and bond-based ToMoCoMD-CARDD (acronym for Topological Molecular Computational Design-Computer Aided Rational Drug Design) molecular descriptors. These MDs codify molecular information based on the bilinear, quadratic and linear algebraic forms and the graph-theoretical electronic-density and edge-adjacency matrices in order to consider atom- and bond-based relations, respectively. These MDs have been successfully applied in the screening of chemical compounds of different therapeutic applications ranging from antimalarials…

0301 basic medicine; Theoretical computer science; Computer science; Bilinear interpolation; Library and Information Sciences; Topology; Linear; 01 natural sciences; lcsh:Chemistry; ToMoCoMD-CARDD; Double stochastic; 03 medical and health sciences; Matrix (mathematics); Software; Quadratic equation; Molecular descriptor; Atom/bond-based molecular descriptor; Physical and Theoretical Chemistry; Algebraic number; Simple stochastic; Free and open source software; lcsh:T58.5-58.64; lcsh:Information technology; business.industry; QSAR; Mutual probability matrices; Computer Graphics and Computer-Aided Design; Rotation formalisms in three dimensions; 0104 chemical sciences; Computer Science Applications; 010404 medicinal & biomolecular chemistry; 030104 developmental biology; lcsh:QD1-999; Cheminformatics; Bilinear and quadratic indices; business; Non-stochastic; QuBiLS-MAS; Journal of cheminformatics
researchProduct

Identification of control targets in Boolean molecular network models via computational algebra

2015

Motivation: Many problems in biomedicine and other areas of the life sciences can be characterized as control problems, with the goal of finding strategies to change a disease or otherwise undesirable state of a biological system into another, more desirable, state through an intervention, such as a drug or other therapeutic treatment. The identification of such strategies is typically based on a mathematical model of the process to be altered through targeted control inputs. This paper focuses on processes at the molecular level that determine the state of an individual cell, involving signaling or gene regulation. The mathematical model type considered is that of Boolean networks. The pot…

0301 basic medicine; Theoretical computer science; Computer science; Process (engineering); Molecular Networks (q-bio.MN); Systems biology; System of polynomial equations; ENCODE; Boolean networks; Set (abstract data type); 03 medical and health sciences; 0302 clinical medicine; Structural Biology; Modelling and Simulation; Quantitative Biology - Molecular Networks; Molecular Biology; Edge deletions; Applied Mathematics; Computer Science Applications; Network control; Identification (information); 030104 developmental biology; Boolean network; Blocking transitions; FOS: Biological sciences; Modeling and Simulation; Algebraic control; State (computer science); 030217 neurology & neurosurgery; Research Article; BMC Systems Biology
researchProduct

Ultra-Fast Detection of Higher-Order Epistatic Interactions on GPUs

2017

Detecting higher-order epistatic interactions in Genome-Wide Association Studies (GWAS) remains a challenging task in the fields of genetic epidemiology and computer science. A number of algorithms have recently been proposed for epistasis discovery. However, they suffer from a high computational cost since statistical measures have to be evaluated for each possible combination of markers. Hence, many algorithms use additional filtering stages discarding potentially non-interacting markers in order to reduce the overall number of combinations to be examined. Among others, Mutual Information Clustering (MIC) is a common pre-processing filter for grouping markers into partitions using K-Means…

0301 basic medicine; Theoretical computer science; Computer science; business.industry; Contrast (statistics); Genome-wide association study; 02 engineering and technology; Mutual information; Machine learning; computer.software_genre; Reduction (complexity); 03 medical and health sciences; 030104 developmental biology; Genetic epidemiology; 0202 electrical engineering electronic engineering information engineering; Epistasis; 020201 artificial intelligence & image processing; Artificial intelligence; Cluster analysis; business; computer; Genetic association
researchProduct

A detailed experimental study of a DNA computer with two endonucleases

2017

Great advances in biotechnology have allowed the construction of a computer from DNA. One of the proposed solutions is a biomolecular finite automaton, a simple two-state DNA computer without memory, which was presented by Ehud Shapiro’s group at the Weizmann Institute of Science. The main problem with this computer, in which biomolecules carry out logical operations, is its complexity – increasing the number of states of biomolecular automata. In this study, we constructed (in laboratory conditions) a six-state DNA computer that uses two endonucleases (e.g. AcuI and BbvI) and a ligase. We have presented a detailed experimental verification of its feasibility. We described the effe…

0301 basic medicine; Theoretical computer science; DNA Ligases; Computer science; Carry (arithmetic); Oligonucleotides; 0102 computer and information sciences; Bioinformatics; 01 natural sciences; General Biochemistry Genetics and Molecular Biology; law.invention; Automation; Computers Molecular; 03 medical and health sciences; DNA computing; law; A-DNA; Deoxyribonucleases Type II Site-Specific; chemistry.chemical_classification; DNA ligase; Finite-state machine; Base Sequence; biomolecular computers; finite automata; Process (computing); DNA; Models Theoretical; Endonucleases; Automaton; 030104 developmental biology; chemistry; 010201 computation theory & mathematics; Word (computer architecture); Zeitschrift für Naturforschung C
researchProduct

Biomolecular computers with multiple restriction enzymes

2017

The development of conventional, silicon-based computers has several limitations, including some related to the Heisenberg uncertainty principle and the von Neumann “bottleneck”. Biomolecular computers based on DNA and proteins are largely free of these disadvantages and, along with quantum computers, are reasonable alternatives to their conventional counterparts in some applications. The idea of a DNA computer proposed by Ehud Shapiro’s group at the Weizmann Institute of Science was developed using one restriction enzyme as hardware and DNA fragments (the transition molecules) as software and input/output signals. This computer represented a two-state two-symbol finite automaton t…

0301 basic medicine; Theoretical computer science; DNA computer; lcsh:QH426-470; 0102 computer and information sciences; Biology; 01 natural sciences; law.invention; restriction enzymes; Genomics and Bioinformatics; 03 medical and health sciences; symbols.namesake; Software; DNA computing; law; Genetics; Nondeterministic finite automaton; Molecular Biology; Quantum computer; Finite-state machine; business.industry; Construct (python library); bioinformatics; DNA; Restriction enzyme; lcsh:Genetics; 030104 developmental biology; 010201 computation theory & mathematics; symbols; business; Von Neumann architecture; Genetics and Molecular Biology
researchProduct

Accelerating metagenomic read classification on CUDA-enabled GPUs.

2016

Metagenomic sequencing studies are becoming increasingly popular with prominent examples including the sequencing of human microbiomes and diverse environments. A fundamental computational problem in this context is read classification; i.e. the assignment of each read to a taxonomic label. Due to the large number of reads produced by modern high-throughput sequencing technologies and the rapidly increasing number of available reference genomes, software tools for fast and accurate metagenomic read classification are urgently needed. We present cuCLARK, a read-level classifier for CUDA-enabled GPUs, based on the fast and accurate classification of metagenomic sequences using reduced k-mers (…

0301 basic medicine; Theoretical computer science; Workstation; GPUs; Computer science; Context (language use); CUDA; Parallel computing; Biochemistry; Genome; law.invention; 03 medical and health sciences; User-Computer Interface; 0302 clinical medicine; Structural Biology; law; Taxonomic assignment; Humans; Microbiome; Molecular Biology; Internet; Xeon; Applied Mathematics; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; Exact k-mer matching; Computer Science Applications; 030104 developmental biology; Titan (supercomputer); Metagenomics; 030220 oncology & carcinogenesis; DNA microarray; Software; BMC bioinformatics
researchProduct

An Integrative Framework for the Construction of Big Functional Networks

2018

We present a methodology for biological data integration, aiming at building and analysing large functional networks which model complex genotype-phenotype associations. A functional network is a graph where nodes represent cellular components (e.g., genes, proteins, mRNA, etc.) and edges represent associations among such molecules. Different types of components may coexist in the same network, and associations may be related to physical/biochemical interactions or functional/phenotypic relationships. Due to both the large amount of involved information and the computational complexity typical of the problems in this domain, the proposed framework is based on big data technologies (Spark a…

0301 basic medicine; biological network; Biological data; Theoretical computer science; Settore INF/01 - Informatica; Computational complexity theory; Computer science; business.industry; Big data; NoSQL; computer.software_genre; Functional networks; 03 medical and health sciences; 030104 developmental biology; Graph (abstract data type); big data technologies; business; computer; Integrative approaches; 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)
researchProduct

DNA combinatorial messages and Epigenomics: The case of chromatin organization and nucleosome occupancy in eukaryotic genomes

2019

Epigenomics is the study of modifications on the genetic material of a cell that do not depend on changes in the DNA sequence, since those latter involve specific proteins around which DNA wraps. The end result is that Epigenomic changes have a fundamental role in the proper working of each cell in Eukaryotic organisms. A particularly important part of Epigenomics concentrates on the study of chromatin, that is, a fiber composed of a DNA-protein complex that is highly characteristic of Eukaryotes. Understanding how chromatin is assembled and how it changes is fundamental for Biology. In more than thirty years of research in this area, Mathematics and Theoretical Computer Science have gai…

0303 health sciences; Settore INF/01 - Informatica; General Computer Science; Fiber (mathematics); 0102 computer and information sciences; Computational biology; 01 natural sciences; Nucleosome occupancy; Genome; DNA sequencing; Chromatin; 03 medical and health sciences; chemistry.chemical_compound; chemistry; 010201 computation theory & mathematics; Computer Science; Algorithms and complexity; Formal language; A fibers; DNA; Combinatorics on words; 030304 developmental biology; Epigenomics; Theoretical Computer Science
researchProduct