Search results for "Computer Science Applications"

showing 10 items of 3993 documents

The latent geometry of the human protein interaction network

2017

Abstract Motivation: A series of recently introduced algorithms and models advocates for the existence of a hyperbolic geometry underlying the network representation of complex systems. Since the human protein interaction network (hPIN) has a complex architecture, we hypothesized that uncovering its latent geometry could ease challenging problems in systems biology, translating them into measuring distances between proteins. Results: We embedded the hPIN into hyperbolic space and found that the inferred coordinates of nodes capture biologically relevant features, like protein age, function and cellular localization. This means that the representation of the hPIN in the two-dimensional hyperboli…

0301 basic medicine; Statistics and Probability; Geometric analysis; Computer science; Hyperbolic geometry; Systems biology; Complex system; Context (language use); Geometry; Biochemistry; Protein–protein interaction; 03 medical and health sciences; Interaction network; Humans; Protein Interaction Maps; Representation (mathematics); Cluster analysis; Molecular Biology; Systems Biology; Hyperbolic space; Proteins; Function (mathematics); Original Papers; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Embedding; Signal transduction; Algorithms; Signal Transduction

researchProduct

panISa: ab initio detection of insertion sequences in bacterial genomes from short read sequence data.

2018

Abstract Motivation: The advent of next-generation sequencing has boosted the analysis of bacterial genome evolution. Insertion sequence (IS) elements play a key role in prokaryotic genome organization and evolution, but their repetitions in genomes complicate their detection from short-read data. Results: PanISa is a software pipeline that identifies IS insertions ab initio in bacterial genomes from short-read data. It is a highly sensitive and precise tool based on the detection of read-mapping patterns at the insertion site. PanISa performs better than existing IS detection systems as it is based on a database-free approach. We applied it to a high-risk clone lineage of the pathogenic spec…

0301 basic medicine; Statistics and Probability; Lineage (genetic); Computer science; Ab initio; Computational biology; Bacterial genome size; [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE]; Biochemistry; Genome; [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing; 03 medical and health sciences; [INFO.INFO-CR]Computer Science [cs]/Cryptography and Security [cs.CR]; [SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry Molecular Biology/Genomics [q-bio.GN]; Insertion sequence; Molecular Biology; Genomic organization; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; [SDV.MP.BAC]Life Sciences [q-bio]/Microbiology and Parasitology/Bacteriology; Pipeline (software); [INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; [INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA]; DNA Transposable Elements; [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET]; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; Genome Bacterial; Software; Bioinformatics (Oxford, England)

researchProduct

The intrinsic combinatorial organization and information theoretic content of a sequence are correlated to the DNA encoded nucleosome organization of…

2015

Abstract Motivation: Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions about the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. Results: We contribute to closing this important methodological gap …

0301 basic medicine; Statistics and Probability; Nucleosome organization; Computational biology; Biology; Type (model theory); Biochemistry; Genome; DNA sequencing; 03 medical and health sciences; Computational Theory and Mathematic; Nucleosome; Molecular Biology; Sequence (medicine); Genetics; Genome; Settore INF/01 - Informatica; Eukaryota; Computer Science Applications1707 Computer Vision and Pattern Recognition; Statistical model; DNA; Chromatin; Nucleosomes; Computer Science Applications; Chromatin; Settore BIO/18 - Genetica; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Computational Mathematic; Bioinformatics

researchProduct

Reference genome assessment from a population scale perspective: an accurate profile of variability and noise.

2017

Abstract Motivation: Current plant and animal genomic studies are often based on newly assembled genomes that have not been properly consolidated. In this scenario, misassembled regions can easily lead to false-positive findings. Although quality control scores are included within genotyping protocols, they are usually employed to evaluate individual sample quality rather than reference sequence reliability. We propose a statistical model that combines quality control scores across samples in order to detect incongruent patterns at every genomic region. Our model is inherently robust since common artifact signals are expected to be shared between independent samples over misassembled regions …

0301 basic medicine; Statistics and Probability; Quality Control; Genotype; Computer science; media_common.quotation_subject; Population; Genomics; Bioinformatics; computer.software_genre; Biochemistry; Genome; 03 medical and health sciences; Genetic variation; Animals; Humans; Quality (business); Allele; education; Molecular Biology; Genotyping; Reliability (statistics); media_common; Protocol (science); education.field_of_study; Genome; Models Statistical; Genetic Variation; Reproducibility of Results; Genomics; Genome Analysis; Original Papers; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Data mining; computer; Software; Reference genome

researchProduct

dAPE: a web server to detect homorepeats and follow their evolution.

2016

Abstract Summary: Homorepeats are low complexity regions consisting of repetitions of a single amino acid residue. There is no current consensus on the minimum number of residues needed to define a functional homorepeat, or even on whether mismatches are allowed. Here we present dAPE, a web server that helps follow the evolution of homorepeats based on orthology information, using a sensitive but tunable cutoff to help in the identification of emerging homorepeats. Availability and Implementation: dAPE can be accessed from http://cbdm-01.zdv.uni-mainz.de/~munoz/polyx. Supplementary information: Supplementary data are available at Bioinformatics online.

0301 basic medicine; Statistics and Probability; Repetitive Sequences Amino Acid; Web server; Internet; Computer science; computer.software_genre; Biochemistry; Applications Notes; Computer Science Applications; World Wide Web; Evolution Molecular; 03 medical and health sciences; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Animals; Humans; Data mining; Molecular Biology; computer; Sequence Alignment; Sequence Analysis; Software; Bioinformatics (Oxford, England)

researchProduct

AFS: identification and quantification of species composition by metagenomic sequencing

2017

Abstract Summary: DNA-based methods to detect and quantify taxon composition in biological materials are often based on species-specific polymerase chain reaction, limited to detecting species targeted by the assay. Next-generation sequencing overcomes this drawback by untargeted shotgun sequencing of whole metagenomes at affordable cost. Here we present AFS, a software pipeline for quantification of species composition in food. AFS uses metagenomic shotgun sequencing and sequence read counting to infer species proportions. Using Illumina data from a reference sausage comprising four species, we reveal that AFS is independent of the sequencing assay and library preparation protocol. Cost-sav…

0301 basic medicine; Statistics and Probability; Sequence analysis; Library preparation; Computational biology; Biology; Bioinformatics; Biochemistry; law.invention; 03 medical and health sciences; 0404 agricultural biotechnology; law; Molecular Biology; Polymerase chain reaction; Shotgun sequencing; High-Throughput Nucleotide Sequencing; Sequence Analysis DNA; 04 agricultural and veterinary sciences; Accession number (bioinformatics); 040401 food science; Biological materials; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Metagenomics; Food Microbiology; Identification (biology); Metagenomics; Software; Bioinformatics

researchProduct

MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems

2016

This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record, Jorge González-Domínguez, Yongchao Liu, Juan Touriño, Bertil Schmidt; MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems, Bioinformatics, Volume 32, Issue 24, 15 December 2016, Pages 3826–3828, https://doi.org/10.1093/bioinformatics/btw558, is available online at: https://doi.org/10.1093/bioinformatics/btw558. [Abstract] MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-sca…

0301 basic medicine; Statistics and Probability; Source code; Computer science; media_common.quotation_subject; 02 engineering and technology; Parallel computing; computer.software_genre; Biochemistry; Execution time; 03 medical and health sciences; 0202 electrical engineering electronic engineering information engineering; Cluster (physics); Point (geometry); Amino Acid Sequence; Molecular Biology; media_common; Sequence; Multiple sequence alignment; Protein multiple sequence; Computational Biology; Proteins; Markov Chains; Computer Science Applications; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Distributed memory systems; MSAProbs; 020201 artificial intelligence & image processing; MPI; Data mining; Sequence Alignment; computer; Algorithms; Software

researchProduct

Simulation-based estimation of branching models for LTR retrotransposons

2017

Abstract Motivation: LTR retrotransposons are mobile elements that are able, like retroviruses, to copy and move inside eukaryotic genomes. In the present work, we propose a branching model for studying the propagation of LTR retrotransposons in these genomes. This model allows us to take into account both the positions and the degradation level of LTR retrotransposon copies. In our model, the duplication rate is also allowed to vary with the degradation level. Results: Various functions have been implemented in order to simulate their spread, and visualization tools are proposed. Based on these simulation tools, we have developed a first method to evaluate the parameters of this propagation …

0301 basic medicine; Statistics and Probability; Source code; Theoretical computer science; Retroelements; media_common.quotation_subject; Retrotransposon; [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE]; Biology; Biochemistry; Genome; Chromosomes; Branching (linguistics); [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing; 03 medical and health sciences; [INFO.INFO-CR]Computer Science [cs]/Cryptography and Security [cs.CR]; Software; Animals; Computer Simulation; Molecular Biology; ComputingMilieux_MISCELLANEOUS; media_common; computer.programming_language; Genetics; Genome; Models Genetic; business.industry; [SDV.BID.EVO]Life Sciences [q-bio]/Biodiversity/Populations and Evolution [q-bio.PE]; Python (programming language); [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; [INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation; Computer Science Applications; Visualization; Computational Mathematics; 030104 developmental biology; Drosophila melanogaster; Computational Theory and Mathematics; [INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA]; Programming Languages; [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET]; Mobile genetic elements; [INFO.INFO-DC]Computer Science [cs]/Distributed Parallel and Cluster Computing [cs.DC]; business; computer; Software

researchProduct

REGGAE : a novel approach for the identification of key transcriptional regulators

2019

Abstract Motivation: Transcriptional regulators play a major role in most biological processes. Alterations in their activities are associated with a variety of diseases and in particular with tumor development and progression. Hence, it is important to assess the effects of deregulated regulators on pathological processes. Results: Here, we present REGulator-Gene Association Enrichment (REGGAE), a novel method for the identification of key transcriptional regulators that have a significant effect on the expression of a given set of genes, e.g. genes that are differentially expressed between two sample groups. REGGAE uses a Kolmogorov–Smirnov-like test statistic that implicitly combines assoc…

0301 basic medicine; Statistics and Probability; Transcription Genetic; 610; Computational biology; Biology; Biochemistry; 03 medical and health sciences; Neoplasms; Humans; Two sample; Molecular Biology; Gene; Probability; Supplementary data; Regulation of gene expression; Systems Biology; 500; Original Papers; Computer Science Applications; 004; Computational Mathematics; 030104 developmental biology; Computational Theory and Mathematics; Gene Expression Regulation; Key (cryptography); Identification (biology); Female; Software

researchProduct

In vitro versus in vivo compositional landscapes of histone sequence preferences in eucaryotic genomes

2018

Abstract Motivation: Although the nucleosome occupancy along a genome can be in part predicted by in vitro experiments, it has been recently observed that the chromatin organization presents important differences in vitro with respect to in vivo. Such differences mainly regard the hierarchical and regular structures of the nucleosome fiber, whose existence has long been assumed, and in part also observed in vitro, but that does not apparently occur in vivo. It is also well known that the DNA sequence has a role in determining the nucleosome occupancy. Therefore, an important issue is to understand if, and to what extent, the structural differences in the chromatin organization between in vit…

0301 basic medicine; Statistics and Probability; ved/biology.organism_classification_rank.species; Computational biology; Saccharomyces cerevisiae; Genome; Biochemistry; DNA sequencing; Histones; 03 medical and health sciences; 0302 clinical medicine; In vivo; Computational Theory and Mathematic; Nucleosome; Animals; Model organism; Caenorhabditis elegans; Molecular Biology; Sequence (medicine); Genome; biology; Settore INF/01 - Informatica; ved/biology; Computer Science Application; Chromatin; Computer Science Applications; Chromatin; Nucleosomes; Computational Mathematics; 030104 developmental biology; Histone; Eukaryotic Cells; Computational Theory and Mathematics; biology.protein; Computer Vision and Pattern Recognition; Sequence Analysis; 030217 neurology & neurosurgery

researchProduct