Search results for "Computer Science Applications"

Showing 10 items of 3993 documents

A parallel and sensitive software tool for methylation analysis on multicore platforms.

2015

Abstract Motivation: DNA methylation analysis suffers from very long processing times, as the advent of Next-Generation Sequencers has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently with either the size of the dataset or the length of the reads to be analyzed. As it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. Results: We present a new software tool, called HPG-Methyl, which efficiently maps bis…

Statistics and Probability; Mutation rate; Time Factors; Computer science; Real-time computing; Bisulfite sequencing; Molecular Sequence Data; Genomics; Parallel computing; computer.software_genre; medicine.disease_cause; Biochemistry; Genome; Bottleneck; chemistry.chemical_compound; Software; Mutation Rate; Databases Genetic; medicine; Humans; Sulfites; Molecular Biology; Mutation; Multi-core processor; Genome; Base Sequence; business.industry; High-Throughput Nucleotide Sequencing; Methylation; Genomics; DNA Methylation; Original Papers; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; chemistry; DNA methylation; Scalability; Mutation; Compiler; business; computer; Sequence Analysis; DNA; Algorithms; Software; Bioinformatics (Oxford, England)
researchProduct

Spatial data of Ixodes ricinus instar abundance and nymph pathogen prevalence, Scandinavia, 2016-2017.

2020

Ticks carry pathogens that can cause disease in both animals and humans, and there is a need to monitor the distribution and abundance of ticks and the pathogens they carry to pinpoint potential high-risk areas for tick-borne disease transmission. In a joint Scandinavian study, we measured Ixodes ricinus instar abundance at 159 sites in southern Scandinavia in August-September, 2016, and collected 29,440 tick nymphs at 50 of these sites. We additionally measured abundance at 30 sites in August-September, 2017. We tested the 29,440 tick nymphs in pools of 10 in a Fluidigm real-time PCR chip to screen for 17 different tick-associated pathogens, 2 pathogen groups and 3 tick species. We present…

Statistics and Probability; Nymph; Ixodes ricinus; 030231 tropical medicine; Zoology; Library and Information Sciences; Tick; Scandinavian and Nordic Countries; Education; 03 medical and health sciences; 0302 clinical medicine; Abundance (ecology); parasitic diseases; Animals; Nymph; lcsh:Science; Author Correction; Pathogen; Ecosystem; Ecological epidemiology; 0303 health sciences; Ecology; biology; Ixodes; 030306 microbiology; biology.organism_classification; Computer Science Applications; Habitat; Instar; lcsh:Q; Statistics Probability and Uncertainty; Bacterial infection; Disease transmission; Entomology; Animal Distribution; Information Systems; VDP::Matematikk og Naturvitenskap: 400::Zoologiske og botaniske fag: 480; Scientific data
researchProduct

A non-linear optimization procedure to estimate distances and instantaneous substitution rate matrices under the GTR model.

2006

Abstract Motivation: The general-time-reversible (GTR) model is one of the most popular models of nucleotide substitution because it constitutes a good trade-off between mathematical tractability and biological reality. However, when it is applied for inferring evolutionary distances and/or instantaneous rate matrices, the GTR model seems more prone to inapplicability than more restrictive time-reversible models. Although it has been previously noted that the intractability is caused by the impossibility of computing the logarithm of a matrix characterised by negative eigenvalues, the issue has not been investigated further. Results: Here, we formally characterize the mathematic…

Statistics and Probability; Optimization problem; Base Pair Mismatch; Biochemistry; Linkage Disequilibrium; Nonlinear programming; Interpretation (model theory); Evolution Molecular; Applied mathematics; Computer Simulation; Divergence (statistics); Molecular Biology; Eigenvalues and eigenvectors; Phylogeny; Mathematics; Sequence; Models Genetic; Substitution (logic); Chromosome Mapping; Genetic Variation; Sequence Analysis DNA; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Nonlinear Dynamics; Logarithm of a matrix; Algorithm; Algorithms; Bioinformatics (Oxford, England)
researchProduct

Iterative Cluster Analysis of Protein Interaction Data

2004

Abstract Motivation: Generation of fast tools for hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. Results: We present in this work the program UVCLUSTER, which iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are consid…

Statistics and Probability; Saccharomyces cerevisiae Proteins; Computer science; computer.software_genre; Biochemistry; Interactome; Pattern Recognition Automated; Set (abstract data type); Protein Interaction Mapping; Cluster (physics); Cluster Analysis; Cluster analysis; Molecular Biology; Cytoskeleton; Measure (data warehouse); Gene Expression Profiling; Proteins; Actins; Computer Science Applications; Hierarchical clustering; Gene expression profiling; Computational Mathematics; Computational Theory and Mathematics; Pattern recognition (psychology); Benchmark (computing); Data mining; computer; Algorithms; Software; Signal Transduction; Bioinformatics
researchProduct

A web application for the unspecific detection of differentially expressed DNA regions in strand-specific expression data

2015

Abstract Genomic technologies allow laboratories to produce large-scale data sets, either through the use of next-generation sequencing or microarray platforms. To explore these data sets and obtain maximum value from the data, researchers view their results alongside all the known features of a given reference genome. To study transcriptional changes that occur under a given condition, researchers search for regions of the genome that are differentially expressed between different experimental conditions. To identify these regions, several algorithms have been developed over the years, along with some bioinformatic platforms that enable their use. However, currently available appli…

Statistics and Probability; Sequence analysis; ADN; Genomics; Computational biology; Biology; computer.software_genre; Biochemistry; Genome; Computer Graphics; Expressió genètica; Web application; Humans; Molecular Biology; Gene; Internet; Microarray analysis techniques; business.industry; Genome Human; Gene Expression Profiling; Computational Biology; High-Throughput Nucleotide Sequencing; DNA; Genomics; Sequence Analysis DNA; Computer Science Applications; Gene expression profiling; Computational Mathematics; Genòmica; ComputingMethodologies_PATTERNRECOGNITION; Computational Theory and Mathematics; Data mining; business; computer; Algorithms; Genètica; Reference genome
researchProduct

Multiple sequence editing by spreadsheet.

1990

Spreadsheets have several functions and facilities that make them good candidates to be used as multiple sequence editors. They can be easily programmed (even by non-programmers) with macros that allow them to fit the needs of the user, free of the restrictions that programs written by other people have. Here I present a sheet containing a set of macros written for Lotus 1-2-3

Statistics and Probability; Sequence; Base Sequence; Programming language; business.industry; Computer science; computer.software_genre; Biochemistry; Computer Science Applications; Set (abstract data type); Computational Mathematics; Software; Computational Theory and Mathematics; Software Design; Microcomputer; Nucleic Acids; Software design; Macro; business; Molecular Biology; computer; Algorithm; Software; Computer applications in the biosciences : CABIOS
researchProduct

The Power of Word-Frequency Based Alignment-Free Functions: a Comprehensive Large-Scale Experimental Analysis

2021

Abstract Motivation: Alignment-free (AF) distance/similarity functions are a key tool for sequence analysis. Experimental studies on real datasets abound and, to some extent, there are also studies regarding their control of false positive rate (Type I error). However, assessment of their power, i.e. their ability to identify true similarity, has been limited to some members of the D2 family. The corresponding experimental studies have concentrated on short sequences, a scenario no longer adequate for current applications, where sequence lengths may vary considerably. Such a state of the art is methodologically problematic, since information regarding a key feature such as power is either mi…

Statistics and Probability; Sequence; Similarity (geometry); Settore INF/01 - Informatica; sequence analysis; Computer science; power statistics; Alignment-Free Genomic Analysis Big Data Software Platforms Bioinformatics Algorithms; Scale (descriptive set theory); Function (mathematics); computer.software_genre; Biochemistry; Computer Science Applications; Set (abstract data type); Computational Mathematics; Range (mathematics); Computational Theory and Mathematics; sequence analysis; power statistics; alignment-free functions; alignment-free functions; Data mining; Completeness (statistics); Molecular Biology; computer; Type I and type II errors
researchProduct

Long read alignment based on maximal exact match seeds

2012

Abstract Motivation: The explosive growth of next-generation sequencing datasets poses a challenge to the mapping of reads to reference genomes in terms of alignment quality and execution speed. With the continuing progress of high-throughput sequencing technologies, read length is constantly increasing and many existing aligners are becoming inefficient as generated reads grow larger. Results: We present CUSHAW2, a parallelized, accurate, and memory-efficient long read aligner. Our aligner is based on the seed-and-extend approach and uses maximal exact matches as seeds to find gapped alignments. We have evaluated and compared CUSHAW2 to the three other long read aligners BWA-SW, Bowtie2 an…

Statistics and Probability; Sequencing and Sequence Analysis; Theoretical computer science; Genomics; Biology; Biochemistry; Software; Humans; Molecular Biology; Alignment-free sequence analysis; Exact match; Supplementary data; Genome Human; business.industry; Chromosome Mapping; High-Throughput Nucleotide Sequencing; Genomics; Sequence Analysis DNA; Original Papers; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Computer engineering; Scalability; business; Sequence Alignment; Algorithms; Software; Bioinformatics
researchProduct

DRUDIT: Web-based DRUgs DIscovery Tools to design small molecules as modulators of biological targets

2019

Abstract Motivation: New in silico tools to predict biological affinities for input structures are presented. The tools are implemented in the DRUDIT (DRUgs DIscovery Tools) web service. The DRUDIT biological finder module is based on molecular descriptors that are calculated by the MOLDESTO (MOLecular DEScriptors TOol) software module developed by the same authors, which is able to calculate more than one thousand molecular descriptors. At this stage, DRUDIT includes 250 biological targets, but new external targets can be added. This feature extends the application scope of DRUDIT to several fields. Moreover, two more functions are implemented: the multi- and on/off-target tasks. These tool…

Statistics and Probability; Service (systems architecture); Polypharmacology; Computer science; In silico; Machine learning; computer.software_genre; 01 natural sciences; Biochemistry; biological target finder; drug discovery; Molecular descriptors; 03 medical and health sciences; Molecular descriptor; Settore BIO/10 - Biochimica; Web application; Computer Simulation; Polypharmacology; Molecular Biology; 030304 developmental biology; Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni; Internet; 0303 health sciences; business.industry; Small molecule; Settore CHIM/08 - Chimica Farmaceutica; 0104 chemical sciences; Computer Science Applications; 010404 medicinal & biomolecular chemistry; Computational Mathematics; Computational Theory and Mathematics; Biological target; The Internet; Artificial intelligence; business; computer; Software
researchProduct

Overlap and diversity in antimicrobial peptide databases: Compiling a non-redundant set of sequences

2015

Abstract Motivation: The large variety of antimicrobial peptide (AMP) databases developed to date are characterized by a substantial overlap of data and similarity of sequences. Our goals are to analyze the levels of redundancy for all available AMP databases and use this information to build a new non-redundant sequence database. For this purpose, a new software tool is introduced. Results: A comparative study of 25 AMP databases reveals the overlap and diversity among them and the internal diversity within each database. The overlap analysis shows that only one database (Peptaibol) contains exclusive data, not present in any other, whereas all sequences in the LAMP_Patent database are inc…

Statistics and Probability; Similarity (geometry); Computer science; Sequence analysis; Antimicrobial peptides; Peptaibol; Peptide; computer.software_genre; Procedures; Biochemistry; Set (abstract data type); chemistry.chemical_compound; Protein methods; Sequence Analysis Protein; Redundancy (engineering); Humans; Databases Protein; Molecular Biology; Antimicrobial cationic peptides; chemistry.chemical_classification; Sequence; Antimicrobial cationic peptide; Database; Sequence database; Sequence analysis; Computer Science Applications; Algorithm; Computational Mathematics; Chemistry; Protein database; Computational Theory and Mathematics; chemistry; Data mining; Nucleic acid database; Databases Nucleic Acid; computer; Software; Algorithms; Human
researchProduct