Search results for "Computer Science Applications"

showing 10 items of 3993 documents

Multi-omics HeCaToS dataset of repeated dose toxicity for cardiotoxic & hepatotoxic compounds.

2022

The data currently described was generated within the EU/FP7 HeCaToS project (Hepatic and Cardiac Toxicity Systems modeling). The project aimed to develop an in silico prediction system to contribute to drug safety assessment for humans. For this purpose, multi-omics data of repeated dose toxicity were obtained for 10 hepatotoxic and 10 cardiotoxic compounds. Most data were gained from in vitro experiments in which 3D microtissues (either hepatic or cardiac) were exposed to a therapeutic (physiologically relevant concentrations calculated through PBPK-modeling) or a toxic dosing profile (IC20 after 7 days). Exposures lasted for 14 days and samples were obtained at 7 time points (therapeutic…

Statistics and Probability; Epigenomics; Proteomics; Biochemistry; Biology; Drug-Related Side Effects and Adverse Reactions; Library and Information Sciences; Cardiotoxicity; Computer Science Applications; Education; Humans; Metabolomics; Statistics, Probability and Uncertainty; Transcriptome; Information Systems


Adaptive reference-free compression of sequence quality scores

2014

Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing the vast datasets that are now routinely produced. Relatively little attention has been paid to compressing the quality scores that are assigned to each sequence, even though these scores may be harder to compress than the sequences themselves. By aggregating a set of reads into a compressed index, we find that the majority of bases can be predicted from the sequence of bases that are adjacent to them and hence are likely to be less informative for variant calling or other applications. The quality scores for such bases are aggressively compressed, leaving a relatively small number at full reso…

Statistics and Probability; FOS: Computer and information sciences; Computer science; media_common.quotation_subject; Reference-free; computer.software_genre; Biochemistry; DNA sequencing; Set (abstract data type); Redundancy (information theory); BWT; Computer Science - Data Structures and Algorithms; Code (cryptography); Animals; Humans; Quality (business); Data Structures and Algorithms (cs.DS); Quantitative Biology - Genomics; Caenorhabditis elegans; Molecular Biology; media_common; Genomics (q-bio.GN); Sequence; Genome; Settore INF/01 - Informatica; reference-free compression; High-Throughput Nucleotide Sequencing; Genomics; Sequence Analysis DNA; Data Compression; compression; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; FOS: Biological sciences; Data mining; quality score; Metagenomics; computer; Algorithms; Reference genome


Metagenomics reveals our incomplete knowledge of global diversity

2008

Metagenomic sequencing obtains huge amounts of sequences from environmental and clinical samples, thus providing a glimpse of the global prokaryotic diversity of both species and genes in these sources. The current trend in metagenomic analysis follows the so-called gene-centric approach, focused on describing the environments by the study of the functional roles of the proteins encoded in the sequenced genes. In this way, it is clear that metagenomic analysis relies heavily on the accurate knowledge of the universe of proteins stored in the databases. Nevertheless, it is known that some biases exist in the composition of databases (which are rich in sequences from common, cultivable and ea…

Statistics and Probability; Genetics; Phylogenetic tree; biology; Phylum; Genetic Variation; Genomics; Biodiversity; Genome Analysis; biology.organism_classification; Biochemistry; Computer Science Applications; Computational Mathematics; Taxon; Computational Theory and Mathematics; Evolutionary biology; Metagenomics; GenBank; Computer Science and Artificial Intelligence; Taxonomic rank; Letter to the Editor; Molecular Biology; Ecosystem; Acidobacteria


SeqEditor: an application for primer design and sequence analysis with or without GTF/GFF files

2021

Motivation: Sequence analyses that investigate specific features, patterns and functions of protein and DNA/RNA sequences usually require tools based on graphical interfaces, whose main characteristics are intuitiveness and interactivity with the user’s expertise, especially when curation or primer design tasks are required. However, interface-based tools usually pose certain computational limitations when managing large sequences or complex datasets, such as genome and transcriptome assemblies. With these requirements in mind, we have developed SeqEditor, an interactive software tool for nucleotide and protein sequence analysis.

Statistics and Probability; Interface (Java); Sequence analysis; Computer science; PCR assay; Biochemistry; Genome; Transcriptome; 03 medical and health sciences; Sequence Analysis Protein; Multiplex polymerase chain reaction; Humans; Nucleotide; Amino Acid Sequence; Molecular Biology; 030304 developmental biology; chemistry.chemical_classification; 0303 health sciences; Information retrieval; Contig; 030302 biochemistry & molecular biology; Chromosome; Computer Science Applications; Computational Mathematics; ComputingMethodologies_PATTERNRECOGNITION; Computational Theory and Mathematics; chemistry; Line (text file); Primer (molecular biology); Software; Reference genome


Sparse kernel methods for high-dimensional survival data

2008

Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques, however, are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be ‘kernelized’. The kernelized proportional hazards model, however, yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, dependin…

Statistics and Probability; Lung Neoplasms; Lymphoma; Computer science; computer.software_genre; Computing Methodologies; Biochemistry; Pattern Recognition Automated; Artificial Intelligence; Margin (machine learning); Covariate; Cluster Analysis; Humans; Computer Simulation; Fraction (mathematics); Molecular Biology; Proportional Hazards Models; Models Statistical; Training set; Proportional hazards model; Gene Expression Profiling; Computational Biology; Computer Science Applications; Support vector machine; Computational Mathematics; Kernel method; Computational Theory and Mathematics; Regression Analysis; Data mining; computer; Algorithms; Software; Bioinformatics
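
Note on the kernelization mentioned in the abstract above: the following is a standard textbook form of the Cox partial log-likelihood, not a formula taken from the paper. The notation is assumed here for illustration: x_i are covariates, t_i observed times, \delta_i the event indicator, R(t_i) the risk set, and K a kernel function; ties are ignored.

```latex
% Cox partial log-likelihood (no ties), with risk set R(t_i) = \{ j : t_j \ge t_i \}
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \Big[ \beta^{\top} x_i - \log \sum_{j \in R(t_i)} \exp\big(\beta^{\top} x_j\big) \Big]

% Writing \beta = \sum_k \alpha_k x_k, the covariates enter only through inner products,
% which can then be replaced by kernel evaluations:
\beta^{\top} x_i = \sum_k \alpha_k \langle x_k, x_i \rangle
  \;\longrightarrow\;
  f(x_i) = \sum_k \alpha_k K(x_k, x_i)
```

In this form every observation k generally carries a nonzero coefficient \alpha_k, which is the density of the solution that the abstract contrasts with the sparse solution of an SVM.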


Pseudo-Cut Strategies for Global Optimization

2011

Motivated by the successful use of a pseudo-cut strategy within the setting of constrained nonlinear and nonconvex optimization in Lasdon et al. (2010), we propose a framework for general pseudo-cut strategies in global optimization that provides a broader and more comprehensive range of methods. The fundamental idea is to introduce linear cutting planes that provide temporary, possibly invalid, restrictions on the space of feasible solutions, as proposed in the setting of the tabu search metaheuristic in Glover (1989), in order to guide a solution process toward a global optimum, where the cutting planes can be discarded and replaced by others as the process continues. These strategies can…

Statistics and Probability; Mathematical optimization; Control and Optimization; Process (engineering); Space (commercial competition); Tabu search; Computer Science Applications; Computational Mathematics; Nonlinear system; Range (mathematics); Computational Theory and Mathematics; Order (exchange); Modeling and Simulation; Decision Sciences (miscellaneous); Global optimization; Metaheuristic; Mathematics; International Journal of Applied Metaheuristic Computing


Comprehensive estimation of input signals and dynamics in biochemical reaction networks

2012

Motivation: Cellular information processing can be described mathematically using differential equations. Often, external stimulation of cells by compounds such as drugs or hormones leading to activation has to be considered. Mathematically, the stimulus is represented by a time-dependent input function. Parameters such as rate constants of the molecular interactions are often unknown and need to be estimated from experimental data, e.g. by maximum likelihood estimation. For this purpose, the input function has to be defined for all times of the integration interval. This is usually achieved by approximating the input by interpolation or smoothing of the measured data. This procedu…

Statistics and Probability; Computer science; Differential equation; Maximum likelihood; computer.software_genre; Biochemistry; Models Biological; Medical and Health Sciences; Integration interval; Molecular Biology; Janus Kinases; Likelihood Functions; Regulation Pathways and Systems Biology; Experimental data; Original Papers; Confidence interval; Computer Science Applications; Computational Mathematics; STAT Transcription Factors; Computational Theory and Mathematics; Data mining; Algorithm; computer; Smoothing; Algorithms; Signal Transduction


ballaxy: web services for structural bioinformatics.

2014

Motivation: Web-based workflow systems have gained considerable momentum in sequence-oriented bioinformatics. In structural bioinformatics, however, such systems are still relatively rare; while commercial stand-alone workflow applications are common in the pharmaceutical industry, academic researchers often still rely on command-line scripting to glue individual tools together. Results: In this work, we address the problem of building a web-based system for workflows in structural bioinformatics. For the underlying molecular modelling engine, we opted for the BALL framework because of its extensive and well-tested functionality in the field of structural bioinformatics. The large …

Statistics and Probability; Models Molecular; Computer science; computer.software_genre; Biochemistry; Workflow; Structural bioinformatics; User-Computer Interface; Humans; Molecular Biology; business.industry; Computational Biology; Sequence Analysis DNA; Data structure; Computer Science Applications; Visualization; Systems Integration; Computational Mathematics; Computational Theory and Mathematics; Scripting language; Web service; Software engineering; business; computer; Algorithms; Software; Bioinformatics (Oxford, England)


Assessment of the probabilities for evolutionary structural changes in protein folds.

2007

Motivation: The evolution of protein sequences can be described by a stepwise process, where each step involves changes of a few amino acids. In a similar manner, the evolution of protein folds can be at least partially described by an analogous process, where each step involves comparatively simple changes affecting few secondary structure elements. A number of such evolution steps, justified by biologically confirmed examples, have previously been proposed by other researchers. However, unlike the situation with sequences, as far as we know there have been no attempts to estimate the comparative probabilities for different kinds of such structural changes. Results: We have tried …

Statistics and Probability; Models Molecular; Protein Folding; Protein domain; Structural alignment; Biology; Biochemistry; Set (abstract data type); Evolution Molecular; Protein structure; Similarity (network science); Sequence Analysis Protein; Computer Simulation; Molecular Biology; Protein secondary structure; Conserved Sequence; Sequence; Models Genetic; Sequence Homology Amino Acid; Proteins; Structural Classification of Proteins database; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Models Chemical; Data Interpretation Statistical; sense organs; Algorithm; Sequence Alignment; Bioinformatics (Oxford, England)


CARE: context-aware sequencing read error correction.

2020

Motivation: Error correction is a fundamental pre-processing step in many Next-Generation Sequencing (NGS) pipelines, in particular for de novo genome assembly. However, existing error correction methods either suffer from high false-positive rates since they break reads into independent k-mers or do not scale efficiently to large amounts of sequencing reads and complex genomes. Results: We present CARE, an alignment-based scalable error correction algorithm for Illumina data using the concept of minhashing. Minhashing allows for efficient similarity search within large sequencing read collections, which enables fast computation of high-quality multiple alignments. Sequencing errors ar…

Statistics and Probability; Multiple sequence alignment; Computer science; Sequence assembly; High-Throughput Nucleotide Sequencing; Context (language use); Sequence Analysis DNA; computer.software_genre; Biochemistry; Genome; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Humans; Human genome; Data mining; Error detection and correction; Molecular Biology; computer; Sequence Alignment; Algorithms; Software; Bioinformatics (Oxford, England)
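
Note on the minhashing concept mentioned in the abstract above: the following is a minimal, generic sketch of minhash-based similarity estimation between reads, not CARE's actual implementation. All function names, the k-mer length, the number of hash functions and the use of seeded BLAKE2b hashes are assumptions made for illustration.

```python
import hashlib


def kmers(read, k):
    """Return the set of all length-k substrings (k-mers) of a read."""
    if len(read) < k:
        raise ValueError("read is shorter than k")
    return {read[i:i + k] for i in range(len(read) - k + 1)}


def _seeded_hash(kmer, seed):
    """Deterministic 64-bit hash of a k-mer; each seed acts as a different hash function."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(read, k=8, num_hashes=64):
    """For each hash function, keep the minimum hash value over the read's k-mers."""
    ks = kmers(read, k)
    return [min(_seeded_hash(km, seed) for km in ks) for seed in range(num_hashes)]


def estimated_similarity(sig_a, sig_b):
    """The fraction of matching signature positions estimates the Jaccard
    similarity of the two underlying k-mer sets."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)
```

Reads whose signatures agree in many positions are likely to share many k-mers, so comparing short signatures can stand in for comparing full k-mer sets when searching a large read collection for candidates to align.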
