Search results for "Algorithms"

showing 10 items of 1716 documents

Pathway analysis of high-throughput biological data within a Bayesian network framework

2011

Abstract Motivation: Most current approaches to high-throughput biological data (HTBD) analysis perform either individual gene/protein analysis or gene/protein set enrichment analysis for a list of biologically relevant molecules. Bayesian Networks (BNs) capture linear and non-linear interactions, handle stochastic events accounting for noise, and focus on local interactions, which can be related to causal inference. Here, we describe for the first time an algorithm that models biological pathways as BNs and identifies pathways that best explain given HTBD by scoring fitness of each network. Results: The proposed method takes into account the connectivity and relatedness between nodes of the p…

Statistics and Probability; Computer science; High-throughput screening; Gene regulatory network; computer.software_genre; Models Biological; Biochemistry; Synthetic data; Biological pathway; Bayes' theorem; Humans; Gene Regulatory Networks; Carcinoma Renal Cell; Molecular Biology; Gene; Biological data; Microarray analysis techniques; Gene Expression Profiling; Bayesian network; Robustness (evolution); Bayes Theorem; Pathway analysis; Kidney Neoplasms; High-Throughput Screening Assays; Computer Science Applications; Gene expression profiling; Computational Mathematics; Computational Theory and Mathematics; Causal inference; Data mining; computer; Algorithms; Software; Bioinformatics

researchProduct

Algorithms and tools for protein-protein interaction networks clustering, with a special focus on population-based stochastic methods

2014

Abstract Motivation: Protein–protein interaction (PPI) networks are powerful models to represent the pairwise protein interactions of organisms. Clustering PPI networks can be useful for isolating groups of interacting proteins that participate in the same biological processes or that together perform specific biological functions. Evolutionary orthologies can be inferred this way, as well as functions and properties of yet uncharacterized proteins. Results: We present an overview of the main state-of-the-art clustering methods that have been applied to PPI networks over the past decade. We distinguish five specific categories of approaches, describe and compare their main features and …

Statistics and Probability; Computer science; Population; Population based; Machine learning; computer.software_genre; Biochemistry; Protein protein interaction network; genetic algorithms; Protein–protein interaction; Bioinformatics Clustering Biological Networks; PPI networks; complex detection; Protein Interaction Mapping; Animals; Cluster Analysis; Humans; education; Cluster analysis; Molecular Biology; Topology (chemistry); Class (computer programming); education.field_of_study; business.industry; food and beverages; Proteins; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Artificial intelligence; Data mining; business; Focus (optics); computer; Algorithms

researchProduct

mRNAStab—a web application for mRNA stability analysis

2013

Abstract Eukaryotic gene expression is regulated both at the transcription and the mRNA degradation levels. The implementation of functional genomics methods that allow the simultaneous measurement of transcription (TR) and degradation (DR) rates for thousands of mRNAs is a huge improvement in this field. One of the best established methods for mRNA stability determination is genomic run-on (GRO). It allows the measurement of DR, TR and mRNA levels during cell dynamic responses. Here, we offer a software package that provides improved algorithms for determination of mRNA stability during dynamic GRO experiments. Availability and implementation: The program mRNAStab is freely accessible at h…

Statistics and Probability; Computer science; RNA Stability; Cell; Computational biology; Bioinformatics; Biochemistry; Transcription (biology); Gene expression; MRNA degradation; medicine; Humans; Web application; RNA Messenger; Molecular Biology; Internet; Messenger RNA; business.industry; RNA; Genomics; Computer Science Applications; Computational Mathematics; medicine.anatomical_structure; Computational Theory and Mathematics; Mrna level; business; Functional genomics; Algorithms; Software; Bioinformatics

researchProduct

Acceleration of short and long DNA read mapping without loss of accuracy using suffix array

2014

HPG Aligner applies suffix arrays for DNA read mapping. This implementation produces a highly sensitive and extremely fast mapping of DNA reads that scales up almost linearly with read length. The approach presented here is faster (over 20× for long reads) and more sensitive (over 98% in a wide range of read lengths) than the current state-of-the-art mappers. HPG Aligner is not only an optimal alternative for current sequencers but also the only solution available to cope with longer reads and growing throughputs produced by forthcoming sequencing technologies.

Statistics and Probability; Computer science; Sequence analysis; Sequence alignment; database searches; computer.software_genre; Biochemistry; law.invention; Acceleration; chemistry.chemical_compound; law; CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL; Animals; Humans; Molecular Biology; Database; sequencing data; Suffix array; Sequence analysis; High-Throughput Nucleotide Sequencing; alignment; Sequence Analysis DNA; Applications Notes; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; chemistry; Drosophila; Suffix; Sequence Alignment; computer; Algorithm; Algorithms; Software; DNA

researchProduct
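
The suffix-array idea named in the HPG Aligner entry above can be illustrated with a minimal sketch. This is not HPG Aligner's implementation: it assumes exact matching only, a naive construction, and hypothetical function names (`build_suffix_array`, `find_occurrences`). It shows why lookup cost depends on read length and a logarithmic factor rather than on a scan of the whole reference.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Naive construction (sorting full suffixes); adequate for a small
    illustration, far too slow for a real genome.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: List[int], pattern: str) -> List[int]:
    """Return all positions where `pattern` occurs exactly in `text`.

    Suffixes sharing a prefix are contiguous in the suffix array, so two
    binary searches delimit the block of suffixes that start with `pattern`.
    """
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


reference = "ACGTACGTGACG"  # toy reference sequence, made up for illustration
suffix_array = build_suffix_array(reference)
hits = find_occurrences(reference, suffix_array, "ACG")
```

Each lookup performs O(log n) comparisons of at most m characters, where n is the reference length and m the read length. Handling mismatches and gaps, as a practical read mapper must, needs additional machinery not shown here.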

Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data

2012

Abstract Motivation: The imperfect sequence data produced by next-generation sequencing technologies have motivated the development of a number of short-read error correctors in recent years. The majority of methods focus on the correction of substitution errors, which are the dominant error source in data produced by Illumina sequencing technology. Existing tools score high in either recall or precision, but not consistently high in both measures. Results: In this article, we present Musket, an efficient multistage k-mer-based corrector for Illumina short-read data. We use the k-mer spectrum approach and introduce three correction techniques in a multistage workflow: two-s…

Statistics and Probability; Computer science; business.industry; Sequence assembly; Sequence Analysis DNA; Musket; Biochemistry; Computer Science Applications; Computational Mathematics; CUDA; Software; Computational Theory and Mathematics; k-mer; Escherichia coli; Chromosomes Human; Humans; business; Focus (optics); Molecular Biology; Algorithm; Algorithms; Genome Bacterial; Software; Illumina dye sequencing; Bioinformatics

researchProduct
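
As a rough illustration of the k-mer spectrum approach mentioned in the Musket entry above, the sketch below counts k-mers across reads, treats k-mers seen at least `min_count` times as trusted, and repairs a read when exactly one single-base substitution makes all of its k-mers trusted. This is a generic textbook-style simplification, not Musket's multistage workflow; the function names, the threshold and the toy reads are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable


def kmer_spectrum(reads: Iterable[str], k: int) -> Counter:
    """Count every k-mer occurring in the reads (the 'k-mer spectrum')."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def all_kmers_trusted(read: str, counts: Counter, k: int, min_count: int) -> bool:
    """A read is accepted when each of its k-mers is seen at least min_count times."""
    return all(counts[read[i:i + k]] >= min_count
               for i in range(len(read) - k + 1))


def correct_single_substitution(read: str, counts: Counter, k: int,
                                min_count: int = 2) -> str:
    """Return a corrected read if exactly one substitution fixes it, else the read unchanged."""
    if all_kmers_trusted(read, counts, k, min_count):
        return read
    candidates = []
    for pos in range(len(read)):
        for base in "ACGT":
            if base == read[pos]:
                continue
            candidate = read[:pos] + base + read[pos + 1:]
            if all_kmers_trusted(candidate, counts, k, min_count):
                candidates.append(candidate)
    # Ambiguous or unfixable reads are left alone (a conservative choice).
    return candidates[0] if len(candidates) == 1 else read


# Toy data: three error-free copies and one read with a substitution at index 4.
reads = ["ACGTACGGA"] * 3 + ["ACGTTCGGA"]
spectrum = kmer_spectrum(reads, k=4)
fixed = correct_single_substitution("ACGTTCGGA", spectrum, k=4)
```

Leaving ambiguous reads untouched favours precision over recall, which is the trade-off the abstract describes for existing tools.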

Modeling temperature effects on mortality: multiple segmented relationships with common break points.

2008

We present a model for estimation of temperature effects on mortality that is able to capture jointly the typical features of every temperature-death relationship, that is, nonlinearity and delayed effect of cold and heat over a few days. Using a segmented approximation along with a doubly penalized spline-based distributed lag parameterization, estimates and relevant standard errors of the cold- and heat-related risks and the heat tolerance are provided. The model is applied to data from Milano, Italy.

Statistics and Probability; Distributed lag; Hot Temperature; Time Factors; Injury control; Poison control; temperature effect; Risk Factors; Statistics; Humans; Segmented regression; Mortality; segmented regression; Weather; Simulation; Mathematics; Likelihood Functions; Models Statistical; Temperature; General Medicine; Heat tolerance; Cold Temperature; Spline (mathematics); Nonlinear system; Standard error; Italy; Nonlinear Dynamics; Linear Models; Regression Analysis; Statistics Probability and Uncertainty; break point; Settore SECS-S/01 - Statistica; Algorithms; Biostatistics (Oxford, England)

researchProduct
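
To make the notion of a segmented temperature-mortality relationship in the entry above concrete, here is a minimal sketch of a piecewise-linear effect with two break points: a cold slope below one threshold, a heat slope above another, and no effect in between. The functional form, the parameter names and the numeric values are illustrative assumptions, not the fitted model from the paper, and the distributed-lag (delayed effect) component is omitted.

```python
def segmented_effect(temp: float, psi_cold: float, psi_heat: float,
                     beta_cold: float, beta_heat: float) -> float:
    """Piecewise-linear contribution of temperature to the log mortality rate.

    Below psi_cold the effect grows by beta_cold per degree of cold;
    above psi_heat it grows by beta_heat per degree of heat;
    between the two break points the contribution is zero.
    """
    cold_part = beta_cold * max(psi_cold - temp, 0.0)
    heat_part = beta_heat * max(temp - psi_heat, 0.0)
    return cold_part + heat_part


# Hypothetical break points (degrees Celsius) and slopes, chosen only for illustration.
params = dict(psi_cold=10.0, psi_heat=25.0, beta_cold=0.02, beta_heat=0.05)
effect_hot_day = segmented_effect(30.0, **params)   # 5 degrees above the heat break point
effect_mild_day = segmented_effect(18.0, **params)  # between the break points
effect_cold_day = segmented_effect(5.0, **params)   # 5 degrees below the cold break point
```

In an actual analysis the break points and slopes would be estimated from the data rather than fixed in advance.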

Reducing the effect of the data order in algorithms for constructing phylogenetic trees.

1988

Statistics and Probability; Electronic Data Processing; Theoretical computer science; Phylogenetic tree; Computer science; Biochemistry; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Molecular Biology; Algorithm; Algorithms; Phylogeny; Software; Computer applications in the biosciences : CABIOS

researchProduct

Tailoring sparse multivariable regression techniques for prognostic single-nucleotide polymorphism signatures.

2011

When seeking prognostic information for patients, modern technologies provide a huge amount of genomic measurements as a starting point. For single-nucleotide polymorphisms (SNPs), there may be more than one million covariates that need to be simultaneously considered with respect to a clinical endpoint. Although the underlying biological problem cannot be solved on the basis of clinical cohorts of only modest size, some important SNPs might still be identified. Sparse multivariable regression techniques have recently become available for automatically identifying prognostic molecular signatures that comprise relatively few covariates and provide reasonable prediction performance. For illus…

Statistics and Probability; Epidemiology; Computer science; Feature selection; Biostatistics; computer.software_genre; Polymorphism Single Nucleotide; Lasso (statistics); Gene Frequency; Resampling; Covariate; Humans; Likelihood Functions; Models Statistical; Multivariable calculus; Regression analysis; Genomics; Prognosis; Regression; Minor allele frequency; Leukemia Myeloid Acute; Multivariate Analysis; Regression Analysis; Data mining; computer; Algorithms; Statistics in medicine

researchProduct

Adaptive reference-free compression of sequence quality scores

2014

Motivation: Rapid technological progress in DNA sequencing has stimulated interest in compressing the vast datasets that are now routinely produced. Relatively little attention has been paid to compressing the quality scores that are assigned to each sequence, even though these scores may be harder to compress than the sequences themselves. By aggregating a set of reads into a compressed index, we find that the majority of bases can be predicted from the sequence of bases that are adjacent to them and hence are likely to be less informative for variant calling or other applications. The quality scores for such bases are aggressively compressed, leaving a relatively small number at full reso…

Statistics and Probability; FOS: Computer and information sciences; Computer science; media_common.quotation_subject; Reference-free; computer.software_genre; Biochemistry; DNA sequencing; Set (abstract data type); Redundancy (information theory); BWT; Computer Science - Data Structures and Algorithms; Code (cryptography); Animals; Humans; Quality (business); Data Structures and Algorithms (cs.DS); Quantitative Biology - Genomics; Caenorhabditis elegans; Molecular Biology; media_common; Genomics (q-bio.GN); Sequence; Genome; Settore INF/01 - Informatica; reference-free compression; High-Throughput Nucleotide Sequencing; Genomics; Sequence Analysis DNA; Data Compression; compression; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; FOS: Biological sciences; Data mining; quality score; Metagenomics; computer; BWT; compression; quality score; reference-free compression; Algorithms; Reference genome

researchProduct

Sparse kernel methods for high-dimensional survival data

2008

Abstract Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques, however, are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be ‘kernelized’. The kernelized proportional hazards model, however, yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, dependin…

Statistics and Probability; Lung Neoplasms; Lymphoma; Computer science; computer.software_genre; Computing Methodologies; Biochemistry; Pattern Recognition Automated; Artificial Intelligence; Margin (machine learning); Covariate; Cluster Analysis; Humans; Computer Simulation; Fraction (mathematics); Molecular Biology; Proportional Hazards Models; Models Statistical; Training set; Proportional hazards model; Gene Expression Profiling; Computational Biology; Computer Science Applications; Support vector machine; Computational Mathematics; Kernel method; Computational Theory and Mathematics; Regression Analysis; Data mining; computer; Algorithms; Software; Bioinformatics

researchProduct
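
The remark in the entry above that the Cox partial likelihood depends on covariates only through inner products can be checked with a small sketch. Assuming a linear kernel and coefficients written as a combination of the observations (beta = X^T alpha), the risk scores X·beta equal K·alpha with K = X·X^T, so the log partial likelihood can be evaluated from the kernel matrix alone. The function names and toy data are hypothetical, tied event times are handled in the simple Breslow style, and the sparse method proposed in the paper is not reproduced here.

```python
import math
from typing import List, Sequence


def linear_kernel(X: Sequence[Sequence[float]]) -> List[List[float]]:
    """Gram matrix K[i][j] = <x_i, x_j>."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def kernel_scores(K: Sequence[Sequence[float]], alpha: Sequence[float]) -> List[float]:
    """Risk scores f_i = sum_j K[i][j] * alpha_j."""
    return [sum(k * a for k, a in zip(row, alpha)) for row in K]


def log_partial_likelihood(scores: Sequence[float], times: Sequence[float],
                           events: Sequence[int]) -> float:
    """Cox log partial likelihood for given risk scores.

    For each observed event i, add f_i minus the log of the summed exp(f_j)
    over the risk set {j : t_j >= t_i}. Censored observations (event == 0)
    contribute only through the risk sets.
    """
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue
        risk_sum = sum(math.exp(scores[j])
                       for j, t_j in enumerate(times) if t_j >= t_i)
        total += scores[i] - math.log(risk_sum)
    return total


# Toy data, made up for illustration.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
alpha = [0.5, -0.2, 0.1]

# Kernelized form: scores come from K and alpha only.
ll_kernel = log_partial_likelihood(kernel_scores(linear_kernel(X), alpha), times, events)

# Equivalent primal form: beta = X^T alpha, scores = X beta.
beta = [sum(alpha[i] * X[i][d] for i in range(len(X))) for d in range(2)]
ll_primal = log_partial_likelihood(
    [sum(b * x for b, x in zip(beta, xi)) for xi in X], times, events)
```

Replacing `linear_kernel` with another positive semi-definite kernel changes only the Gram matrix. Every observation generally receives a non-zero `alpha`, which is the density the abstract contrasts with the sparsity of an SVM.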