Search results for "algorithm"

Showing 10 items of 4887 documents

An approximation to maximum likelihood estimates in reduced models

1990

SUMMARY An approximation to the maximum likelihood estimates of the parameters in a model can be obtained from the corresponding estimates and information matrices in an extended model, i.e. a model with additional parameters. The approximation is close provided that the data are consistent with the first model. Applications are described to log linear models for discrete data, to models for multivariate normal distributions with special covariance matrices and to mixed discrete-continuous models.

Statistics and Probability | Restricted maximum likelihood | Applied Mathematics | General Mathematics | Maximum likelihood | Multivariate normal distribution | Maximum likelihood sequence estimation | Covariance | Agricultural and Biological Sciences (miscellaneous) | Extended model | Statistics | Expectation–maximization algorithm | Log-linear model | Statistics Probability and Uncertainty | General Agricultural and Biological Sciences | Mathematics | Biometrika

researchProduct

Iterative Cluster Analysis of Protein Interaction Data

2004

Abstract Motivation: Generation of fast tools of hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. Results: We present in this work the program UVCLUSTER, that iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are consid…

Statistics and Probability | Saccharomyces cerevisiae Proteins | Computer science | computer.software_genre | Biochemistry | Interactome | Pattern Recognition Automated | Set (abstract data type) | Protein Interaction Mapping | Cluster (physics) | Cluster Analysis | Cluster analysis | Molecular Biology | Cytoskeleton | Measure (data warehouse) | Gene Expression Profiling | Proteins | Actins | Computer Science Applications | Hierarchical clustering | Gene expression profiling | Computational Mathematics | Computational Theory and Mathematics | Pattern recognition (psychology) | Benchmark (computing) | Data mining | computer | Algorithms | Software | Signal Transduction | Bioinformatics

researchProduct
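The abstract above defines a primary distance as the minimum number of interactions needed to connect two proteins. The following is a minimal sketch of that one definition using breadth-first search. It is not the UVCLUSTER program, and the function name `primary_distances` and the toy protein labels are assumptions made for illustration.

```python
from collections import deque


def primary_distances(interactions, source):
    """Return the minimum number of interaction steps from `source`
    to every protein reachable from it (breadth-first search)."""
    # Build an undirected adjacency map from the list of interacting pairs.
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    distances = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in distances:  # first visit is the shortest path
                distances[nxt] = distances[current] + 1
                queue.append(nxt)
    return distances


# Hypothetical toy network; the labels are placeholders, not real proteins.
toy_interactions = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
toy_distances = primary_distances(toy_interactions, "A")
```

Because every interaction counts as one step, many protein pairs end up at the same small integer distance, which is the kind of distance tie the abstract refers to.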

An Adaptive Parallel Tempering Algorithm

2013

Parallel tempering is a generic Markov chain Monte Carlo sampling method which allows good mixing with multimodal target distributions, where conventional Metropolis-Hastings algorithms often fail. The mixing properties of the sampler depend strongly on the choice of tuning parameters, such as the temperature schedule and the proposal distribution used for local exploration. We propose an adaptive algorithm with fixed number of temperatures which tunes both the temperature schedule and the parameters of the random-walk Metropolis kernel automatically. We prove the convergence of the adaptation and a strong law of large numbers for the algorithm under general conditions. We also prove as a side…

Statistics and Probability | Schedule | Mathematical optimization | ta112 | Adaptive algorithm | Ergodicity | ta111 | Mixing (mathematics) | Law of large numbers | Kernel (statistics) | Convergence (routing) | Discrete Mathematics and Combinatorics | Parallel tempering | Statistics Probability and Uncertainty | Algorithm | Mathematics | Journal of Computational and Graphical Statistics

researchProduct
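For readers unfamiliar with parallel tempering, the sketch below shows the standard rule for accepting a state swap between two tempered chains. It illustrates the generic method only, not the adaptive scheme proposed in the entry above. The function name `swap_acceptance` and its parameters are assumptions, and each chain is assumed to target the base density raised to its inverse temperature.

```python
import math


def swap_acceptance(log_density_i, log_density_j, beta_i, beta_j):
    """Probability of accepting a swap between two chains.

    log_density_i, log_density_j: log of the untempered target density at
        the current states of chains i and j.
    beta_i, beta_j: inverse temperatures of chains i and j.
    """
    # Metropolis ratio for exchanging the two states:
    # exp((beta_i - beta_j) * (log pi(x_j) - log pi(x_i)))
    log_ratio = (beta_i - beta_j) * (log_density_j - log_density_i)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


# A swap that moves the higher-density state to the colder chain is always accepted.
p_favourable = swap_acceptance(-5.0, -1.0, beta_i=1.0, beta_j=0.5)
# The reverse swap is accepted only with probability exp(-2).
p_unfavourable = swap_acceptance(-1.0, -5.0, beta_i=1.0, beta_j=0.5)
```

The acceptance rate of such swaps depends on how far apart the inverse temperatures are, which is one reason the temperature schedule is a tuning parameter.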

A web application for the unspecific detection of differentially expressed DNA regions in strand-specific expression data

2015

Abstract Genomic technologies allow laboratories to produce large-scale data sets, either through the use of next-generation sequencing or microarray platforms. To explore these data sets and obtain maximum value from the data, researchers view their results alongside all the known features of a given reference genome. To study transcriptional changes that occur under a given condition, researchers search for regions of the genome that are differentially expressed between different experimental conditions. In order to identify these regions several algorithms have been developed over the years, along with some bioinformatic platforms that enable their use. However, currently available appli…

Statistics and Probability | Sequence analysis | ADN | Genomics | Computational biology | Biology | computer.software_genre | Biochemistry | Genome | Computer Graphics | Expressió genètica | Web application | Humans | Molecular Biology | Gene | Internet | Microarray analysis techniques | business.industry | Genome Human | Gene Expression Profiling | Computational Biology | High-Throughput Nucleotide Sequencing | DNA | Genomics | Sequence Analysis DNA | Computer Science Applications | Gene expression profiling | Computational Mathematics | Genòmica | ComputingMethodologies_PATTERNRECOGNITION | Computational Theory and Mathematics | Data mining | business | computer | Algorithms | Genètica | Reference genome

researchProduct

Multiple sequence editing by spreadsheet.

1990

Spreadsheets have several functions and facilities that make them good candidates to be used as multiple sequence editors. They can be easily programmed (even by non-programmers) with macros that allow them to fit the needs of the user, free of the restrictions that programs written by other people have. Here I present a sheet containing a set of macros written for Lotus 1-2-3

Statistics and Probability | Sequence | Base Sequence | Programming language | business.industry | Computer science | computer.software_genre | Biochemistry | Computer Science Applications | Set (abstract data type) | Computational Mathematics | Software | Computational Theory and Mathematics | Software Design | Microcomputer | Nucleic Acids | Software design | Macro | business | Molecular Biology | computer | Algorithm | Software | Computer applications in the biosciences : CABIOS

researchProduct

The Power of Word-Frequency Based Alignment-Free Functions: a Comprehensive Large-Scale Experimental Analysis

2021

Abstract Motivation Alignment-free (AF) distance/similarity functions are a key tool for sequence analysis. Experimental studies on real datasets abound and, to some extent, there are also studies regarding their control of false positive rate (Type I error). However, assessment of their power, i.e. their ability to identify true similarity, has been limited to some members of the D2 family. The corresponding experimental studies have concentrated on short sequences, a scenario no longer adequate for current applications, where sequence lengths may vary considerably. Such a State of the Art is methodologically problematic, since information regarding a key feature such as power is either mi…

Statistics and Probability | Sequence | Similarity (geometry) | Settore INF/01 - Informatica | sequence analysis | Computer science | power statistics | Alignment-Free Genomic Analysis Big Data Software Platforms Bioinformatics Algorithms | Scale (descriptive set theory) | Function (mathematics) | computer.software_genre | Biochemistry | Computer Science Applications | Set (abstract data type) | Computational Mathematics | Range (mathematics) | Computational Theory and Mathematics | sequence analysis; power statistics; alignment-free functions | alignment-free functions | Data mining | Completeness (statistics) | Molecular Biology | computer | Type I and type II errors

researchProduct
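The entry above mentions the D2 family of word-frequency based alignment-free functions. As a minimal sketch, the basic D2 score of two sequences is the sum, over all words of length k, of the product of that word's counts in each sequence. The function names `kmer_counts` and `d2` and the toy sequences are assumptions, and this is not the experimental pipeline evaluated in the paper.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping word of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def d2(seq_a, seq_b, k):
    """Basic D2 similarity: sum over words of count_a(word) * count_b(word)."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # A Counter returns 0 for words absent from seq_b, so they add nothing.
    return sum(count * counts_b[word] for word, count in counts_a.items())


# Toy example with k = 2.
# "ACGTAC" has AC twice and CG, GT, TA once each; "ACGA" has AC, CG, GA once each.
toy_score = d2("ACGTAC", "ACGA", 2)
```

Only the words AC and CG are shared here, so the score is 2 * 1 + 1 * 1 = 3. No alignment is computed at any point, which is what makes such functions attractive for long sequences.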

Long read alignment based on maximal exact match seeds

2012

Abstract Motivation: The explosive growth of next-generation sequencing datasets poses a challenge to the mapping of reads to reference genomes in terms of alignment quality and execution speed. With the continuing progress of high-throughput sequencing technologies, read length is constantly increasing and many existing aligners are becoming inefficient as generated reads grow larger. Results: We present CUSHAW2, a parallelized, accurate, and memory-efficient long read aligner. Our aligner is based on the seed-and-extend approach and uses maximal exact matches as seeds to find gapped alignments. We have evaluated and compared CUSHAW2 to the three other long read aligners BWA-SW, Bowtie2 an…

Statistics and Probability | Sequencing and Sequence Analysis | Theoretical computer science | Genomics | Biology | Biochemistry | Software | Humans | Molecular Biology | Alignment-free sequence analysis | Exact match | Supplementary data | Genome Human | business.industry | Chromosome Mapping | High-Throughput Nucleotide Sequencing | Genomics | Sequence Analysis DNA | Original Papers | Computer Science Applications | Computational Mathematics | Computational Theory and Mathematics | Computer engineering | Scalability | business | Sequence Alignment | Algorithms | Software | Bioinformatics

researchProduct

Dimension reduction for time series in a blind source separation context using R

2021

Funding Information: The work of KN was supported by the CRoNoS COST Action IC1408 and the Austrian Science Fund P31881-N32. The work of ST was supported by the CRoNoS COST Action IC1408. The work of JV was supported by Academy of Finland (grant 321883). We would like to thank the anonymous reviewers for their comments which improved the paper and package considerably. Publisher Copyright: © 2021, American Statistical Association. All rights reserved. Multivariate time series observations are increasingly common in multiple fields of science but the complex dependencies of such data often translate into intractable models with large number of parameters. An alternative is given by first red…

Statistics and Probability | Series (mathematics) | Stochastic volatility | Computer science | blind source separation; supervised dimension reduction; R | signaalinkäsittely | Dimensionality reduction | R | signaalianalyysi | Context (language use) | Covariance | Blind signal separation | QA273-280 | aikasarja-analyysi | R-kieli | Dimension (vector space) | monimuuttujamenetelmät | Blind source separation | Statistics Probability and Uncertainty | Time series | Algorithm | Software | Supervised dimension reduction

researchProduct

Stochastic labelling of biological images

1998

Many hypotheses made by experimental researchers can be formulated as a stochastic labelling of a given image. Some stochastic labelling methods for random closed sets are proposed in this paper. Molchanov (I. Molchanov, 1984, Theor. Probability and Math. Statist. 29, 113–119) provided the probabilistic background for this problem. However, there is a lack of specific labelling models. Ayala and Simo (G. Ayala and A. Simo, 1995, Advances in Applied Probability 27, 293–305) proposed a method in which, given the whole set of connected components, every component is classified in a certain phase or category in a completely random way. Alternative methods are necessary in case the random labellin…

Statistics and Probability | Set (abstract data type) | Connected component | Discrete mathematics | Closed set | Labelling | Component (UML) | Probabilistic logic | Function (mathematics) | Statistics Probability and Uncertainty | Algorithm | Mathematics | Image (mathematics) | Statistica Neerlandica

researchProduct

A tabu search algorithm for assigning teachers to courses

2002

In this paper we deal with the problem of assigning teachers to courses in a secondary school. The problem appears when a timetable is to be built and the teaching assignments are not fixed. We have developed a tabu search algorithm to solve the problem. The parameters involved in the algorithm have been estimated by using multiple regression techniques. The computational results, obtained on a set of Spanish secondary schools, show that the solutions obtained by this automatic procedure can be favourably compared with the solutions proposed by the experts.

Statistics and Probability | Set (abstract data type) | Mathematical optimization | Information Systems and Management | Modeling and Simulation | ComputingMilieux_COMPUTERSANDEDUCATION | Discrete Mathematics and Combinatorics | Guided Local Search | Management Science and Operations Research | Heuristics | Algorithm | Tabu search | Mathematics | Top

researchProduct