Search results for "algorithm"
showing 10 items of 4887 documents
An approximation to maximum likelihood estimates in reduced models
1990
An approximation to the maximum likelihood estimates of the parameters in a model can be obtained from the corresponding estimates and information matrices in an extended model, i.e. a model with additional parameters. The approximation is close provided that the data are consistent with the first model. Applications are described to log linear models for discrete data, to models for multivariate normal distributions with special covariance matrices and to mixed discrete-continuous models.
Iterative Cluster Analysis of Protein Interaction Data
2004
Motivation: Generation of fast tools of hierarchical clustering to be applied when distances among elements of a set are constrained, causing frequent distance ties, as happens in protein interaction data. Results: We present in this work the program UVCLUSTER, which iteratively explores distance datasets using hierarchical clustering. Once the user selects a group of proteins, UVCLUSTER converts the set of primary distances among them (i.e. the minimum number of steps, or interactions, required to connect two proteins) into secondary distances that measure the strength of the connection between each pair of proteins when the interactions for all the proteins in the group are consid…
An Adaptive Parallel Tempering Algorithm
2013
Parallel tempering is a generic Markov chain Monte Carlo sampling method which allows good mixing with multimodal target distributions, where conventional Metropolis-Hastings algorithms often fail. The mixing properties of the sampler depend strongly on the choice of tuning parameters, such as the temperature schedule and the proposal distribution used for local exploration. We propose an adaptive algorithm with fixed number of temperatures which tunes both the temperature schedule and the parameters of the random-walk Metropolis kernel automatically. We prove the convergence of the adaptation and a strong law of large numbers for the algorithm under general conditions. We also prove as a side…
A web application for the unspecific detection of differentially expressed DNA regions in strand-specific expression data
2015
Genomic technologies allow laboratories to produce large-scale data sets, either through the use of next-generation sequencing or microarray platforms. To explore these data sets and obtain maximum value from the data, researchers view their results alongside all the known features of a given reference genome. To study transcriptional changes that occur under a given condition, researchers search for regions of the genome that are differentially expressed between different experimental conditions. In order to identify these regions several algorithms have been developed over the years, along with some bioinformatic platforms that enable their use. However, currently available appli…
Multiple sequence editing by spreadsheet.
1990
Spreadsheets have several functions and facilities that make them good candidates to be used as multiple sequence editors. They can be easily programmed (even by non-programmers) with macros that allow them to fit the needs of the user, free of the restrictions that programs written by other people have. Here I present a sheet containing a set of macros written for Lotus 1-2-3
The Power of Word-Frequency Based Alignment-Free Functions: a Comprehensive Large-Scale Experimental Analysis
2021
Motivation: Alignment-free (AF) distance/similarity functions are a key tool for sequence analysis. Experimental studies on real datasets abound and, to some extent, there are also studies regarding their control of false positive rate (Type I error). However, assessment of their power, i.e. their ability to identify true similarity, has been limited to some members of the D2 family. The corresponding experimental studies have concentrated on short sequences, a scenario no longer adequate for current applications, where sequence lengths may vary considerably. Such a State of the Art is methodologically problematic, since information regarding a key feature such as power is either mi…
Long read alignment based on maximal exact match seeds
2012
Motivation: The explosive growth of next-generation sequencing datasets poses a challenge to the mapping of reads to reference genomes in terms of alignment quality and execution speed. With the continuing progress of high-throughput sequencing technologies, read length is constantly increasing and many existing aligners are becoming inefficient as generated reads grow larger. Results: We present CUSHAW2, a parallelized, accurate, and memory-efficient long read aligner. Our aligner is based on the seed-and-extend approach and uses maximal exact matches as seeds to find gapped alignments. We have evaluated and compared CUSHAW2 to the three other long read aligners BWA-SW, Bowtie2 an…
Dimension reduction for time series in a blind source separation context using R
2021
Funding Information: The work of KN was supported by the CRoNoS COST Action IC1408 and the Austrian Science Fund P31881-N32. The work of ST was supported by the CRoNoS COST Action IC1408. The work of JV was supported by Academy of Finland (grant 321883). We would like to thank the anonymous reviewers for their comments which improved the paper and package considerably. Publisher Copyright: © 2021, American Statistical Association. All rights reserved. Multivariate time series observations are increasingly common in multiple fields of science but the complex dependencies of such data often translate into intractable models with large number of parameters. An alternative is given by first red…
Stochastic labelling of biological images
1998
Many hypotheses made by experimental researchers can be formulated as a stochastic labelling of a given image. Some stochastic labelling methods for random closed sets are proposed in this paper. Molchanov (I. Molchanov, 1984, Theor. Probability and Math. Statist. 29, 113–119) provided the probabilistic background for this problem. However, there is a lack of specific labelling models. Ayala and Simo (G. Ayala and A. Simo, 1995, Advances in Applied Probability 27, 293–305) proposed a method in which, given the whole set of connected components, every component is classified in a certain phase or category in a completely random way. Alternative methods are necessary in case the random labellin…
A tabu search algorithm for assigning teachers to courses
2002
In this paper we deal with the problem of assigning teachers to courses in a secondary school. The problem appears when a timetable is to be built and the teaching assignments are not fixed. We have developed a tabu search algorithm to solve the problem. The parameters involved in the algorithm have been estimated by using multiple regression techniques. The computational results, obtained on a set of Spanish secondary schools, show that the solutions obtained by this automatic procedure can be favourably compared with the solutions proposed by the experts.