Search results for " processing"
showing 10 items of 7549 documents
An LP-based hyperparameter optimization model for language modeling
2018
In order to find hyperparameters for a machine learning model, algorithms such as grid search or random search are used over the space of possible values of the models hyperparameters. These search algorithms opt the solution that minimizes a specific cost function. In language models, perplexity is one of the most popular cost functions. In this study, we propose a fractional nonlinear programming model that finds the optimal perplexity value. The special structure of the model allows us to approximate it by a linear programming model that can be solved using the well-known simplex algorithm. To the best of our knowledge, this is the first attempt to use optimization techniques to find per…
A Bayesian Multilevel Random-Effects Model for Estimating Noise in Image Sensors
2020
Sensor noise sources cause differences in the signal recorded across pixels in a single image and across multiple images. This paper presents a Bayesian approach to decomposing and characterizing the sensor noise sources involved in imaging with digital cameras. A Bayesian probabilistic model based on the (theoretical) model for noise sources in image sensing is fitted to a set of a time-series of images with different reflectance and wavelengths under controlled lighting conditions. The image sensing model is a complex model, with several interacting components dependent on reflectance and wavelength. The properties of the Bayesian approach of defining conditional dependencies among parame…
Heretical Mutiple Importance Sampling
2016
Multiple Importance Sampling (MIS) methods approximate moments of complicated distributions by drawing samples from a set of proposal distributions. Several ways to compute the importance weights assigned to each sample have been recently proposed, with the so-called deterministic mixture (DM) weights providing the best performance in terms of variance, at the expense of an increase in the computational cost. A recent work has shown that it is possible to achieve a trade-off between variance reduction and computational effort by performing an a priori random clustering of the proposals (partial DM algorithm). In this paper, we propose a novel "heretical" MIS framework, where the clustering …
The Inconsistent Labelling Problem of Stutter-Preserving Partial-Order Reduction
2020
AbstractIn model checking, partial-order reduction (POR) is an effective technique to reduce the size of the state space. Stubborn sets are an established variant of POR and have seen many applications over the past 31 years. One of the early works on stubborn sets shows that a combination of several conditions on the reduction is sufficient to preserve stutter-trace equivalence, making stubborn sets suitable for model checking of linear-time properties. In this paper, we identify a flaw in the reasoning and show with a counter-example that stutter-trace equivalence is not necessarily preserved. We propose a solution together with an updated correctness proof. Furthermore, we analyse in whi…
The IceProd framework: distributed data processing for the IceCube neutrino observatory
2015
IceCube is a one-gigaton instrument located at the geographic South Pole, designed to detect cosmic neutrinos, identify the particle nature of dark matter, and study high-energy neutrinos themselves. Simulation of the IceCube detector and processing of data require a significant amount of computational resources. This paper presents the first detailed description of IceProd, a lightweight distributed management system designed to meet these requirements. It is driven by a central database in order to manage mass production of simulations and analysis of data produced by the IceCube detector. IceProd runs as a separate layer on top of other middleware and can take advantage of a variety of c…
The Recycling Gibbs sampler for efficient learning
2018
Monte Carlo methods are essential tools for Bayesian inference. Gibbs sampling is a well-known Markov chain Monte Carlo (MCMC) algorithm, extensively used in signal processing, machine learning, and statistics, employed to draw samples from complicated high-dimensional posterior distributions. The key point for the successful application of the Gibbs sampler is the ability to draw efficiently samples from the full-conditional probability density functions. Since in the general case this is not possible, in order to speed up the convergence of the chain, it is required to generate auxiliary samples whose information is eventually disregarded. In this work, we show that these auxiliary sample…
Multispectral image denoising with optimized vector non-local mean filter
2016
Nowadays, many applications rely on images of high quality to ensure good performance in conducting their tasks. However, noise goes against this objective as it is an unavoidable issue in most applications. Therefore, it is essential to develop techniques to attenuate the impact of noise, while maintaining the integrity of relevant information in images. We propose in this work to extend the application of the Non-Local Means filter (NLM) to the vector case and apply it for denoising multispectral images. The objective is to benefit from the additional information brought by multispectral imaging systems. The NLM filter exploits the redundancy of information in an image to remove noise. A …
Extending the Unmixing methods to Multispectral Images
2021
In the past few decades, there has been intensive research concerning the Unmixing of hyperspectral images. Some methods such as NMF, VCA, and N-FINDR have become standards since they show robustness in dealing with the unmixing of hyperspectral images. However, the research concerning the unmixing of multispectral images is relatively scarce. Thus, we extend some unmixing methods to the multispectral images. In this paper, we have created two simulated multispectral datasets from two hyperspectral datasets whose ground truths are given. Then we apply the unmixing methods (VCA, NMF, N-FINDR) to these two datasets. By comparing and analyzing the results, we have been able to demonstrate some…
Pattern statistics in faro words and permutations
2021
We study the distribution and the popularity of some patterns in $k$-ary faro words, i.e. words over the alphabet $\{1, 2, \ldots, k\}$ obtained by interlacing the letters of two nondecreasing words of lengths differing by at most one. We present a bijection between these words and dispersed Dyck paths (i.e. Motzkin paths with all level steps on the $x$-axis) with a given number of peaks. We show how the bijection maps statistics of consecutive patterns of faro words into linear combinations of other pattern statistics on paths. Then, we deduce enumerative results by providing multivariate generating functions for the distribution and the popularity of patterns of length at most three. Fina…
RIGA at SemEval-2016 Task 8: Impact of Smatch Extensions and Character-Level Neural Translation on AMR Parsing Accuracy
2016
Two extensions to the AMR smatch scoring script are presented. The first extension com-bines the smatch scoring script with the C6.0 rule-based classifier to produce a human-readable report on the error patterns frequency observed in the scored AMR graphs. This first extension results in 4% gain over the state-of-art CAMR baseline parser by adding to it a manually crafted wrapper fixing the identified CAMR parser errors. The second extension combines a per-sentence smatch with an en-semble method for selecting the best AMR graph among the set of AMR graphs for the same sentence. This second modification au-tomatically yields further 0.4% gain when ap-plied to outputs of two nondeterministic…