Search results for "Thompson"
showing 10 items of 29 documents
Successive Reduction of Arms in Multi-Armed Bandits
2011
The relevance of the multi-armed bandit problem has risen in the past few years with the need for online optimization techniques in Internet systems, such as online advertisement and news article recommendation. At the same time, these applications reveal that state-of-the-art solution schemes do not scale well with the number of bandit arms. In this paper, we present two types of Successive Reduction (SR) strategies - 1) Successive Reduction Hoeffding (SRH) and 2) Successive Reduction Order Statistics (SRO). Both use an Order Statistics based Thompson Sampling method for arm selection, and then successively eliminate bandit arms from consideration based on a confidence threshold. While SRH…
Solving Non-Stationary Bandit Problems by Random Sampling from Sibling Kalman Filters
2010
Published version of an article from Lecture Notes in Computer Science. Also available at SpringerLink: http://dx.doi.org/10.1007/978-3-642-13033-5_21 The multi-armed bandit problem is a classical optimization problem where an agent sequentially pulls one of multiple arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus, one must balance between exploiting existing knowledge about the arms, and obtaining new information. Dynamically changing (non-stationary) bandit problems are particularly challenging because each change of the reward distributions may progressively degrade the performance of any fixed strategy. Alt…
Thompson Sampling Guided Stochastic Searching on the Line for Adversarial Learning
2015
The multi-armed bandit problem has been studied for decades. In brief, a gambler repeatedly pulls one out of N slot machine arms, randomly receiving a reward or a penalty from each pull. The aim of the gambler is to maximize the expected number of rewards received, when the probabilities of receiving rewards are unknown. Thus, the gambler must, as quickly as possible, identify the arm with the largest probability of producing rewards, compactly capturing the exploration-exploitation dilemma in reinforcement learning. In this paper we introduce a particularly challenging variant of the multi-armed bandit problem, inspired by the so-called N-Door Puzzle. In this variant, the gambler is only tol…
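For readers new to the setting described in this abstract, the following is a minimal sketch of standard Beta-Bernoulli Thompson Sampling on the basic reward/penalty bandit. It is not the N-Door variant introduced in the paper. The function name, the `true_probs` argument (used only to simulate pulls), and the uniform Beta(1, 1) prior are illustrative assumptions.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Beta-Bernoulli Thompson Sampling on simulated arms.

    true_probs is a hypothetical list of reward probabilities; it is used
    only to simulate the outcome of each pull, never to choose an arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # rewards observed per arm
    failures = [0] * n_arms   # penalties observed per arm
    pulls = [0] * n_arms      # how often each arm was chosen

    for _ in range(n_rounds):
        # Draw one sample from each arm's Beta posterior (uniform prior assumed).
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(n_arms)
        ]
        # Pull the arm whose sampled reward probability is largest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls, successes
```

Sampling from the posterior, rather than always picking the arm with the highest estimated mean, is what balances exploration and exploitation: uncertain arms occasionally produce large samples and get pulled, while well-explored poor arms rarely do.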
Properties of Design-Based Functional Principal Components Analysis.
2010
This work aims at performing Functional Principal Components Analysis (FPCA) with Horvitz-Thompson estimators when the observations are curves collected with survey sampling techniques. One important motivation for this study is that FPCA is a dimension reduction tool which is the first step to develop model assisted approaches that can take auxiliary information into account. FPCA relies on the estimation of the eigenelements of the covariance operator which can be seen as nonlinear functionals. Adapting to our functional context the linearization technique based on the influence function developed by Deville (1999), we prove that these estimators are asymptotically design unbiased and con…
Uniform convergence and asymptotic confidence bands for model-assisted estimators of the mean of sampled functional data
2013
When the study variable is functional and storage capacities are limited or transmission costs are high, selecting with survey sampling techniques a small fraction of the observations is an interesting alternative to signal compression techniques, particularly when the goal is the estimation of simple quantities such as means or totals. We extend, in this functional framework, model-assisted estimators with linear regression models that can take account of auxiliary variables whose totals over the population are known. We first show, under weak hypotheses on the sampling design and the regularity of the trajectories, that the estimator of the mean function as well as its variance estimator …
Design-based estimation for geometric quantiles with application to outlier detection
2010
Geometric quantiles are investigated using data collected from a complex survey. Geometric quantiles are an extension of univariate quantiles in a multivariate set-up that uses the geometry of multivariate data clouds. A very important application of geometric quantiles is the detection of outliers in multivariate data by means of quantile contours. A design-based estimator of geometric quantiles is constructed and used to compute quantile contours in order to detect outliers in both multivariate data and survey sampling set-ups. An algorithm for computing geometric quantile estimates is also developed. Under broad assumptions, the asymptotic variance of the quantile estimator is derived an…
Using Complex Surveys to Estimate the L1-Median of a Functional Variable: Application to Electricity Load Curves
2012
Mean profiles are widely used as indicators of the electricity consumption habits of customers. Currently, Électricité de France (EDF) estimates class load profiles by using the point-wise mean function. Unfortunately, it is well known that the mean is highly sensitive to the presence of outliers, such as one or more consumers with unusually high levels of consumption. In this paper, we propose an alternative to the mean profile: the L1-median profile, which is more robust. When dealing with large datasets of functional data (load curves for example), survey sampling approaches are useful for estimating the median profile and avoid storing all of the data. We propose here estimators of the median trajec…
Semiparametric Models with Functional Responses in a Model Assisted Survey Sampling Setting : Model Assisted Estimation of Electricity Consumption Cu…
2010
This work adopts a survey sampling point of view to estimate the mean curve of large databases of functional data. When storage capacities are limited, selecting, with survey techniques, a small fraction of the observations is an interesting alternative to signal compression techniques. We propose here to take account of real or multivariate auxiliary information available at a low cost for the whole population, with semiparametric model assisted approaches, in order to improve the accuracy of Horvitz-Thompson estimators of the mean curve. We first estimate the functional principal components with a design based point of view in order to reduce the dimension of the signals and then propose s…
Rojun perimmäinen tarkoitus : säilyttämisen ja poisheittämisen ristiriita
1999
Survey sampling for functionnal data : building asymptotic confidence bands and considering auxiliary information
2011
When collections of functional data are too large to be exhaustively observed, survey sampling techniques provide an effective way to estimate global quantities such as the population mean function, without being obligated to store all the data. In this thesis, we propose a Horvitz–Thompson estimator of the mean trajectory, and with additional assumptions on the sampling design, we state a functional Central Limit Theorem and deduce asymptotic confidence bands. For a fixed sample size, we show that stratified sampling can greatly improve the estimation compared to simple random sampling. In addition, we extend Neyman’s rule of optimal allocation to the functional context. Taking into accoun…
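As a rough illustration of the Horvitz–Thompson estimator of a mean trajectory named in this abstract, the sketch below weights each sampled curve by the inverse of its inclusion probability and divides by the population size. It assumes curves observed on a common grid of time points and known first-order inclusion probabilities; the function and parameter names are hypothetical and are not taken from the thesis.

```python
def horvitz_thompson_mean_curve(sample_curves, inclusion_probs, population_size):
    """Estimate the population mean curve from a probability sample.

    sample_curves: list of curves, each a list of values on a shared time grid.
    inclusion_probs: first-order inclusion probability of each sampled unit.
    population_size: N, the number of curves in the whole population.
    """
    n_points = len(sample_curves[0])
    weighted_sum = [0.0] * n_points
    for curve, pi in zip(sample_curves, inclusion_probs):
        if not 0 < pi <= 1:
            raise ValueError("inclusion probabilities must lie in (0, 1]")
        # Each sampled unit stands in for 1 / pi units of the population.
        for j, y in enumerate(curve):
            weighted_sum[j] += y / pi
    return [total / population_size for total in weighted_sum]
```

For example, under simple random sampling of 2 curves out of 4, every inclusion probability is 0.5, and the estimator reduces to the ordinary sample mean at each time point.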