RESEARCH PRODUCT

The Bayesian Learning Automaton — Empirical Evaluation with Two-Armed Bernoulli Bandit Problems

Ole-Christoffer Granmo

subject

Balance (metaphysics); Optimization problem; Wake-sleep algorithm; business.industry; Bayesian inference; Machine learning; computer.software_genre; Automaton; Bernoulli's principle; Artificial intelligence; business; Beta distribution; computer; Mathematics

description

The two-armed Bernoulli bandit (TABB) problem is a classical optimization problem in which an agent sequentially pulls one of two arms attached to a gambling machine, with each pull resulting in either a reward or a penalty. The reward probability of each arm is unknown, so the agent must balance exploiting its existing knowledge about the arms against obtaining new information.
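
The exploration-exploitation trade-off described above can be illustrated with a small simulation. The description does not spell out the learning scheme, so the following is only a minimal sketch under stated assumptions: each arm's unknown reward probability is tracked with a Beta posterior starting from a uniform Beta(1, 1) prior, and the arm whose posterior sample is largest is pulled. The function name `run_bandit`, its parameters, and the prior are hypothetical choices made for illustration, not the author's implementation.

```python
import random


def run_bandit(reward_probs, pulls, seed=0):
    """Simulate a Bernoulli bandit played by Beta-posterior sampling.

    Assumption (not taken from the source): Beta(1, 1) priors and a
    "pull the arm with the largest posterior sample" selection rule.
    Returns (pull counts per arm, [alpha, beta] parameters per arm).
    """
    rng = random.Random(seed)
    # One [alpha, beta] pair per arm; Beta(1, 1) is the uniform prior.
    params = [[1, 1] for _ in reward_probs]
    counts = [0] * len(reward_probs)

    for _ in range(pulls):
        # Draw one plausible reward probability per arm from its posterior.
        samples = [rng.betavariate(a, b) for a, b in params]
        arm = samples.index(max(samples))

        # Pull the chosen arm: reward with the arm's true probability.
        rewarded = rng.random() < reward_probs[arm]

        # Update the posterior: rewards raise alpha, penalties raise beta.
        if rewarded:
            params[arm][0] += 1
        else:
            params[arm][1] += 1
        counts[arm] += 1

    return counts, params


if __name__ == "__main__":
    counts, params = run_bandit([0.9, 0.1], pulls=2000, seed=1)
    print("pulls per arm:", counts)
    print("posterior parameters:", params)
```

Sampling from the posterior, rather than always pulling the arm with the highest estimated mean, is what provides exploration in this sketch: an arm with few observations has a wide posterior and is still occasionally selected, while an arm with many penalties is pulled less and less often.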

https://doi.org/10.1007/978-1-84882-171-2_17