Search results for "Bandit"

showing 10 items of 18 documents

Simple learning rules to cope with changing environments

2008

10 pages; International audience; We consider an agent that must choose repeatedly among several actions. Each action has a certain probability of giving the agent an energy reward, and costs may be associated with switching between actions. The agent does not know which action has the highest reward probability, and the probabilities change randomly over time. We study two learning rules that have been widely used to model decision-making processes in animals: one deterministic and the other stochastic. In particular, we examine the influence of the rules' 'learning rate' on the agent's energy gain. We compare the performance of each rule with the best performance attainable when the agent …

0106 biological sciences; Error-driven learning; Exploit; Computer science; Energy (esotericism); Biomedical Engineering; Biophysics; Bioengineering; animal behavior; 010603 evolutionary biology; 01 natural sciences; Biochemistry; Multi-armed bandit; Models Biological; decision making; Biomaterials; 03 medical and health sciences; [ INFO.INFO-BI ] Computer Science [cs]/Bioinformatics [q-bio.QM]; [ SDV.EE.IEO ] Life Sciences [q-bio]/Ecology environment/Symbiosis; Animals; Learning; Computer Simulation; [ SDV.BIBS ] Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; multi-armed bandit; Ecosystem; 030304 developmental biology; Simple (philosophy); 0303 health sciences; [ SDE.BE ] Environmental Sciences/Biodiversity and Ecology; business.industry; dynamic environments; learning rules; decision-making; [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; Unlimited period; Range (mathematics); Action (philosophy); Artificial intelligence; [SDE.BE]Environmental Sciences/Biodiversity and Ecology; [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; business; Biotechnology; Research Article; [SDV.EE.IEO]Life Sciences [q-bio]/Ecology environment/Symbiosis

researchProduct

Thompson Sampling Guided Stochastic Searching on the Line for Non-stationary Adversarial Learning

2015

This paper reports the first known solution to the N-Door puzzle when the environment is both non-stationary and deceptive (adversarial learning). The Multi-Armed-Bandit (MAB) problem is the iconic representation of the exploration versus exploitation dilemma. In brief, a gambler repeatedly selects and plays one out of N possible slot machines, or arms, and either receives a reward or a penalty. The objective of the gambler is then to locate the most rewarding arm to play while maximizing his winnings in the process. In this paper we investigate a challenging variant of the MAB problem, namely the non-stationary N-Door puzzle. Here, instead of directly observing the reward, the gambler is only…

Adversarial system; Computer science; Property (programming); business.industry; Process (computing); Reinforcement learning; Artificial intelligence; business; Representation (mathematics); Bayesian inference; Multi-armed bandit; Thompson sampling; 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)

researchProduct

Nemo teneatur ad impossibile. Las consecuencias de la pragmática para la extirpación del bandolerismo valenciano: cláusulas relativas a la punición d…

2014

Obsessed with curbing the problem of banditry at any cost, in mid-1586 Viceroy Aytona published in Valencia a pragmatic sanction that placed the primary responsibility for fighting crime on the lords of places and the municipal authorities. Given the substantial fines imposed on them under it during the 18 years the rule was in force, in particular for failing to comply with the clauses concerning the investigation and punishment of homicides, it can be concluded that the Crown found in it a powerful instrument for compelling the lords and the local oligarchies to collaborate more closely in the arduous task of ensuring the…

DP1-402; Local Government; Seigneury; Religious studies; History of Spain; Justicia Criminal; Clash of Jurisdictions; Historia Moderna; Modern history 1453-; Bandolerismo; Señorío; Banditry; D204-475; Municipio; Homicidio; Conflicto de Jurisdicción; Criminal Justice; Derecho Penal; Homicide; Procurador Fiscal; Prosecution; Penal Law; Revista de Historia Moderna

researchProduct

Near surface seismostratigraphic modelling of the Bandita plain in Palermo town (Italy) from integrated analysis of HVSR and stratigraphic data

2016

The Horizontal to Vertical Spectral Ratio (HVSR) noise method (Nakamura, 1989) is now widely used to estimate the resonance frequencies of geological structures (Bonnefoy-Claudet, 2006). However, HVSR is often also used to obtain information on the depth of the seismic bedrock and on the thickness and seismic velocity of the process overburden deposits, using inversion techniques of the H/V curve (Fäh et al., 2003). This nevertheless produces results with large uncertainty intervals for the parameters, which must therefore be constrained by detailed stratigraphic information. An application of HVSR inversion is presented in order to verify the effectiveness of this technique for purposes of…

HVSR; Settore GEO/02 - Geologia Stratigrafica E Sedimentologica; Settore GEO/11 - Geofisica Applicata; Bandita; Seismostratigraphic modeling; Palermo

researchProduct

Pliegos poéticos de bandoleros en la Cataluña del barroco. Un ejemplo de literatura propagandística

2018

This study analyses the popular literature on bandits in seventeenth-century Catalonia. Banditry was an important phenomenon in the Principality during the first decades of the Baroque, and its repercussions notably reached the chapbook literature (literatura de cordell) of the period. These chapbooks, which were meant to reach all audiences, have been the subject of few studies considering their importance. For that reason, the article analyses the two principal historical moments that led to the greatest surge in the publication of verse chapbooks about bandits in Catalonia, namely the signing of the unions and brotherhoods (agermanaments) against the bandits (1606) and the repressive system of…

History; Bandolers; Bandolerisme; bandolers; literatura de canya; cordell; Catalunya; literatura popular; Catalonia; Chapbooks; Bandit; Catalunya; bandolerismo; bandolero; literatura de cordel; Cataluña; literatura popular; Cordell; Bandolero; Bandolerisme; Bandolerismo; Banditry; Literatura de cordel; Literatura popular; Cataluña; banditry; bandit; Chapbooks; Catalonia; popular literature; Popular literature; Literatura de canya; Manuscrits: revista d'història moderna

researchProduct

An AI for dominion based on Monte-Carlo methods

2014

Master's thesis in Information and Communication Technology, IKT590, Universitetet i Agder, 2014. To the best of our knowledge, there exists no Artificial Intelligence (AI) for Dominion that uses Monte Carlo methods and is competitive on a human level. This thesis presents such an AI and tests it against some of the top Dominion strategies available. Although the testing environment is limited, the results show that our AI is capable of competing with human players while keeping processing time per move at a level acceptable to human players. Although the approach for our AI is built on previous knowledge about Upper Confidence Bounds (UCB) and UCB applied to Trees (UCT), an approach for handling the st…

IKT590; VDP::Technology: 500::Information and communication technology: 550; Dominion; UCT; UCB; AI; Multi-Armed Bandit Problem; Monte-Carlo; Tree Search

researchProduct

Perfiles básicos del bandolerismo morisco valenciano: del desarme a la expulsión (1563-1609)

2009

Against the image of the problem of Valencian Morisco banditry bequeathed by Sebastián García Martínez, these pages set out an alternative view, based on the use of two main sources: the account books of the Maestre Racional and the criminal conclusions of the Real Audiencia. Two fundamental aspects are reviewed: the geography of the phenomenon, based on the distinction between the outlaws' places of origin and the settings where they committed their crimes, and its evolution from 1563 to 1609, a period over which several phases can be distinguished, both from the perspective of criminal activity and from that of the repressive energy…

Justicia criminal; Pena capital; media_common.quotation_subject; Violence; Penal justice; Capital punishment; Bandolerismo; Minorías; D204-475; Social unrest; media_common; Minorities; DP1-402; Religious studies; History of Spain; Art; Historia Moderna; Conflictividad social; Modern history 1453-; Banditry; Criminal law; Moriscos; Violencia; Derecho penal; Humanities; Cartography

researchProduct

Alocucion de ... Pio ... VI tenida en el Consistorio secreto dia 13 de Noviembre 1775 de la preciosa muerte de Jacinto Castañeda español i Vicente …

1775

Copies circulate without a title page, beginning with an introductory letter by Fr. Francisco Ruiz, Provincial of the Province of Aragon, and including, on the verso of p. 11, an oration by the said Provincial. Woodcut coat of arms of Pius VI on the two facing title pages. Signatures: [ ]8. Footnotes. Catchwords. Double title page, in Latin and in Spanish, and bilingual text in two columns: Latin and Spanish.

Paz Vicente de la (O.P.) Obres anteriors al 1800; Boxadors Juan Tomás de (m.1780) obres anteriors a 1800; Castañeda y Pujassons Jacinto (1743-1773) Obres anteriors a 1800; Paz Vicente de la (O.P.) Obres anteriors a 1800; Boncompagni Ludovisi Ignazio (1743-1799) Obres anteriors a 1800; Oracions fúnebres 1775 Obres anteriors a 1800; Banditi Francesco Maria (1705-1796) Obres anteriors a 1800; Boxadors Juan Tomás de m.1780 Obres anteriors al 1800; Boncompagni Ludovisi Ignazio 1743-1799 Obres anteriors al 1800; Castañeda y Pujassons Jacinto 1743-1773 Obres anteriors al 1800; Martirs dominicans Tonquín (Vietnam) Obres anteriors al 1800; Oracions fúnebres 1775 Obres anteriors al 1800; Martirs dominicans Tonquín (Vietnam) Obres anteriors a 1800; Banditi Francesco Maria 1705-1796 Obres anteriors al 1800

researchProduct

Solving Non-Stationary Bandit Problems by Random Sampling from Sibling Kalman Filters

2010

Published version of an article from Lecture Notes in Computer Science. Also available at SpringerLink: http://dx.doi.org/10.1007/978-3-642-13033-5_21 The multi-armed bandit problem is a classical optimization problem where an agent sequentially pulls one of multiple arms attached to a gambling machine, with each pull resulting in a random reward. The reward distributions are unknown, and thus, one must balance between exploiting existing knowledge about the arms, and obtaining new information. Dynamically changing (non-stationary) bandit problems are particularly challenging because each change of the reward distributions may progressively degrade the performance of any fixed strategy. Alt…

Scheme (programming language); Mathematical optimization; Optimization problem; Computer science; Bayesian probability; VDP::Technology: 500::Information and communication technology: 550; Kalman filter; Bayesian inference; Multi-armed bandit; VDP::Mathematics and natural science: 400::Information and communication science: 420::Knowledge based systems: 425; computer; Thompson sampling; Optimal decision; computer.programming_language

researchProduct

Thompson Sampling Guided Stochastic Searching on the Line for Adversarial Learning

2015

The multi-armed bandit problem has been studied for decades. In brief, a gambler repeatedly pulls one out of N slot machine arms, randomly receiving a reward or a penalty from each pull. The aim of the gambler is to maximize the expected number of rewards received, when the probabilities of receiving rewards are unknown. Thus, the gambler must, as quickly as possible, identify the arm with the largest probability of producing rewards, compactly capturing the exploration-exploitation dilemma in reinforcement learning. In this paper we introduce a particularly challenging variant of the multi-armed bandit problem, inspired by the so-called N-Door Puzzle. In this variant, the gambler is only tol…

Scheme (programming language); business.industry; Computer science; Bayesian probability; Bayesian inference; Multi-armed bandit; Line (geometry); Reinforcement learning; Artificial intelligence; Representation (mathematics); business; Thompson sampling; computer; computer.programming_language

researchProduct