Search results for "Reinforcement learning"
showing 10 items of 95 documents
Breaking adiabatic quantum control with deep learning
2020
In the era of digital quantum computing, optimal digitized pulses are requisite for efficient quantum control. This goal is translated into dynamic programming, in which a deep reinforcement learning (DRL) agent is gifted. As a reference, shortcuts to adiabaticity (STA) provide analytical approaches to adiabatic speed up by pulse control. Here, we select single-component control of qubits, resembling the ubiquitous two-level Landau-Zener problem for gate operation. We aim at obtaining fast and robust digital pulses by combining STA and DRL algorithm. In particular, we find that DRL leads to robust digital quantum control with operation time bounded by quantum speed limits dictated by STA. I…
Pleasurable music affects reinforcement learning according to the listener.
2013
Mounting evidence links the enjoyment of music to brain areas implicated in emotion and the dopaminergic reward system. In particular, dopamine release in the ventral striatum seems to play a major role in the rewarding aspect of music listening. Striatal dopamine also influences reinforcement learning, such that subjects with greater dopamine efficacy learn better to approach rewards while those with lesser dopamine efficacy learn better to avoid punishments. In this study, we explored the practical implications of musical pleasure through its ability to facilitate reinforcement learning via non-pharmacological dopamine elicitation. Subjects from a wide variety of musical backgrounds chose…
Aberrant probabilistic reinforcement learning in first-degree relatives of individuals with bipolar disorder
2020
Background: Motivational dysregulation represents a core vulnerability factor for bipolar disorder. Whether this also comprises aberrant learning of stimulus-reinforcer contingencies is less clear. Methods: To answer this question, we compared healthy first-degree relatives of individuals with bipolar disorder (n = 42) known to convey an increased risk of developing a bipolar spectrum disorder and healthy individuals (n = 97). Further, we investigated the effects of the behavioral activation system (BAS) on reinforcement learning across the entire sample. All participants were assessed with a probabilistic learning task t…
Use of Reinforcement Learning in Two Real Applications
2008
In this paper, we present two successful applications of Reinforcement Learning (RL) in real life. First, the optimization of anemia management in patients undergoing Chronic Renal Failure is presented. The aim is to individualize the treatment (Erythropoietin dosages) in order to stabilize patients within a targeted range of Hemoglobin (Hb). Results show that the use of RL increases the ratio of patients within the desired range of Hb. Thus, patients' quality of life is increased, and additionally, Health Care System reduces its expenses in anemia management. Second, RL is applied to modify a marketing campaign in order to maximize long-term profits. RL obtains an individualized policy depe…
A Deep Reinforcement Learning scheme for Battery Energy Management
2020
Deep reinforcement learning is considered promising for many energy cost optimization tasks in smart buildings. However, agent learning, in this context, is sometimes unstable and unpredictable, especially when the environments are complex. In this paper, we examine deep Reinforcement Learning (RL) algorithms developed for game play applied to a battery control task with an energy cost optimization objective. We explore how agent behavior and hyperparameters can be analyzed in a simplified environment with the goal of modifying the algorithms for increased stability. Our modified Deep Deterministic Policy Gradient (DDPG) agent is able to perform consistently close to the optimum over multi…
Notice of Violation of IEEE Publication Principles: Reinforcement learning for P2P searching
2005
For a peer-to-peer (P2P) system holding a massive amount of data, an efficient and scalable search for resource sharing is a key determinant to its practical usage. Unstructured P2P networks avoid the limitations of centralized systems and the drawbacks of a highly structured approach, because they impose few constraints on topology and data placement, and they support highly versatile search mechanisms. However their search algorithms are usually based on simple flooding schemes, showing severe inefficiencies. In this paper, to address this major limitation, we propose and evaluate the adoption of a local adaptive routing protocol. The routing algorithm adopts a simple reinforcement learni…
Discretized Bayesian Pursuit – A New Scheme for Reinforcement Learning
2012
Published version of a chapter in the book: Advanced Research in Applied Artificial Intelligence. Also available from the publisher at: http://dx.doi.org/10.1007/978-3-642-31087-4_79 The success of Learning Automata (LA)-based estimator algorithms over the classical, Linear Reward-Inaction (L_RI)-like schemes, can be explained by their ability to pursue the actions with the highest reward probability estimates. Without access to reward probability estimates, it makes sense for schemes like the L_RI to first make large exploring steps, and then to gradually turn exploration into exploitation by making progressively smaller learning steps. However, this behavior becomes counter-intuitive wh…
A Learning Automata Local Contribution Sampling Applied to Hydropower Production Optimisation
2017
Learning Automata (LA) is a powerful approach for solving complex, non-linear and stochastic optimisation problems. However, existing solutions struggle with high-dimensional problems due to slow convergence, arguably caused by the global nature of feedback. In this paper we introduce a novel Learning Automata (LA) scheme to attack this challenge. The scheme is based on a parallel form of Local Contribution Sampling (LCS), which means that the LA receive individually directed feedback, designed to speed up convergence. Furthermore, our scheme is highly decentralized, allowing parallel execution on GPU architectures. To demonstrate the power of our scheme, the LA LCS is applied to hydropower…
Thompson Sampling Guided Stochastic Searching on the Line for Adversarial Learning
2015
The multi-armed bandit problem has been studied for decades. In brief, a gambler repeatedly pulls one out of N slot machine arms, randomly receiving a reward or a penalty from each pull. The aim of the gambler is to maximize the expected number of rewards received, when the probabilities of receiving rewards are unknown. Thus, the gambler must, as quickly as possible, identify the arm with the largest probability of producing rewards, compactly capturing the exploration-exploitation dilemma in reinforcement learning. In this paper we introduce a particularly challenging variant of the multi-armed bandit problem, inspired by the so-called N-Door Puzzle. In this variant, the gambler is only tol…
A comparison between a two feedback control loop and a reinforcement learning algorithm for compliant low-cost series elastic actuators
2020
Highly-compliant elastic actuators have become progressively prominent over the last years for a variety of robotic applications. With remarkable shock tolerance, elastic actuators are appropriate for robots operating in unstructured environments. In accordance with this trend, a novel elastic actuator was recently designed by our research group for Serpens, a low-cost, open-source and highly-compliant multi-purpose modular snake robot. To control the newly designed elastic actuators of Serpens, a two-feedback loops position control algorithm was proposed. The inner controller loop is implemented as a model reference adaptive controller (MRAC), while the outer control loop adopts a fuzzy pr…