Search results for "Reinforcement learning"

showing 10 items of 95 documents

Breaking adiabatic quantum control with deep learning

2020

In the era of digital quantum computing, optimal digitized pulses are requisite for efficient quantum control. This goal is translated into dynamic programming, in which a deep reinforcement learning (DRL) agent is gifted. As a reference, shortcuts to adiabaticity (STA) provide analytical approaches to adiabatic speed up by pulse control. Here, we select single-component control of qubits, resembling the ubiquitous two-level Landau-Zener problem for gate operation. We aim at obtaining fast and robust digital pulses by combining STA and DRL algorithm. In particular, we find that DRL leads to robust digital quantum control with operation time bounded by quantum speed limits dictated by STA. I…

Physics; Quantum Physics; Speedup; business.industry; Deep learning; FOS: Physical sciences; 01 natural sciences; 010305 fluids & plasmas; Robustness (computer science); Qubit; 0103 physical sciences; Reinforcement learning; Artificial intelligence; 010306 general physics; business; Adiabatic process; Quantum Physics (quant-ph); Quantum; Algorithm; Physical Review A

researchProduct

Pleasurable music affects reinforcement learning according to the listener.

2013

Mounting evidence links the enjoyment of music to brain areas implicated in emotion and the dopaminergic reward system. In particular, dopamine release in the ventral striatum seems to play a major role in the rewarding aspect of music listening. Striatal dopamine also influences reinforcement learning, such that subjects with greater dopamine efficacy learn better to approach rewards while those with lesser dopamine efficacy learn better to avoid punishments. In this study, we explored the practical implications of musical pleasure through its ability to facilitate reinforcement learning via non-pharmacological dopamine elicitation. Subjects from a wide variety of musical backgrounds chose…

Pleasure; Dopamine; Affective neuroscience; EVERYDAY LIFE; 0302 clinical medicine; PARKINSONS-DISEASE; Reinforcement learning; DOPAMINE RELEASE; subjectivity; Reinforcement learning; Psychology; BRAIN-REGIONS; Original Research Article; General Psychology; reward; media_common; CORRELATE; Music psychology; 05 social sciences; humanities; dopamine; Psychology; psychological phenomena and processes; Cognitive psychology; PREDICT INDIVIDUAL-DIFFERENCES; reinforcement learning; Music therapy; media_common.quotation_subject; lcsh:BF1-990; pleasure; behavioral disciplines and activities; 050105 experimental psychology; MECHANISMS; Pleasure; 03 medical and health sciences; Reward system; Reward; EMOTION; 0501 psychology and cognitive sciences; Active listening; music; musical experience; Listening strategy; Subjectivity; lcsh:Psychology; Music and emotion; human activities; Music; 030217 neurology & neurosurgery; RESPONSES; Musical experience; listening strategy; Frontiers in psychology

researchProduct

Aberrant probabilistic reinforcement learning in first-degree relatives of individuals with bipolar disorder

2020

Contains fulltext: 215845.pdf (Publisher’s version) (Closed access) Background: Motivational dysregulation represents a core vulnerability factor for bipolar disorder. Whether this also comprises aberrant learning of stimulus-reinforcer contingencies is less clear. Methods: To answer this question, we compared healthy first-degree relatives of individuals with bipolar disorder (n = 42) known to convey an increased risk of developing a bipolar spectrum disorder and healthy individuals (n = 97). Further, we investigated the effects of the behavioral activation system (BAS) on reinforcement learning across the entire sample. All participants were assessed with a probabilistic learning task t…

Proband; Bipolar Disorder; education; 03 medical and health sciences; 0302 clinical medicine; Reward; Negative feedback; medicine; Reinforcement learning; Humans; Spectrum disorder; Bipolar disorder; 111 000 Intention & Action; First-degree relatives; Motivation; Action intention and motor control; Behavioral activation; medicine.disease; 030227 psychiatry; Substance abuse; Psychiatry and Mental health; Clinical Psychology; Attention Deficit Disorder with Hyperactivity; Psychology; Reinforcement Psychology; 030217 neurology & neurosurgery; Clinical psychology

researchProduct

Use of Reinforcement Learning in Two Real Applications

2008

In this paper, we present two successful applications of Reinforcement Learning (RL) in real life. First, the optimization of anemia management in patients undergoing Chronic Renal Failure is presented. The aim is to individualize the treatment (Erythropoietin dosages) in order to stabilize patients within a targeted range of Hemoglobin (Hb). Results show that the use of RL increases the ratio of patients within the desired range of Hb. Thus, patients' quality of life is increased, and additionally, Health Care System reduces its expenses in anemia management. Second, RL is applied to modify a marketing campaign in order to maximize long-term profits. RL obtains an individualized policy depe…

Range (mathematics); Quality of life (healthcare); business.industry; Computer science; Order (business); Robustness (computer science); Health care; Reinforcement learning; In patient; Operations management; Marketing campaign; business; Simulation

researchProduct

A Deep Reinforcement Learning scheme for Battery Energy Management

2020

Deep reinforcement learning is considered promising for many energy cost optimization tasks in smart buildings. However, agent learning, in this context, is sometimes unstable and unpredictable, especially when the environments are complex. In this paper, we examine deep Reinforcement Learning (RL) algorithms developed for game play applied to a battery control task with an energy cost optimization objective. We explore how agent behavior and hyperparameters can be analyzed in a simplified environment with the goal of modifying the algorithms for increased stability. Our modified Deep Deterministic Policy Gradient (DDPG) agent is able to perform consistently close to the optimum over multi…

Reduction (complexity); Task (computing); Mathematical optimization; Artificial neural network; Computer science; business.industry; Deep learning; Stability (learning theory); Reinforcement learning; Context (language use); Artificial intelligence; business; Average cost; 2020 5th International Conference on Smart and Sustainable Technologies (SpliTech)

researchProduct

Notice of Violation of IEEE Publication Principles: Reinforcement learning for P2P searching

2005

For a peer-to-peer (P2P) system holding a massive amount of data, an efficient and scalable search for resource sharing is a key determinant to its practical usage. Unstructured P2P networks avoid the limitations of centralized systems and the drawbacks of a highly structured approach, because they impose few constraints on topology and data placement, and they support highly versatile search mechanisms. However their search algorithms are usually based on simple flooding schemes, showing severe inefficiencies. In this paper, to address this major limitation, we propose and evaluate the adoption of a local adaptive routing protocol. The routing algorithm adopts a simple reinforcement learni…

Routing protocol; Small-world network; Computer science; Search algorithm; business.industry; Distributed computing; Scalability; Reinforcement learning; business; Network topology; Computer network; Shared resource; Flooding (computer networking); Seventh International Workshop on Computer Architecture for Machine Perception (CAMP'05)

researchProduct

Discretized Bayesian Pursuit – A New Scheme for Reinforcement Learning

2012

Published version of a chapter in the book: Advanced Research in Applied Artificial Intelligence. Also available from the publisher at: http://dx.doi.org/10.1007/978-3-642-31087-4_79 The success of Learning Automata (LA)-based estimator algorithms over the classical, Linear Reward-Inaction (L_RI)-like schemes, can be explained by their ability to pursue the actions with the highest reward probability estimates. Without access to reward probability estimates, it makes sense for schemes like the L_RI to first make large exploring steps, and then to gradually turn exploration into exploitation by making progressively smaller learning steps. However, this behavior becomes counter-intuitive wh…

Scheme (programming language); Mathematical optimization; Discretization; Learning automata; Computer science; business.industry; VDP::Mathematics and natural science: 400::Information and communication science: 420::Algorithms and computability theory: 422; estimator algorithms; Bayesian probability; Bayesian reasoning; learning automata; Estimator; VDP::Technology: 500::Information and communication technology: 550; discretized learning; Bayesian inference; Action (physics); Reinforcement learning; Artificial intelligence; pursuit schemes; business; computer; computer.programming_language

researchProduct
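
The Linear Reward-Inaction (L_RI) scheme named in the abstract above can be illustrated with a minimal sketch. This shows only the classical textbook update, not the Discretized Bayesian Pursuit scheme the chapter proposes; the function name `lri_update` and the `rate` parameter are hypothetical names chosen for illustration.

```python
def lri_update(probs, chosen, rewarded, rate=0.1):
    """Return the action probabilities after one L_RI step.

    probs    -- current action probabilities (should sum to 1)
    chosen   -- index of the action that was just taken
    rewarded -- True if the environment rewarded the action
    rate     -- learning rate in (0, 1); an illustrative assumption
    """
    if not rewarded:
        # "Inaction": a penalty leaves the probabilities unchanged.
        return list(probs)
    # "Reward": move the chosen action toward 1 and shrink the rest,
    # which keeps the vector summing to 1.
    return [
        p + rate * (1.0 - p) if i == chosen else (1.0 - rate) * p
        for i, p in enumerate(probs)
    ]
```

Because each step changes a probability by an amount proportional to its distance from the target, the steps shrink as the automaton converges, which matches the explore-then-exploit behavior the abstract describes.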

A Learning Automata Local Contribution Sampling Applied to Hydropower Production Optimisation

2017

Learning Automata (LA) is a powerful approach for solving complex, non-linear and stochastic optimisation problems. However, existing solutions struggle with high-dimensional problems due to slow convergence, arguably caused by the global nature of feedback. In this paper we introduce a novel Learning Automata (LA) scheme to attack this challenge. The scheme is based on a parallel form of Local Contribution Sampling (LCS), which means that the LA receive individually directed feedback, designed to speed up convergence. Furthermore, our scheme is highly decentralized, allowing parallel execution on GPU architectures. To demonstrate the power of our scheme, the LA LCS is applied to hydropower…

Scheme (programming language); Mathematical optimization; Engineering; Speedup; Learning automata; business.industry; Sampling (statistics); Machine learning; computer.software_genre; Power (physics); Range (mathematics); Convergence (routing); Reinforcement learning; Artificial intelligence; business; computer; computer.programming_language

researchProduct

Thompson Sampling Guided Stochastic Searching on the Line for Adversarial Learning

2015

The multi-armed bandit problem has been studied for decades. In brief, a gambler repeatedly pulls one out of N slot machine arms, randomly receiving a reward or a penalty from each pull. The aim of the gambler is to maximize the expected number of rewards received, when the probabilities of receiving rewards are unknown. Thus, the gambler must, as quickly as possible, identify the arm with the largest probability of producing rewards, compactly capturing the exploration-exploitation dilemma in reinforcement learning. In this paper we introduce a particularly challenging variant of the multi-armed bandit problem, inspired by the so-called N-Door Puzzle. In this variant, the gambler is only tol…

Scheme (programming language); business.industry; Computer science; Bayesian probability; Bayesian inference; Multi-armed bandit; Line (geometry); Reinforcement learning; Artificial intelligence; Representation (mathematics); business; Thompson sampling; computer; computer.programming_language

researchProduct
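
The basic multi-armed bandit setting described in the abstract above can be sketched with standard Beta-Bernoulli Thompson sampling. This is a minimal illustration of the plain N-armed problem, not the N-Door variant or the search scheme the paper introduces; the function name `thompson_sampling` and its parameters are hypothetical.

```python
import random


def thompson_sampling(reward_probs, pulls, seed=0):
    """Simulate Thompson sampling on a Bernoulli bandit.

    reward_probs -- true reward probability of each arm (hidden from the gambler)
    pulls        -- number of arm pulls to simulate
    Returns how many times each arm was pulled.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    wins = [1] * n    # Beta alpha per arm, starting from a uniform Beta(1, 1) prior
    losses = [1] * n  # Beta beta per arm
    counts = [0] * n
    for _ in range(pulls):
        # Draw one plausible reward probability per arm from its posterior.
        samples = [rng.betavariate(wins[i], losses[i]) for i in range(n)]
        # Pull the arm whose sampled probability is largest.
        arm = max(range(n), key=samples.__getitem__)
        counts[arm] += 1
        # Observe a reward or a penalty and update that arm's posterior.
        if rng.random() < reward_probs[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return counts
```

For example, `thompson_sampling([0.2, 0.5, 0.8], 2000)` returns one pull count per arm. Early pulls are spread across the arms while the posteriors are wide (exploration), and later pulls go mostly to the arm with the highest estimated reward probability (exploitation).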

A comparison between a two feedback control loop and a reinforcement learning algorithm for compliant low-cost series elastic actuators

2020

Highly-compliant elastic actuators have become progressively prominent over the last years for a variety of robotic applications. With remarkable shock tolerance, elastic actuators are appropriate for robots operating in unstructured environments. In accordance with this trend, a novel elastic actuator was recently designed by our research group for Serpens, a low-cost, open-source and highly-compliant multi-purpose modular snake robot. To control the newly designed elastic actuators of Serpens, a two-feedback loops position control algorithm was proposed. The inner controller loop is implemented as a model reference adaptive controller (MRAC), while the outer control loop adopts a fuzzy pr…

Series (mathematics); Computer science; business.industry; Feedback control; Robotics; Loop (topology); Computer Science::Robotics; VDP::Teknologi: 500; Control theory; Reinforcement learning; Artificial intelligence; Reinforcement learning algorithm; Actuator; business

researchProduct