Search results for "Reinforcement"

showing 10 items of 230 documents

Warm elaborated feedback. Exploring its benefits on post-feedback behaviour

2019

This study provides evidence on the impact of including warm messages in elaborated feedback. These messages are aimed at the motivational process that can be mobilised by feedback and that which c...

Process (engineering); Human–computer interaction; 05 social sciences; Developmental and Educational Psychology; 050301 education; Computer-Assisted Instruction; 0501 psychology and cognitive sciences; Experimental and Cognitive Psychology; Reinforcement; Psychology; 0503 education; 050104 developmental & child psychology; Education; Educational Psychology

researchProduct

Management of Spinal Bone Metastases With Radiofrequency Ablation, Vertebral Reinforcement and Transpedicular Fixation: A Retrospective Single-Center…

2022

The spine is a frequent site of bone metastases, with a median survival time of 8.5 months after diagnosis. In most cases treatment is only palliative. Several advanced techniques can ensure a better Quality of Life (QoL) and increase life expectancy. Radiofrequency ablation (RFA) uses alternating current to produce local heating and necrosis of the spinal lesion, preserving the healthy bone. RFA is supported by vertebral reinforcement through kyphoplasty and vertebroplasty in order to stabilize the fracture with polymethylmethacrylate (PMMA) injection, restoring vertebral body height and reducing the weakness of healthy bone. The aim of this study is to demonstrate the efficacy and advantages of …

RFA; Cancer Research; vertebral reinforcement; Oncology; Settore MED/27 - Neurochirurgia; Neoplasms. Tumors. Oncology. Including cancer and carcinogens; spinal fixation; PMMA; RC254-282; Original Research; spinal metastases; Frontiers in oncology

researchProduct

Use of Reinforcement Learning in Two Real Applications

2008

In this paper, we present two successful applications of Reinforcement Learning (RL) in real life. First, the optimization of anemia management in patients with Chronic Renal Failure is presented. The aim is to individualize the treatment (Erythropoietin dosages) in order to stabilize patients within a targeted range of Hemoglobin (Hb). Results show that the use of RL increases the ratio of patients within the desired range of Hb. Thus, patients' quality of life is increased and, additionally, the Health Care System reduces its expenses in anemia management. Second, RL is applied to modify a marketing campaign in order to maximize long-term profits. RL obtains an individualized policy depe…

Range (mathematics); Quality of life (healthcare); business.industry; Computer science; Order (business); Robustness (computer science); Health care; Reinforcement learning; In patient; Operations management; Marketing campaign; business; Simulation

researchProduct

A Deep Reinforcement Learning scheme for Battery Energy Management

2020

Deep reinforcement learning is considered promising for many energy cost optimization tasks in smart buildings. However, agent learning, in this context, is sometimes unstable and unpredictable, especially when the environments are complex. In this paper, we examine deep Reinforcement Learning (RL) algorithms developed for game play applied to a battery control task with an energy cost optimization objective. We explore how agent behavior and hyperparameters can be analyzed in a simplified environment with the goal of modifying the algorithms for increased stability. Our modified Deep Deterministic Policy Gradient (DDPG) agent is able to perform consistently close to the optimum over multi…

Reduction (complexity); Task (computing); Mathematical optimization; Artificial neural network; Computer science; business.industry; Deep learning; Stability (learning theory); Reinforcement learning; Context (language use); Artificial intelligence; business; Average cost; 2020 5th International Conference on Smart and Sustainable Technologies (SpliTech)

researchProduct

Notice of Violation of IEEE Publication Principles: Reinforcement learning for P2P searching

2005

For a peer-to-peer (P2P) system holding a massive amount of data, an efficient and scalable search for resource sharing is a key determinant of its practical usage. Unstructured P2P networks avoid the limitations of centralized systems and the drawbacks of a highly structured approach, because they impose few constraints on topology and data placement, and they support highly versatile search mechanisms. However, their search algorithms are usually based on simple flooding schemes, which are severely inefficient. In this paper, to address this major limitation, we propose and evaluate the adoption of a local adaptive routing protocol. The routing algorithm adopts a simple reinforcement learni…

Routing protocol; Small-world network; Computer science; Search algorithm; business.industry; Distributed computing; Scalability; Reinforcement learning; business; Network topology; Computer network; Shared resource; Flooding (computer networking); Seventh International Workshop on Computer Architecture for Machine Perception (CAMP'05)

researchProduct

Discretized Bayesian Pursuit – A New Scheme for Reinforcement Learning

2012

Published version of a chapter in the book: Advanced Research in Applied Artificial Intelligence. Also available from the publisher at: http://dx.doi.org/10.1007/978-3-642-31087-4_79 The success of Learning Automata (LA)-based estimator algorithms over the classical, Linear Reward-Inaction (L_RI)-like schemes can be explained by their ability to pursue the actions with the highest reward probability estimates. Without access to reward probability estimates, it makes sense for schemes like the L_RI to first make large exploring steps, and then to gradually turn exploration into exploitation by making progressively smaller learning steps. However, this behavior becomes counter-intuitive wh…

Scheme (programming language); Mathematical optimization; Discretization; Learning automata; Computer science; business.industry; VDP::Mathematics and natural science: 400::Information and communication science: 420::Algorithms and computability theory: 422; estimator algorithms; Bayesian probability; Bayesian reasoning; learning automata; Estimator; VDP::Technology: 500::Information and communication technology: 550; discretized learning; Bayesian inference; Action (physics); Reinforcement learning; Artificial intelligence; pursuit schemes; business; computer; computer.programming_language

researchProduct

A Learning Automata Local Contribution Sampling Applied to Hydropower Production Optimisation

2017

Learning Automata (LA) is a powerful approach for solving complex, non-linear and stochastic optimisation problems. However, existing solutions struggle with high-dimensional problems due to slow convergence, arguably caused by the global nature of feedback. In this paper we introduce a novel Learning Automata (LA) scheme to attack this challenge. The scheme is based on a parallel form of Local Contribution Sampling (LCS), which means that the LA receive individually directed feedback, designed to speed up convergence. Furthermore, our scheme is highly decentralized, allowing parallel execution on GPU architectures. To demonstrate the power of our scheme, the LA LCS is applied to hydropower…

Scheme (programming language); Mathematical optimization; Engineering; Speedup; Learning automata; business.industry; Sampling (statistics); Machine learning; computer.software_genre; Power (physics); Range (mathematics); Convergence (routing); Reinforcement learning; Artificial intelligence; business; computer; computer.programming_language

researchProduct

Thompson Sampling Guided Stochastic Searching on the Line for Adversarial Learning

2015

The multi-armed bandit problem has been studied for decades. In brief, a gambler repeatedly pulls one out of N slot machine arms, randomly receiving a reward or a penalty from each pull. The aim of the gambler is to maximize the expected number of rewards received, when the probabilities of receiving rewards are unknown. Thus, the gambler must, as quickly as possible, identify the arm with the largest probability of producing rewards, compactly capturing the exploration-exploitation dilemma in reinforcement learning. In this paper we introduce a particularly challenging variant of the multi-armed bandit problem, inspired by the so-called N-Door Puzzle. In this variant, the gambler is only tol…

Scheme (programming language); business.industry; Computer science; Bayesian probability; Bayesian inference; Multi-armed bandit; Line (geometry); Reinforcement learning; Artificial intelligence; Representation (mathematics); business; Thompson sampling; computer; computer.programming_language

researchProduct
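
The standard multi-armed bandit setting summarised in the entry above (before the paper's N-Door variant) can be illustrated with a minimal Beta-Bernoulli Thompson sampling sketch. This is not the scheme proposed in the paper, whose abstract is truncated here; the function name `thompson_sampling`, the simulated reward probabilities and the uniform Beta(1, 1) prior are all assumptions made for illustration.

```python
import random


def thompson_sampling(reward_probs, n_pulls, seed=0):
    """Beta-Bernoulli Thompson sampling on a simulated N-armed bandit.

    reward_probs holds the true (hidden) reward probabilities. They are
    hypothetical values used only to simulate pulls; the sampler itself
    never reads them when choosing an arm.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    # Assumed Beta(1, 1) prior per arm: one pseudo-reward, one pseudo-penalty.
    rewards = [1] * n_arms
    penalties = [1] * n_arms
    pulls = [0] * n_arms
    for _ in range(n_pulls):
        # Draw one plausible reward probability per arm from its posterior.
        samples = [rng.betavariate(rewards[a], penalties[a]) for a in range(n_arms)]
        # Pull the arm whose sampled probability is largest.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Simulate the pull: reward with the arm's hidden probability.
        if rng.random() < reward_probs[arm]:
            rewards[arm] += 1
        else:
            penalties[arm] += 1
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Three simulated arms; the sampler should concentrate on the last one.
    print(thompson_sampling([0.2, 0.5, 0.8], n_pulls=2000))
```

Sampling from each arm's posterior, rather than always picking the arm with the best estimate so far, is what lets the gambler keep exploring uncertain arms while increasingly exploiting the one that appears best.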

Round Robin Testing initiative for fiber reinforced polymer (FRP) reinforcement

2011

An international Round Robin Testing (RRT) programme on FRP reinforcement was conducted within the framework of the Marie Curie Research Training Network, EN-CORE, and with the support of Task Group 9.3 of the International Federation for Structural Concrete (fib). Eleven laboratories and six manufacturers and suppliers participated in this exercise. As part of this extensive experimental endeavour, one or more of the following tests were performed by the participating laboratories: 1) tensile tests on FRP bars and strips; 2) tensile tests on FRP laminates; 3) double bond shear tests on FRP laminates (Externally Bonded Reinforcement, EBR) and FRP bars/strip (Near Surface Mounted rein…

Science & Technology; Bond test; Architecture; 2300 Environmental Science (all); Near Surface Mounted reinforcement (NSM); EBR; Building and Construction; Externally Bonded Reinforcement (EBR); NSM; FRP; RRT; Civil and Structural Engineering

researchProduct

A comparison between a two feedback control loop and a reinforcement learning algorithm for compliant low-cost series elastic actuators

2020

Highly compliant elastic actuators have become progressively more prominent in recent years for a variety of robotic applications. With remarkable shock tolerance, elastic actuators are appropriate for robots operating in unstructured environments. In accordance with this trend, a novel elastic actuator was recently designed by our research group for Serpens, a low-cost, open-source and highly compliant multi-purpose modular snake robot. To control the newly designed elastic actuators of Serpens, a two-feedback-loop position control algorithm was proposed. The inner controller loop is implemented as a model reference adaptive controller (MRAC), while the outer control loop adopts a fuzzy pr…

Series (mathematics); Computer science; business.industry; Feedback control; Robotics; Loop (topology); Computer Science::Robotics; VDP::Teknologi: 500; Control theory; Reinforcement learning; Artificial intelligence; Reinforcement learning algorithm; Actuator; business

researchProduct