Search results for "Reinforcement"
showing 10 items of 230 documents
CB1 cannabinoid receptor-mediated aggressive behavior
2013
This study examined the role of cannabinoid CB1 receptors (CB1r) in aggressive behavior. Social encounters took place in grouped and isolated mice lacking CB1r (CB1KO) and in wild-type (WT) littermates. Cognitive impulsivity was evaluated in the delayed reinforcement task (DRT). Gene expression analyses of monoamine oxidase-A (MAO-A), catechol-O-methyltransferase (COMT), 5-hydroxytryptamine transporter (5-HTT) and 5-HT1B serotonergic receptor (5HT1Br) in the median and dorsal raphe nuclei (MnR and DR, respectively) and in the amygdala (AMY) were performed by real-time PCR. Double immunohistochemistry studies evaluated COMT and CB1r co-localization in the raphe nuclei and in the cortical (AC…
Assigning discounts in a marketing campaign by using reinforcement learning and neural networks
2009
In this work, RL is used to find an optimal policy for a marketing campaign. Data show a complex characterization of state and action spaces. Two approaches are proposed to circumvent this problem. The first approach is based on the self-organizing map (SOM), which is used to aggregate states. The second approach uses a multilayer perceptron (MLP) to carry out a regression of the action-value function. The results indicate that both approaches can improve a targeted marketing campaign. Moreover, the SOM approach allows an intuitive interpretation of the results, and the MLP approach yields robust results with generalization capabilities.
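The state-aggregation idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the prototypes stand in for trained SOM units (no SOM is trained here), the two actions are hypothetical, and `nearest_prototype` and `q_update` are names introduced only for this example. Each raw state is mapped to its closest prototype, and an ordinary tabular Q-learning update is applied to the aggregated index.

```python
def nearest_prototype(state, prototypes):
    """Return the index of the prototype closest to `state` (squared Euclidean distance)."""
    def dist2(p):
        return sum((s - x) ** 2 for s, x in zip(state, p))
    return min(range(len(prototypes)), key=lambda i: dist2(prototypes[i]))


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update on aggregated state indices."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


# Hypothetical prototypes standing in for trained SOM units (assumption).
prototypes = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
# Two hypothetical actions per aggregated state, e.g. "no discount" / "discount".
q = [[0.0, 0.0] for _ in prototypes]

s = nearest_prototype((0.9, 1.2), prototypes)       # raw state -> aggregated state
s_next = nearest_prototype((4.0, 6.0), prototypes)  # next raw state -> aggregated state
q_update(q, s, a=1, reward=1.0, s_next=s_next)
```

Aggregating states this way keeps the value table small and easy to inspect, which is in line with the interpretability the abstract attributes to the SOM approach. The MLP approach would instead replace the table with a regression model.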
Safer Reinforcement Learning for Agents in Industrial Grid-Warehousing
2020
In mission-critical, real-world environments, there is typically a low threshold for failure, which makes interaction with learning algorithms particularly challenging. Here, current state-of-the-art reinforcement learning algorithms struggle to learn optimal control policies safely. Loss of control follows, which could result in equipment breakages and even personal injuries.
Increasing sample efficiency in deep reinforcement learning using generative environment modelling
2020
Road Detection for Reinforcement Learning Based Autonomous Car
2020
Human mistakes in traffic often have terrible consequences. The long-awaited introduction of self-driving vehicles may solve many of the problems with traffic, but much research is still needed before cars are fully autonomous. In this paper, we propose a new Road Detection algorithm using online supervised learning based on a Neural Network architecture. This algorithm is designed to support a Reinforcement Learning algorithm (for example, the standard Proximal Policy Optimization or PPO) by detecting when the car is in an adverse condition. Specifically, the PPO gets a penalty whenever the virtual automobile gets stuck or drives off the road with any of its four wheels. Initial experiments …
CostNet: An End-to-End Framework for Goal-Directed Reinforcement Learning
2020
Reinforcement Learning (RL) is a general framework concerned with an agent that seeks to maximize rewards in an environment. The learning typically happens through trial and error using explorative methods, such as \(\epsilon \)-greedy. There are two approaches, model-based and model-free reinforcement learning, that show concrete results in several disciplines. Model-based RL learns a model of the environment for learning the policy while model-free approaches are fully explorative and exploitative without considering the underlying environment dynamics. Model-free RL works conceptually well in simulated environments, and empirical evidence suggests that trial and error lead to a near-opti…
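For readers unfamiliar with the \(\epsilon\)-greedy exploration mentioned in this abstract, a minimal sketch follows. It is a generic illustration, not code from the CostNet paper; the function name `epsilon_greedy` and its parameters are assumptions made for this example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a uniformly random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        # Explore: any action, regardless of its estimated value.
        return rng.randrange(len(q_values))
    # Exploit: choose among the highest-valued actions, breaking ties at random.
    best = max(q_values)
    return rng.choice([a for a, q in enumerate(q_values) if q == best])
```

With `epsilon=0.0` the rule is purely greedy. With `epsilon=1.0` it ignores the value estimates entirely.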
On the feasibility of personal audio systems over a network of distributed loudspeakers
2018
Personal audio reproduction systems deal with creating personal sound zones within a room without the need for headphones. These systems use a set of loudspeakers and design the filters required at each loudspeaker so that the desired audio signal reaches each person in the room with as little interference as possible. The literature contains very interesting proposals that use circular or linear arrays, but in this work we study the problem for a network of distributed loudspeakers controlled by a set of acoustic nodes that can exchange information over a network. We state the model of …
Multitasking in Driving as Optimal Adaptation Under Uncertainty
2021
Objective: The objective was to better understand how people adapt multitasking behavior when circumstances in driving change and how safe versus unsafe behaviors emerge. Background: Multitasking strategies in driving adapt to changes in the task environment, but the cognitive mechanisms of this adaptation are not well known. Missing is a unifying account to explain the joint contribution of task constraints, goals, cognitive capabilities, and beliefs about the driving environment. Method: We model the driver’s decision to deploy visual attention as a stochastic sequential decision-making problem and propose hierarchical reinforcement learning as a computationally tractable solution to it. The…
Durability assessment of basalt fiber polymer as reinforcement to expanded clay concrete in harsh environment
2021
Basalt fiber-reinforced polymer composites are receiving considerable attention as they represent a low-cost green source of raw materials. In most cases, fiber-reinforced polymer composites face harsh environments, such as chloride ions in coastal marine environments or cold regions with salt deicing. The resistance of fiber-reinforced polymers subjected to the above environments is critical for the safe design and application of such composites. This research aims to develop a framework to investigate the durability properties of the lightweight expanded clay basalt fiber polymer reinforced concrete exposed to the NaCl environment. The specified quantity of concrete structural elements wa…
Some Effects of Individual Learning on the Evolution of Sensors
2001
In this paper, we present an abstract model of sensor evolution, where sensor development is only determined by artificial evolution and the adaptation of agent reactions is accomplished by individual learning. With the environment cast into an MDP framework, sensors can be conceived as a map from environmental states to agent observations, and Reinforcement Learning algorithms can be utilised. On the basis of a simple gridworld scenario, we present some results of the interaction between individual learning and evolution of sensors.
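The notion of a sensor as a map from environmental states to agent observations can be sketched on a small gridworld. This is an illustrative assumption, not the model from the paper: the grid size, the action names, and the two sensors (`full_sensor`, `column_sensor`) are hypothetical.

```python
from typing import Tuple

State = Tuple[int, int]  # (x, y) cell in the grid

# Hypothetical action set: displacement per action name.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(state: State, action: str, width: int = 4, height: int = 4) -> State:
    """Deterministic gridworld transition; moves into a wall leave the agent in place."""
    dx, dy = MOVES[action]
    x = min(max(state[0] + dx, 0), width - 1)
    y = min(max(state[1] + dy, 0), height - 1)
    return (x, y)


def full_sensor(state: State) -> State:
    """Sensor that reveals the complete environmental state."""
    return state


def column_sensor(state: State) -> int:
    """Coarser sensor that reveals only the column, so distinct states can look alike."""
    return state[0]
```

Under `column_sensor`, the states `(2, 0)` and `(2, 3)` produce the same observation, so a learning agent cannot tell them apart. A model of sensor evolution would vary such maps while learning adapts the agent's reactions to the resulting observations.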