Search results for "force"
showing 10 items of 3423 documents
Assigning discounts in a marketing campaign by using reinforcement learning and neural networks
2009
In this work, RL is used to find an optimal policy for a marketing campaign. Data show a complex characterization of state and action spaces. Two approaches are proposed to circumvent this problem. The first approach is based on the self-organizing map (SOM), which is used to aggregate states. The second approach uses a multilayer perceptron (MLP) to carry out a regression of the action-value function. The results indicate that both approaches can improve a targeted marketing campaign. Moreover, the SOM approach allows an intuitive interpretation of the results, and the MLP approach yields robust results with generalization capabilities.
Safer Reinforcement Learning for Agents in Industrial Grid-Warehousing
2020
In mission-critical, real-world environments, there is typically a low threshold for failure, which makes interaction with learning algorithms particularly challenging. Here, current state-of-the-art reinforcement learning algorithms struggle to learn optimal control policies safely. Loss of control follows, which could result in equipment breakages and even personal injuries.
Increasing sample efficiency in deep reinforcement learning using generative environment modelling
2020
Road Detection for Reinforcement Learning Based Autonomous Car
2020
Human mistakes in traffic often have terrible consequences. The long-awaited introduction of self-driving vehicles may solve many of the problems with traffic, but much research is still needed before cars are fully autonomous. In this paper, we propose a new Road Detection algorithm using online supervised learning based on a Neural Network architecture. This algorithm is designed to support a Reinforcement Learning algorithm (for example, the standard Proximal Policy Optimization or PPO) by detecting when the car is in an adverse condition. Specifically, the PPO gets a penalty whenever the virtual automobile gets stuck or drives off the road with any of its four wheels. Initial experiments …
CostNet: An End-to-End Framework for Goal-Directed Reinforcement Learning
2020
Reinforcement Learning (RL) is a general framework concerned with an agent that seeks to maximize rewards in an environment. The learning typically happens through trial and error using explorative methods, such as \(\epsilon\)-greedy. There are two approaches, model-based and model-free reinforcement learning, that show concrete results in several disciplines. Model-based RL learns a model of the environment for learning the policy, while model-free approaches are fully explorative and exploitative without considering the underlying environment dynamics. Model-free RL works conceptually well in simulated environments, and empirical evidence suggests that trial and error lead to a near-opti…
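The \(\epsilon\)-greedy exploration mentioned in the abstract above can be sketched in a few lines. This is a generic illustration of the textbook rule, not code from the CostNet paper; the function name `epsilon_greedy` and the list `q_values` of per-action value estimates are hypothetical names chosen for the example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index for a list of estimated action values.

    Hypothetical helper for illustration: with probability `epsilon`
    a uniformly random action is explored, otherwise the action with
    the highest current estimate is exploited.
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: index of the largest estimate (first index on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Assumed example values, not taken from any paper.
estimates = [0.1, 0.9, 0.3]
greedy_action = epsilon_greedy(estimates, epsilon=0.0)
```

With `epsilon=0.0` the call is purely greedy, and with `epsilon=1.0` it is purely random; values in between trade exploration against exploitation.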
Interannual and decadal SST-forced responses of the West African monsoon
2010
We review the studies carried out during the African Monsoon Multidisciplinary Analysis (AMMA)-EU on the changes of interannual sea surface temperature (SST)-West African monsoon (WAM) covariability at multidecadal timescales, together with the influence of global warming (GW). The results obtained in the AMMA-EU suggest the importance of the background state, modulated by natural and anthropogenic variability, in the appearance of different interannual modes. The lack of reliability of current coupled models in giving a realistic assessment for WAM in the future is also stated.
Mean Radiant Temperature Measurements through Small Black Globes under Forced Convection Conditions
2021
One of the most critical variables in the field of thermal comfort measurements is the mean radiant temperature, which is typically measured with a standard 150 mm black globe thermometer. This is also the reference instrument required for the assessment of heat stress conditions by means of the well-known Wet Bulb Globe Temperature index (WBGT). However, one limitation of this method is its relatively long response time. This is why in recent years there has been an increasingly pressing need for smart sensors for controlling Heating Ventilation and Air Conditioning (HVAC) systems, and for pocket heat stress meters (e.g., WBGT meters provided with table tennis balls). …
Compact two-electron wave function for bond dissociation and Van der Waals interactions: A natural amplitude assessment
2014
Electron correlations in molecules can be divided into short-range dynamical correlations, long-range Van der Waals-type interactions, and near-degeneracy static correlations. In this work we analyze, for a one-dimensional model of a two-electron system, how these three types of correlations can be incorporated in a simple wave function of restricted functional form consisting of an orbital product multiplied by a single correlation function $f(r_{12})$ depending on the interelectronic distance $r_{12}$. Since the three types of correlations mentioned lead to different signatures in terms of the natural orbital (NO) amplitudes in two-electron systems, we make an analysis of the wave function in t…
Invasive Observation by Atomic Force Microscope of a Langmuir-Blodgett Monolayer of Gramicidin
2002
The properties of gramicidin, a linear antibiotic polypeptide of 15 amino acids, have been studied at the air-water interface. Analysis of the pressure-area isotherm does not allow conclusions about the conformational behavior of gramicidin in the monolayer. Langmuir-Blodgett deposition of gramicidin layers onto a mica substrate has been developed for atomic force microscopy (AFM) observations. At high deposition pressure, the gramicidin monolayer is composed of dimers perpendicular to the surface. The possibility of removing the upper half of this dimer monolayer with the AFM tip favors a structure of single-stranded helical dimers.