Search results for "Reinforcement"

showing 10 items of 230 documents

Reinforcement learning approach to nonequilibrium quantum thermodynamics

2021

We use a reinforcement learning approach to reduce entropy production in a closed quantum system brought out of equilibrium. Our strategy makes use of an external control Hamiltonian and a policy gradient technique. Our approach bears no dependence on the quantitative tool chosen to characterize the degree of thermodynamic irreversibility induced by the dynamical process being considered, requires little knowledge of the dynamics itself and does not need the tracking of the quantum state of the system during the evolution, thus embodying an experimentally non-demanding approach to the control of non-equilibrium quantum thermodynamics. We successfully apply our methods to the case of single- …

Computer science; FOS: Physical sciences; General Physics and Astronomy; Non-equilibrium thermodynamics; 01 natural sciences; Settore FIS/03 - Fisica Della Materia; 010305 fluids & plasmas; symbols.namesake; Quantum state; SHORTCUTS; 0103 physical sciences; Quantum system; Reinforcement learning; Statistical physics; 010306 general physics; Quantum thermodynamics; Condensed Matter - Statistical Mechanics; ADIABATICITY; Quantum Physics; Statistical Mechanics (cond-mat.stat-mech); Entropy production; ENTROPY; symbols; Quantum Physics (quant-ph); Hamiltonian (quantum mechanics)
researchProduct

Exploratory behaviour is not related to associative learning ability in the carabid beetle Nebria brevicollis.

2020

Recently, it has been hypothesised that as learning performance and animal personality vary along a common axis of fast and slow types, natural selection may act on both in parallel leading to a correlation between learning and personality traits. We examined the relationship between risk-taking, exploratory behaviour and associative learning ability in carabid beetle Nebria brevicollis females by quantifying the number of trials individuals required to reach criterion during an associative learning task (‘learning performance’). The associative learning task required the females to associate odour and direction with refugia from light and heat in a T-maze. Further, we assessed lea…

0106 biological sciences; media_common.quotation_subject; education; Reversal Learning; 010603 evolutionary biology; 01 natural sciences; Correlation; Behavioral Neuroscience; Cognition; Nebria brevicollis; Personality; Animals; Humans; Learning; 0501 psychology and cognitive sciences; 050102 behavioral science & comparative psychology; Big Five personality traits; Reinforcement; Association (psychology); media_common; biology; 05 social sciences; Cognition; General Medicine; biology.organism_classification; Associative learning; Coleoptera; Exploratory Behavior; Animal Science and Zoology; Female; Psychology; Cognitive psychology; Personality; Behavioural processes
researchProduct

MARL-Ped+Hitmap: Towards Improving Agent-Based Simulations with Distributed Arrays

2016

Multi-agent systems allow the modelling of complex, heterogeneous, and distributed systems in a realistic way. MARL-Ped is a multi-agent system tool, based on the MPI standard, for the simulation of different scenarios of pedestrians who autonomously learn the best behavior by Reinforcement Learning. MARL-Ped uses one MPI process for each agent by design, with a fixed fine-grain granularity. This requirement limits the performance of the simulations when the number of processors is smaller than the number of agents. On the other hand, Hitmap is a library to ease the programming of parallel applications based on distributed arrays. It includes abstractions for the automatic parti…

020203 distributed computing; Computer science; Distributed computing; Message passing; 0202 electrical engineering electronic engineering information engineering; Process (computing); Reinforcement learning; 020207 software engineering; 02 engineering and technology; Crowd simulation; Granularity; Partition (database)
researchProduct

Towards Intelligent IoT Networks: Reinforcement Learning for Reliable Backscatter Communications

2019

Backscatter communication is becoming the focal point of research for low-powered Internet of things (IoT). However, the intelligence aspect of the backscattering devices is not well-defined. Since future IoT networks are going to be a formidable platform of intelligent sensing devices operating in a self-organizing manner, it is necessary to incorporate learning capabilities in backscatter devices. Motivated by this objective, this paper aims to employ reinforcement learning for improving the performance of backscatter networks. In particular, a multicluster backscatter communication model is developed for short-range information sharing. This is followed by a power allocation algorithm usi…

0203 mechanical engineering; Backscatter; Computer science; Information sharing; Distributed computing; 0202 electrical engineering electronic engineering information engineering; Reinforcement learning; 020302 automobile design & engineering; 020206 networking & telecommunications; 02 engineering and technology; Ceiling (cloud); Interference (wave propagation); Power (physics); 2019 IEEE Globecom Workshops (GC Wkshps)
researchProduct

Effect of the mutual position between weld seam and reinforcement on the residual stress distribution in Friction Stir Welding of AA6082 skin and str…

2016

In the paper, a numerical and experimental study was carried out to highlight the effect of the distance d between the weld seam and the reinforcement on the residual stress distribution in Friction Stir Welded AA6082-T6 structures. An L-shaped profile was welded to a sheet metal with varying tool rotation and distance d from the weld seam. The Cut Compliance method was used to determine the resulting longitudinal residual stress. A dedicated FE model for FSW was set up, validated and utilized to predict the longitudinal residual stress in the assembled part. The analysis allowed the identification of a few design guidelines in order to reduce the detrimental effects of the residua…

0209 industrial biotechnology; Engineering; Friction Stir Welding; Residual stre; 02 engineering and technology; Welding; Rotation; law.invention; Weld seam; 020901 industrial engineering & automation; 0203 mechanical engineering; Stringer; law; Residual stress; Skin and stringer; Friction stir welding; Reinforcement; Settore ING-IND/16 - Tecnologie E Sistemi Di Lavorazione; Civil and Structural Engineering; Cut compliance; business.industry; Mechanical Engineering; Structural engineering; Building and Construction; 020303 mechanical engineering & transports; visual_art; visual_art.visual_art_medium; Sheet metal; business; FE analysi
researchProduct

Online fitted policy iteration based on extreme learning machines

2016

Reinforcement learning (RL) is a learning paradigm that can be useful in a wide variety of real-world applications. However, its applicability to complex problems remains problematic due to different causes. Particularly important among these are the high quantity of data required by the agent to learn useful policies and the poor scalability to high-dimensional problems due to the use of local approximators. This paper presents a novel RL algorithm, called online fitted policy iteration (OFPI), that steps forward in both directions. OFPI is based on a semi-batch scheme that increases the convergence speed by reusing data and enables the use of global approximators by reformulating the valu…

0209 industrial biotechnology; Information Systems and Management; Radial basis function network; Artificial neural network; Computer science; business.industry; Stability (learning theory); 02 engineering and technology; Machine learning; computer.software_genre; Management Information Systems; 020901 industrial engineering & automation; Artificial Intelligence; Bellman equation; 0202 electrical engineering electronic engineering information engineering; Benchmark (computing); Reinforcement learning; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Software; Extreme learning machine; Knowledge-Based Systems
researchProduct

Using Inverse Reinforcement Learning with Real Trajectories to Get More Trustworthy Pedestrian Simulations

2020

Reinforcement learning is one of the most promising machine learning techniques to get intelligent behaviors for embodied agents in simulations. The output of the classic Temporal Difference family of Reinforcement Learning algorithms adopts the form of a value function expressed as a numeric table or a function approximator. The learned behavior is then derived using a greedy policy with respect to this value function. Nevertheless, sometimes the learned policy does not meet expectations, and the task of authoring is difficult and unsafe because the modification of one value or parameter in the learned value function has unpredictable consequences in the space of the policies it represents…

0209 industrial biotechnology; reinforcement learning; Computer science; General Mathematics; 02 engineering and technology; pedestrian simulation; Task (project management); learning by demonstration; 020901 industrial engineering & automation; Aprenentatge; Informàtica; Bellman equation; 0202 electrical engineering electronic engineering information engineering; Computer Science (miscellaneous); Reinforcement learning; Engineering (miscellaneous); business.industry; causal entropy; lcsh:Mathematics; Process (computing); 020206 networking & telecommunications; Function (mathematics); inverse reinforcement learning; lcsh:QA1-939; Problem domain; Table (database); Artificial intelligence; Temporal difference learning; business; optimization; Mathematics
researchProduct

2019

As rats learn to search for multiple sources of food or water in a complex environment, they generate increasingly efficient trajectories between reward sites. Such spatial navigation capacity involves the replay of hippocampal place-cells during awake states, generating small sequences of spatially related place-cell activity that we call "snippets". These snippets occur primarily during sharp-wave-ripples (SWRs). Here we focus on the role of such replay events, as the animal is learning a traveling salesperson task (TSP) across multiple trials. We hypothesize that snippet replay generates synthetic data that can substantially expand and restructure the experience available and make learni…

0301 basic medicine; Computer science; Place cell; Machine learning; computer.software_genre; Spatial memory; Synthetic data; 03 medical and health sciences; Cellular and Molecular Neuroscience; 0302 clinical medicine; Models of neural computation; Genetics; Reinforcement learning; Molecular Biology; Ecology Evolution Behavior and Systematics; Ecology; business.industry; Reservoir computing; Snippet; 030104 developmental biology; Computational Theory and Mathematics; Modeling and Simulation; Sequence learning; Artificial intelligence; business; computer; 030217 neurology & neurosurgery; PLOS Computational Biology
researchProduct

Animal models of drug addiction (Modelos animales de adicción a las drogas)

2017

The development of animal models of drug reinforcement and addiction is essential for advancing knowledge of the biological bases of this disorder and for identifying new therapeutic targets. Depending on the component of reinforcement we wish to study, we can use one type of animal model or another. We can use models of reinforcement based on the primary hedonic effect produced by consumption of the addictive substance, such as the self-administration (SA) and intracranial electrical self-stimulation (ICSS) models, or models based on the component related to associative learning and the cognitive capacity to make predictions about obtaining …

0301 basic medicine; Punishment (psychology); media_common.quotation_subject; Medicine (miscellaneous); 03 medical and health sciences; Tratamiento médico; Lucha contra la toxicomanía; 0302 clinical medicine; medicine; Reinforcement; media_common; Toxicomanía; Comportamiento; Addiction; Conducta; Cognition; Extinction (psychology); medicine.disease; Conditioned place preference; Associative learning; Psychiatry and Mental health; 030104 developmental biology; Psychology; Addictive behavior; Social psychology; 030217 neurology & neurosurgery
researchProduct

Introducing Clicker Training as a Cognitive Enrichment for Laboratory Mice

2017

Establishing new refinement strategies in laboratory animal science is a central goal in fulfilling the requirements of Directive 2010/63/EU. Previous research determined a profound impact of gentle handling protocols on the well-being of laboratory mice. By introducing clicker training to the keeping of mice, not only do we promote the amicable treatment of mice, but we also enable them to experience cognitive enrichment. Clicker training is a form of positive reinforcement training using a conditioned secondary reinforcer, the "click" sound of a clicker, which serves as a time bridge between the strengthened behavior and an upcoming reward. The effective implementation of the clicker trai…

0301 basic medicine; medicine.medical_specialty; General Chemical Engineering; Male mice; Audiology; General Biochemistry Genetics and Molecular Biology; Mice; 03 medical and health sciences; Cognition; Laboratory Animal Science; medicine; Animals; Reinforcement; Daily routine; Behavior; Behavior Animal; General Immunology and Microbiology; General Neuroscience; Cognition; Fear; Clicker training; Clicker; 030104 developmental biology; Models Animal; Psychology; Reinforcement Psychology; Journal of Visualized Experiments
researchProduct