Emergent behaviors and scalability for multi-agent reinforcement learning-based pedestrian models


RESEARCH PRODUCT


Miguel Lozano, Francisco Martinez-Gil, Fernando Fernández

subject

Engineering; Pedestrian; Machine learning; Consistency (database systems); Robustness (computer science); Reinforcement learning; Quality (business); Macro; Computer science; Pedestrian simulation and modeling; Kinematic controller; Emergent behaviours; Behavioural simulation; Hardware and Architecture; Modeling and Simulation; Scalability; Artificial intelligence; Multi-agent reinforcement learning (MARL); Software; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020207 software engineering; 020201 artificial intelligence & image processing

description

This paper analyzes the emergent behaviors of pedestrian groups that learn through the multi-agent reinforcement learning (MARL) model developed in our group. Five scenarios of varying complexity, all studied in the pedestrian modeling literature, were simulated to analyze the robustness and scalability of the model. First, a small group of agents learns by interacting with the environment in each scenario. In this phase, each agent learns its own kinematic controller, which drives it at simulation time. Second, the number of simulated agents is increased in each scenario in which agents have previously learned, to test whether emergent macroscopic behaviors appear without additional learning. This strategy allows us to evaluate the robustness, consistency, and quality of the learned behaviors. For this purpose, several tools from pedestrian dynamics, such as fundamental diagrams and density maps, are used. The results reveal that the model is capable of simulating human-like micro and macro pedestrian behaviors in the scenarios studied, including those where the number of pedestrians is scaled up by one order of magnitude with respect to the learned situation. This work has been supported by grant TIN2015-65686-C5-1-R of the Ministerio de Economía y Competitividad.
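The description names fundamental diagrams and density maps as evaluation tools without giving their definitions. The following is a minimal sketch of how such quantities are commonly computed from simulated pedestrian positions and velocities. It is not the authors' code: the function names, the square-cell grid, and the rectangular measurement region are all assumptions made for illustration.

```python
from collections import Counter
from math import floor, hypot


def density_map(positions, cell_size):
    """Density (pedestrians per m^2) for each occupied square grid cell.

    positions: iterable of (x, y) coordinates in metres (assumed layout).
    cell_size: side length of a grid cell in metres (illustrative parameter).
    """
    counts = Counter(
        (floor(x / cell_size), floor(y / cell_size)) for x, y in positions
    )
    area = cell_size * cell_size
    return {cell: n / area for cell, n in counts.items()}


def fundamental_diagram_point(positions, velocities, region):
    """One (density, mean speed) sample for a rectangular measurement region.

    region: (x_min, y_min, x_max, y_max) in metres (hypothetical convention).
    A fundamental diagram is the set of such samples collected over time.
    """
    x_min, y_min, x_max, y_max = region
    area = (x_max - x_min) * (y_max - y_min)
    speeds = [
        hypot(vx, vy)
        for (x, y), (vx, vy) in zip(positions, velocities)
        if x_min <= x < x_max and y_min <= y < y_max
    ]
    density = len(speeds) / area
    mean_speed = sum(speeds) / len(speeds) if speeds else 0.0
    return density, mean_speed
```

Collecting one such sample per simulation step, once for the small learning population and once for the scaled-up population, would give two density-speed curves that could be compared against empirical pedestrian data.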

year	journal	country	edition	language
2017-05-01	Simulation Modelling Practice and Theory

https://doi.org/10.1016/j.simpat.2017.03.003