RESEARCH PRODUCT

A Deep Reinforcement Learning scheme for Battery Energy Management

Morten Goodwin, Lei Jiao, Mohan Kolhe, Sven Myrdahl Opalic, Henrik Kofoed Nielsen

subject

Reduction (complexity), Task (computing), Mathematical optimization, Artificial neural network, Computer science, business.industry, Deep learning, Stability (learning theory), Reinforcement learning, Context (language use), Artificial intelligence, business, Average cost

description

Deep reinforcement learning is considered promising for many energy cost optimization tasks in smart buildings. However, agent learning in this context is sometimes unstable and unpredictable, especially when the environment is complex. In this paper, we examine deep Reinforcement Learning (RL) algorithms developed for game play, applied to a battery control task with an energy cost optimization objective. We explore how agent behavior and hyperparameters can be analyzed in a simplified environment, with the goal of modifying the algorithms for increased stability. Our modified Deep Deterministic Policy Gradient (DDPG) agent performs consistently close to the optimum over multiple training sessions, with a maximum cost reduction of 25 % and an average cost reduction of 99 % of the maximum in the simplified environment. DDPG is an actor-critic RL algorithm consisting of four neural networks: main and target versions of both the actor and the critic. When environment complexity is increased, the DDPG agent's performance decreases, and a modified Twin Delayed DDPG (TD3) agent is used instead, achieving an average of 99.9 % of the optimal result. The TD3 algorithm uses two main critic networks to avoid the known value overestimation bias.
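
The two mechanisms named in the description, target networks and twin critics, can be illustrated with a minimal NumPy sketch. It shows the standard TD3 clipped double-Q target and a soft (Polyak) target-network update, not the authors' modified agents. The function names and the values of `gamma` and `tau` are assumptions chosen for illustration, and TD3's delayed policy updates and target policy smoothing are omitted.

```python
import numpy as np

def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: bootstrap from the smaller of the two
    target-critic estimates to counter value overestimation."""
    q_min = np.minimum(q1_next, q2_next)
    # No bootstrapping from terminal transitions (done == 1).
    return reward + gamma * (1.0 - done) * q_min

def soft_update(target_params, main_params, tau=0.005):
    """Polyak averaging: move each target-network weight a small step
    toward the corresponding main-network weight."""
    return [(1.0 - tau) * t + tau * m for t, m in zip(target_params, main_params)]

# Illustrative batch of two transitions (made-up numbers, not from the paper).
reward = np.array([1.0, 0.5])
done = np.array([0.0, 1.0])
q1_next = np.array([10.0, 4.0])   # target critic 1 estimate of the next state-action value
q2_next = np.array([8.0, 6.0])    # target critic 2 estimate of the next state-action value

target = td3_target(reward, done, q1_next, q2_next, gamma=0.9)
new_target_params = soft_update([np.zeros(2)], [np.ones(2)], tau=0.1)
```

Taking the minimum of the two estimates biases the target downward rather than upward, which is the trade-off TD3 accepts to limit overestimation. A plain DDPG target would use a single target critic in place of `q_min`.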

https://doi.org/10.23919/splitech49282.2020.9243797