0000000001178862
AUTHOR
Charl Maree
Reinforcement Learning Your Way: Agent Characterization through Policy Regularization
The increased complexity of state-of-the-art reinforcement learning (RL) algorithms has resulted in an opacity that inhibits explainability and understanding. This has led to the development of several post hoc explainability methods that aim to extract information from learned policies, thus aiding explainability. These methods rely on empirical observations of the policy, and thus aim to generalize a characterization of agents’ behaviour. In this study, we have instead developed a method to imbue agents’ policies with a characteristic behaviour through regularization of their objective functions. Our method guides the agents’ behaviour during learning, which results in a…
Reinforcement Learning with Intrinsic Affinity for Personalized Prosperity Management
AbstractThe purpose of applying reinforcement learning (RL) to portfolio management is commonly the maximization of profit. The extrinsic reward function used to learn an optimal strategy typically does not take into account any other preferences or constraints. We have developed a regularization method that ensures that strategies have global intrinsic affinities, i.e., different personalities may have preferences for certain asset classes which may change over time. We capitalize on these intrinsic policy affinities to make our RL model inherently interpretable. We demonstrate how RL agents can be trained to orchestrate such individual policies for particular personality profiles and stil…
Can Interpretable Reinforcement Learning Manage Prosperity Your Way?
Personalisation of products and services is fast becoming the driver of success in banking and commerce. Machine learning holds the promise of gaining a deeper understanding of and tailoring to customers’ needs and preferences. Whereas traditional solutions to financial decision problems frequently rely on model assumptions, reinforcement learning is able to exploit large amounts of data to improve customer modelling and decision-making in complex financial environments with fewer assumptions. Model explainability and interpretability present challenges from a regulatory perspective which demands transparency for acceptance; they also offer the opportunity for improved insight into and unde…
Clustering in Recurrent Neural Networks for Micro-Segmentation using Spending Personality
Author's accepted manuscript. © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Customer segmentation has long been a productive field in banking. However, with new approaches to traditional problems come new opportunities. Fine-grained customer segments are notoriously elusive and one method of obtaining them is through feature extraction. It is possible to assi…
Towards Responsible AI for Financial Transactions
Author's accepted manuscript. © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The application of AI in finance is increasingly dependent on the principles of responsible AI. These principles-explainability, fairness, privacy, accountability, transparency and soundness form the basis for trust in future AI systems. In this empirical study, we address the first p…
Affinity-Based Reinforcement Learning : A New Paradigm for Agent Interpretability
The steady increase in complexity of reinforcement learning (RL) algorithms is accompanied by a corresponding increase in opacity that obfuscates insights into their devised strategies. Methods in explainable artificial intelligence seek to mitigate this opacity by either creating transparent algorithms or extracting explanations post hoc. A third category exists that allows the developer to affect what agents learn: constrained RL has been used in safety-critical applications and prohibits agents from visiting certain states; preference-based RL agents have been used in robotics applications and learn state-action preferences instead of traditional reward functions. We propose a new affini…
Balancing Profit, Risk, and Sustainability for Portfolio Management
Author's accepted manuscript © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Stock portfolio optimization is the process of continuous reallocation of funds to a selection of stocks. This is a particularly well-suited problem for reinforcement learning, as daily rewards are compounding and objective functions may include more than just profit, e.g., risk and su…