RESEARCH PRODUCT

A Comparative Analysis of Multiple Biasing Techniques for $Q_{biased}$ Softmax Regression Algorithm

Muhammad Moiz, Hazique Malik, Noman Naseer, Muhammad Bilal

subject

business.industry; Computer science; Obstacle avoidance; Softmax function; Q-learning; Robot; Reinforcement learning; Mobile robot; Artificial intelligence; business; Trial and error; Action selection

description

In recent years, the popularity of robotic workers has surged. Several tasks that were previously considered insurmountable can now be performed by robots efficiently and with ease, mainly because of recent advances in control systems and artificial intelligence. Lately, Reinforcement Learning (RL) has captured the spotlight in robotics. Instead of requiring the solution to a task to be specified explicitly, RL enables the robot (agent) to explore its environment and, through trial and error, choose the appropriate response. This paper presents a comparative analysis of biasing techniques for the Q-biased softmax regression (QBIASSR) algorithm. In QBIASSR, decision-making for unexplored states depends on the set of previously explored states, which improves the learning process when the robot reaches unexplored states. A bias vector bias(s) is calculated from the variable values of experienced states and added to the Q-value function for action selection. To optimize the reward, different techniques for calculating bias(s) are adopted. The performance of all the techniques is evaluated and compared on obstacle avoidance for a mobile robot. Finally, we demonstrate that the cumulative reward generated by the technique proposed in our paper is at least 2 times greater than the baseline.
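The action-selection step described above, in which bias(s) is added to the Q-values before a softmax choice, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `temperature` parameter, and the example numbers are hypothetical, and the bias vector is taken as a given input because the abstract does not specify how each technique computes it.

```python
import math
import random


def biased_softmax_probs(q_values, bias, temperature=1.0):
    """Return softmax probabilities over Q(s, a) + bias(s)[a].

    q_values and bias are equal-length sequences with one entry per action.
    temperature is an assumed exploration parameter (not from the source).
    """
    prefs = [(q + b) / temperature for q, b in zip(q_values, bias)]
    top = max(prefs)  # subtract the max for numerical stability
    weights = [math.exp(p - top) for p in prefs]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, bias, temperature=1.0, rng=random):
    """Sample an action index from the biased softmax distribution."""
    probs = biased_softmax_probs(q_values, bias, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical unexplored state: all Q-values are still zero, so the
# bias alone shapes the action preferences.
q_unexplored = [0.0, 0.0, 0.0]
bias = [0.0, 1.0, 0.0]  # illustrative values only
probs = biased_softmax_probs(q_unexplored, bias)
action = select_action(q_unexplored, bias, rng=random.Random(0))
```

With all Q-values equal, the softmax alone would choose uniformly at random; adding the bias shifts probability toward the action favored by previously explored states, which is the effect the abstract attributes to QBIASSR in unexplored states.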

https://doi.org/10.1109/aims52415.2021.9466049