RESEARCH PRODUCT

Agent's actions as a classification criteria for the state space in a learning from rewards system

Francisco Martinez-Gil

subject

Error-driven learning; Computer science; Feature vector; Autonomous agent; Decision rule; Trial and error; Machine learning; Theoretical Computer Science; Intelligent agent; Artificial Intelligence; Visual navigation system; Classifier (UML); Software

description

In this paper we focus on the problem of learning an autonomous agent's policy when the state space is very large and the set of available actions is comparatively small. To this end, we use a non-parametric decision rule (concretely, a nearest-neighbour strategy) to cluster the state space according to the action that leads to a successful situation. Using an exploration strategy to avoid greedy behaviour, the agent builds clusters of positively classified states through trial-and-error learning. We implement a 3D synthetic agent that plays an 'avoid the asteroid' game suited to our assumptions. Using as the state space a feature vector space extracted from a visual navigation system, we test two exploration strategies with the trial-and-error learning method. The experiment shows that the agent is a good classifier over the state space and therefore behaves well in its synthetic world.
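
The following Python sketch illustrates the general idea described in the abstract: a 1-nearest-neighbour decision rule that labels states with the action that led to success, combined with an epsilon-greedy exploration strategy and a trial-and-error update. The class name, the epsilon-greedy scheme, the reward convention and the update rule are illustrative assumptions, not the authors' actual implementation.

import random
import numpy as np

class NearestNeighbourActionLearner:
    """Sketch of a learner that clusters states by the action that succeeded in them."""

    def __init__(self, actions, epsilon=0.1):
        self.actions = list(actions)   # small, discrete action set
        self.epsilon = epsilon         # exploration rate (assumed exploration strategy)
        self.prototypes = []           # stored state feature vectors
        self.labels = []               # action that was rewarded in each stored state

    def predict(self, state):
        # 1-NN decision rule: return the action of the closest stored prototype.
        if not self.prototypes:
            return random.choice(self.actions)
        dists = [np.linalg.norm(state - p) for p in self.prototypes]
        return self.labels[int(np.argmin(dists))]

    def act(self, state):
        # Epsilon-greedy: mostly exploit the 1-NN rule, sometimes explore at random.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return self.predict(state)

    def update(self, state, action, reward):
        # Trial-and-error update: keep only states whose action was positively rewarded,
        # so the stored prototypes partition the state space by successful action.
        if reward > 0:
            self.prototypes.append(np.asarray(state, dtype=float))
            self.labels.append(action)

# Minimal usage example on random feature vectors (purely synthetic data):
learner = NearestNeighbourActionLearner(actions=["left", "right", "up", "down"])
for _ in range(100):
    state = np.random.rand(8)                  # stand-in for a visual feature vector
    action = learner.act(state)
    reward = 1.0 if random.random() < 0.5 else -1.0   # stand-in for the game's feedback
    learner.update(state, action, reward)
print(learner.predict(np.random.rand(8)))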

https://doi.org/10.1080/09528130701538190