Realizing Undelayed N-step TD prediction with neural networks
Janis Zuters
subject
Dynamic programming; Artificial neural network; Computer science; business.industry; Value (computer science); Reinforcement learning; Observable; Extension (predicate logic); Artificial intelligence; business
description
Reinforcement learning algorithms can be extended with various techniques, e.g., eligibility traces and planning. This paper proposes an approach that combines several such extensions: eligibility-like traces, function approximators as value functions, and exploitation of a model of the environment. The resulting method, ‘Undelayed n-step TD prediction’ (TD-P), has produced competitive results in environments that are not fully observable.
year | journal | country | edition | language |
---|---|---|---|---|
2010-01-01 | Melecon 2010 - 2010 15th IEEE Mediterranean Electrotechnical Conference | | | |