RESEARCH PRODUCT
Kernelizing LSPE(λ)
T. Jung, Daniel Polani
subject
Mathematical optimization; Kernel (statistics); Kernelization; Least squares support vector machine; Benchmark (computing); Reinforcement learning; Context (language use); Basis function; Function (mathematics); Mathematics
description
We propose the use of kernel-based methods as the underlying function approximator in the least-squares-based policy evaluation framework of LSPE(λ) and LSTD(λ). In particular, we present the 'kernelization' of model-free LSPE(λ). The 'kernelization' is made computationally feasible by the subset of regressors approximation, which approximates the kernel using a vastly reduced number of basis functions. The core of our proposed solution is an efficient recursive implementation with automatic supervised selection of the relevant basis functions. The LSPE method is well suited to optimistic policy iteration and can thus be used in the context of online reinforcement learning. We use the high-dimensional Octopus benchmark to demonstrate this.
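As a rough illustration of the subset of regressors idea mentioned in the description, the sketch below approximates a kernel by k(x, x') ≈ k_m(x)^T K_mm^{-1} k_m(x'), where k_m(x) collects kernel values between x and m chosen basis points. It is not the authors' recursive implementation and does not perform their supervised basis selection: the Gaussian kernel, the fixed choice of basis points, the jitter term, and the names `rbf_kernel` and `sr_features` are all assumptions made for this example.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)


def sr_features(X, basis, lengthscale=1.0, jitter=1e-8):
    """Return features Phi with Phi @ Phi.T = K_nm (K_mm + jitter*I)^{-1} K_mn.

    Each row of Phi is an m-dimensional feature vector, so a linear
    least-squares method can operate on m features instead of n samples.
    """
    K_mm = rbf_kernel(basis, basis, lengthscale) + jitter * np.eye(len(basis))
    K_nm = rbf_kernel(X, basis, lengthscale)
    L = np.linalg.cholesky(K_mm)  # K_mm = L @ L.T
    # Solve L @ Phi.T = K_nm.T for Phi.
    return np.linalg.solve(L, K_nm.T).T


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))  # 200 synthetic 2-D inputs
basis = X[:20]  # hypothetical basis choice: simply the first 20 samples

Phi = sr_features(X, basis)
K_approx = Phi @ Phi.T       # low-rank approximation of the full kernel matrix
K_exact = rbf_kernel(X, X)   # full 200 x 200 kernel matrix, for comparison
max_err = np.max(np.abs(K_exact - K_approx))
print(f"feature dimension: {Phi.shape[1]}, max abs error: {max_err:.3e}")
```

The approximation reproduces the kernel (up to the jitter) on the basis points themselves and is generally less accurate elsewhere, which is why the choice of basis functions matters.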
year | journal |
---|---|
2007-04-01 | 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning |