EWRL
2008

Regularized Fitted Q-Iteration: Application to Planning

We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-iteration with penalized (or regularized) least-squares regression as the regression subroutine to address the problem of controlling model complexity. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on the quality of the solution and argue that data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.
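
As a rough illustration of the idea in the abstract, the following is a minimal sketch of fitted Q-iteration with a kernel ridge (penalized least-squares) regressor per action. It is not the authors' implementation: the toy one-dimensional environment, the Gaussian kernel, the bandwidth, the fixed penalty lam, and all function names are assumptions made only for this example, and the penalty here is a constant rather than the data-dependent choice the abstract mentions.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=0.2):
    # Gram matrix of a Gaussian kernel between two 1-D arrays of states.
    d = x[:, None] - y[None, :]
    return np.exp(-d ** 2 / (2.0 * bandwidth ** 2))

def generative_model(s, a, rng):
    # Hypothetical toy environment on [0, 1]: action 0 moves left, action 1
    # moves right, with small noise; reward 1 when the next state exceeds 0.9.
    step = np.where(a == 1, 0.1, -0.1)
    s_next = np.clip(s + step + 0.02 * rng.standard_normal(s.shape), 0.0, 1.0)
    reward = (s_next > 0.9).astype(float)
    return reward, s_next

def regularized_fqi(n_samples=200, n_iters=30, gamma=0.9, lam=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    # Draw one batch of transitions from the generative model.
    s = rng.uniform(0.0, 1.0, n_samples)
    a = rng.integers(0, 2, n_samples)
    r, s_next = generative_model(s, a, rng)

    # One kernel regressor per action, fitted on the samples taking that action.
    masks = [a == act for act in (0, 1)]
    sizes = [int(m.sum()) for m in masks]
    grams = [gaussian_kernel(s[m], s[m]) for m in masks]
    cross = [gaussian_kernel(s_next, s[m]) for m in masks]
    alphas = [np.zeros(n) for n in sizes]  # Q_0 = 0

    for _ in range(n_iters):
        # Regression targets: r + gamma * max_a' Q_k(s', a').
        q_next = np.stack([cross[i] @ alphas[i] for i in range(2)], axis=1)
        targets = r + gamma * q_next.max(axis=1)
        # Penalized least squares in the RKHS: alpha = (K + lam * n * I)^{-1} y.
        alphas = [
            np.linalg.solve(grams[i] + lam * sizes[i] * np.eye(sizes[i]),
                            targets[masks[i]])
            for i in range(2)
        ]

    def q_values(states):
        # Returns an array of shape (len(states), 2): one column per action.
        states = np.atleast_1d(np.asarray(states, dtype=float))
        return np.stack(
            [gaussian_kernel(states, s[masks[i]]) @ alphas[i] for i in range(2)],
            axis=1,
        )

    return q_values
```

Calling q = regularized_fqi() and then q([0.3, 0.5, 0.7]) gives the estimated action values at those states; the greedy policy takes the argmax over the two columns. Larger values of lam give smoother, more heavily shrunk estimates, which is the model-complexity trade-off the penalty is meant to control.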
Added 19 Oct 2010
Updated 19 Oct 2010
Type Conference
Year 2008
Where EWRL
Authors Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor