Regularized Fitted Q-Iteration: Application to Planning

We consider planning in a Markovian decision problem, i.e., the problem of finding a good policy given access to a generative model of the environment. We propose to use fitted Q-iteration with penalized (or regularized) least-squares regression as the regression subroutine to address the problem of controlling model complexity. The algorithm is presented in detail for the case when the function space is a reproducing kernel Hilbert space underlying a user-chosen kernel function. We derive bounds on the quality of the solution and argue that data-dependent penalties can lead to almost optimal performance. A simple example is used to illustrate the benefits of using a penalized procedure.
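The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy environment `toy_step`, the Gaussian kernel and its bandwidth, the use of one kernel regressor per action, and the fixed penalty `lam` are all assumptions made for illustration (the paper argues for data-dependent penalties, which this sketch does not attempt). Each iteration fits the Bellman targets by kernel ridge regression, i.e. least squares penalized by the squared RKHS norm, whose solution is `(K + n * lam * I)^{-1} y`.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=0.2):
    # Gram matrix of a Gaussian kernel between 1-D state arrays x and y.
    d = x[:, None] - y[None, :]
    return np.exp(-d**2 / (2.0 * bandwidth**2))

def toy_step(s, a, rng):
    # Hypothetical generative model: noisy move left (a=0) or right (a=1) on [0, 1].
    drift = 0.1 if a == 1 else -0.1
    s_next = np.clip(s + drift + 0.02 * rng.standard_normal(s.shape), 0.0, 1.0)
    reward = -np.abs(s_next - 0.8)  # larger reward the closer the next state is to 0.8
    return s_next, reward

def regularized_fqi(n_samples=100, n_iters=30, gamma=0.9, lam=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    actions = (0, 1)
    data = {}
    for a in actions:
        # Draw states and query the generative model once per action.
        s = rng.uniform(0.0, 1.0, n_samples)
        s_next, r = toy_step(s, a, rng)
        # Regularized system matrix K + n * lam * I of kernel ridge regression.
        A = gaussian_kernel(s, s) + n_samples * lam * np.eye(n_samples)
        data[a] = (s, r, s_next, A)
    coef = {a: np.zeros(n_samples) for a in actions}  # Q_0 = 0

    def q_values(states):
        # Column j holds the current estimate of Q(s, actions[j]).
        cols = [gaussian_kernel(states, data[a][0]) @ coef[a] for a in actions]
        return np.stack(cols, axis=1)

    for _ in range(n_iters):
        new_coef = {}
        for a in actions:
            s, r, s_next, A = data[a]
            # Bellman targets built from the previous iterate.
            targets = r + gamma * q_values(s_next).max(axis=1)
            # Penalized least-squares fit in the RKHS.
            new_coef[a] = np.linalg.solve(A, targets)
        coef = new_coef
    return q_values

q = regularized_fqi()
greedy = q(np.array([0.1, 0.5])).argmax(axis=1)  # greedy action index per query state
```

Larger values of `lam` shrink the fitted action-value functions toward zero and smooth them, which is the model-complexity control the abstract refers to; how to choose the penalty well is the subject of the paper, not of this sketch.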

Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor

EWRL 2008 | Machine Learning | Markovian Decision Problem | Policy Given Access | User-chosen Kernel Function

Post Info

Added	19 Oct 2010
Updated	19 Oct 2010
Type	Conference
Year	2008
Where	EWRL
Authors	Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csaba Szepesvári, Shie Mannor
