Abstract-- We consider reinforcement learning, and in particular the Q-learning algorithm, in large state and action spaces. To cope with the size of these spaces, a function approximation of the state-action value function is needed. We generalize the classical Q-learning algorithm to an algorithm in which the basis of the linear function approximation changes dynamically while interacting with the environment. The motivation for such an approach is to improve the fit of the state-action value function approximation to the problem at hand, and thereby obtain better performance. The algorithm is shown to converge using two-timescale stochastic approximation. Finally, we discuss how this technique can be applied to a rich family of reinforcement learning algorithms with linear function approximation.
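
To make the idea concrete, the following is a minimal illustrative sketch, not the paper's algorithm: Q-learning with a linear approximation of the state-action value function, where the linear weights are updated on a fast time scale and the basis parameters (here, centres of Gaussian features) are adapted on a slower time scale. The toy chain MDP, the RBF basis, and the step-size schedules are all assumptions introduced for illustration.

```python
import numpy as np

# Illustrative (hypothetical) setup: a small chain MDP and an RBF basis over
# the state index. Only the two-timescale structure mirrors the abstract:
# fast updates of the linear weights, slow updates of the basis parameters.
n_states, n_actions = 10, 2
rng = np.random.default_rng(0)

n_features = 5
centres = np.linspace(0, n_states - 1, n_features)   # basis parameters (slow time scale)
weights = np.zeros((n_actions, n_features))          # linear weights (fast time scale)

def phi(s, c):
    """Gaussian (RBF) features of state s for centres c."""
    return np.exp(-0.5 * (s - c) ** 2)

def q(s, a, w, c):
    """Linearly approximated state-action value."""
    return w[a] @ phi(s, c)

gamma = 0.95

def step(s, a):
    """Toy chain dynamics: action 1 moves right, action 0 moves left; reward at the last state."""
    s2 = int(np.clip(s + (1 if a == 1 else -1), 0, n_states - 1))
    r = 1.0 if s2 == n_states - 1 else 0.0
    return s2, r

s = 0
for t in range(1, 20001):
    # Epsilon-greedy action selection under the current approximation.
    if rng.random() < 0.1:
        a = int(rng.integers(n_actions))
    else:
        a = int(np.argmax([q(s, b, weights, centres) for b in range(n_actions)]))
    s2, r = step(s, a)

    # Temporal-difference error with the current basis.
    delta = (r + gamma * max(q(s2, b, weights, centres) for b in range(n_actions))
             - q(s, a, weights, centres))

    # Fast time scale: Q-learning update of the linear weights.
    alpha = 1.0 / t
    weights[a] += alpha * delta * phi(s, centres)

    # Slow time scale: adapt the basis parameters, here by a semi-gradient
    # step of the squared TD error with respect to the RBF centres.
    beta = 1.0 / t ** 1.5
    centres += beta * delta * weights[a] * phi(s, centres) * (s - centres)

    s = 0 if s2 == n_states - 1 else s2   # restart the episode at the goal

print("learned weights:\n", weights)
print("adapted centres:\n", centres)
```

The separation of step sizes (alpha decaying faster relative to the weight updates than beta relative to the basis updates would in the actual analysis; here simply beta << alpha) is what the two-timescale stochastic approximation argument relies on: the weights effectively equilibrate for each slowly varying basis.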