Sciweavers

ECML
2005
Springer

Towards Finite-Sample Convergence of Direct Reinforcement Learning

14 years 5 months ago
Towards Finite-Sample Convergence of Direct Reinforcement Learning
Abstract. While direct, model-free reinforcement learning often performs better than model-based approaches in practice, only the latter have yet supported theoretical guarantees for finite-sample convergence. A major difficulty in analyzing the direct approach in an online setting is the absence of a definitive exploration strategy. We extend the notion of admissibility to direct reinforcement learning and show that standard Q-learning with optimistic initial values and constant learning rate is admissible. The notion justifies the use of a greedy strategy that we believe performs very well in practice and holds theoretical significance in deriving finite-sample convergence for direct reinforcement learning. We present empirical evidence that supports our idea.
Shiau Hong Lim, Gerald DeJong
Added 27 Jun 2010
Updated 27 Jun 2010
Type Conference
Year 2005
Where ECML
Authors Shiau Hong Lim, Gerald DeJong
Comments (0)