Sciweavers

» Parallel Reinforcement Learning with Linear Function Approxi...
ICML
2005
IEEE
14 years 8 months ago
Proto-value functions: developmental reinforcement learning
This paper presents a novel framework called proto-reinforcement learning (PRL), based on a mathematical model of a proto-value function: these are task-independent basis function...
Sridhar Mahadevan
ICML
2008
IEEE
14 years 8 months ago
Sample-based learning and search with permanent and transient memories
We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...
David Silver, Martin Müller, Richard S. ...
PKDD
2009
Springer
14 years 2 months ago
Compositional Models for Reinforcement Learning
Innovations such as optimistic exploration, function approximation, and hierarchical decomposition have helped scale reinforcement learning to more complex environments, ...
Nicholas K. Jong, Peter Stone
ICML
2008
IEEE
14 years 8 months ago
A worst-case comparison between temporal difference and residual gradient with linear function approximation
Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...
Lihong Li
ICML
2003
IEEE
14 years 8 months ago
TD(0) Converges Provably Faster than the Residual Gradient Algorithm
In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...
Ralf Schoknecht, Artur Merke