Search Sciweavers | Sciweavers

17 search results - page 1 / 4

» Fast gradient-descent methods for temporal-difference learni...

ICML
2001
IEEE

Off-Policy Temporal Difference Learning with Function Approximation

16 years 7 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta

NIPS
2007

Incremental Natural Actor-Critic Algorithms

15 years 8 months ago

Download books.nips.cc

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...

CORR
2010
Springer

Predictive State Temporal Difference Learning

15 years 5 months ago

Download www.cs.cmu.edu

We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications...

Byron Boots, Geoffrey J. Gordon

ICML
2009
IEEE

Regularization and feature selection in least-squares temporal difference learning

16 years 7 months ago

Download ai.stanford.edu

We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in particular the Least-Squares Temporal Difference (L...

J. Zico Kolter, Andrew Y. Ng

ICML
1995
IEEE

Residual Algorithms: Reinforcement Learning with Function Approximation

16 years 7 months ago

Download www.leemon.com

A number of reinforcement learning algorithms have been developed that are guaranteed to converge to the optimal solution when used with lookup tables. It is shown, however, that ...

Leemon C. Baird III
