Search Sciweavers | Sciweavers

49 search results - page 4 / 10

» Temporal Difference and Policy Search Methods for Reinforcem...

click to vote

ATAL
2008
Springer

123views Intelligent Agents» more ATAL 2008»

Sigma point policy iteration

13 years 9 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

claim paper

Read More »

click to vote

ICML
2003
IEEE

150views Machine Learning» more ICML 2003»

The Significance of Temporal-Difference Learning in Self-Play Training TD-Rummy versus EVO-rummy

14 years 22 days ago

Download www.hpl.hp.com

Reinforcement learning has been used for training game playing agents. The value function for a complex game must be approximated with a continuous function because the number of ...

Clifford Kotnik, Jugal K. Kalita

claim paper

Read More »

click to vote

CDC
2010
IEEE

136views Control Systems» more CDC 2010»

Pathologies of temporal difference methods in approximate dynamic programming

13 years 2 months ago

Download web.mit.edu

Approximate policy iteration methods based on temporal differences are popular in practice, and have been tested extensively, dating to the early nineties, but the associated conve...

Dimitri P. Bertsekas

claim paper

Read More »

click to vote

ICML
2010
IEEE

222views Machine Learning» more ICML 2010»

Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda

13 years 5 months ago

Download www.icml2010.org

Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...

Carlton Downey, Scott Sanner

claim paper

Read More »

click to vote

CORR
2010
Springer

204views Education» more CORR 2010»

Predictive State Temporal Difference Learning

13 years 6 months ago

Download www.cs.cmu.edu

We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identiﬁcation. In practical applications...

Byron Boots, Geoffrey J. Gordon

claim paper

Read More »

« Prev « First page 4 / 10 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers