Sciweavers


ML
2002
ACM


Technical Update: Least-Squares Temporal Difference Learning


Download www.research.rutgers.edu

TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It has two major drawbacks: it may make inefficient use of data, and it requires the user to manually tune a stepsize schedule for good performance. For the case of linear value function approximations and λ = 0, the Least-Squares TD (LSTD) algorithm of Bradtke and Barto (1996, Machine Learning, 22:1
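
As a minimal sketch of the contrast the abstract draws, the Python below runs linear TD(0) and an LSTD(0) solve on a hypothetical three-state chain. The chain, the features [1, s], the discount of 1, the stepsize of 0.1, and all function names are assumptions chosen for illustration; none of them come from the paper. The LSTD routine follows the standard formulation (accumulate A as the sum of φ(φ − γφ′)ᵀ and b as the sum of φ·r, then solve Aθ = b) and should not be read as the author's exact algorithm.

```python
# Hypothetical toy problem (not from the paper): a deterministic chain
# 0 -> 1 -> 2 -> terminal with reward -1 per step and no discounting.
GAMMA = 1.0
TERMINAL = None
EPISODE = [(0, -1.0, 1), (1, -1.0, 2), (2, -1.0, TERMINAL)]  # (s, r, s')


def features(state):
    """Assumed linear features phi(s) = [1, s]; the terminal state maps to zeros."""
    if state is TERMINAL:
        return [0.0, 0.0]
    return [1.0, float(state)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def td0(episodes, stepsize):
    """Incremental linear TD(0): one stepsize-scaled update per transition."""
    theta = [0.0, 0.0]
    for _ in range(episodes):
        for s, r, s_next in EPISODE:
            phi, phi_next = features(s), features(s_next)
            delta = r + GAMMA * dot(theta, phi_next) - dot(theta, phi)
            theta = [t + stepsize * delta * f for t, f in zip(theta, phi)]
    return theta


def solve(a, b):
    """Solve a small dense system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * y for x, y in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def lstd0(transitions):
    """LSTD(0): accumulate A and b over the data, then solve A theta = b."""
    k = len(features(0))
    a = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, r, s_next in transitions:
        phi, phi_next = features(s), features(s_next)
        diff = [p - GAMMA * q for p, q in zip(phi, phi_next)]
        for i in range(k):
            b[i] += phi[i] * r
            for j in range(k):
                a[i][j] += phi[i] * diff[j]
    return solve(a, b)


theta_td_1 = td0(episodes=1, stepsize=0.1)      # one pass over the data
theta_td_500 = td0(episodes=500, stepsize=0.1)  # many passes over the data
theta_lstd = lstd0(EPISODE)                     # one pass, no stepsize
```

On this toy chain the true values are linear in the assumed features, so the LSTD solve recovers θ = [−3, 1] from a single episode without any stepsize, while TD(0) with stepsize 0.1 is still far from that after one episode and only approaches it after many passes.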

Justin A. Boyan


Algorithm | Approximate Policy Evaluation | Machine Learning | ML 2002 | Value Function


Related Content

» Least-Squares Temporal Difference Learning

» Least Squares SVM for Least Squares TD Learning

» Incremental Least-Squares Temporal Difference Learning

» Convergence of Least Squares Temporal Difference Methods Under General Conditions

» Model-Free Least-Squares Policy Iteration

» Recursive least squares dictionary learning algorithm

» Least Square Incremental Linear Discriminant Analysis

» Partial least squares regression for graph mining

» Regularization and feature selection in least-squares temporal difference learning

Post Info

Added	22 Dec 2010
Updated	22 Dec 2010
Type	Journal
Year	2002
Where	ML
Authors	Justin A. Boyan
