Sciweavers

223 search results for "Least-Squares Temporal Difference Learning" (page 11 of 45)
NECO
2010
13 years 7 months ago
Hyperbolically Discounted Temporal Difference Learning
William H. Alexander, Joshua W. Brown
WCE
2007
13 years 9 months ago
Feature Reconstruction for Face Recognition Based on Sample Image Learning
The pose problem is a big challenge for applying face recognition technology under real-world conditions. In this paper, an appearance-based approach is proposed to recognize face acr...
Hongzhou Zhang, Yongping Li, Lin Wang, Chengbo Wan...
ICML
2008
IEEE
14 years 9 months ago
A worst-case comparison between temporal difference and residual gradient with linear function approximation
Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...
Lihong Li
ICML
2010
IEEE
13 years 6 months ago
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner
COLT
2000
Springer
14 years 1 month ago
Bias-Variance Error Bounds for Temporal Difference Updates
We give the first rigorous upper bounds on the error of temporal difference (TD) algorithms for policy evaluation as a function of the amount of experience. These upper bounds pr...
Michael J. Kearns, Satinder P. Singh