Sciweavers


ICML
1995
IEEE



Residual Algorithms: Reinforcement Learning with Function Approximation


Download www.leemon.com

A number of reinforcement learning algorithms have been developed that are guaranteed to converge to the optimal solution when used with lookup tables. It is shown, however, that these algorithms can easily become unstable when implemented directly with a general function-approximation system, such as a sigmoidal multilayer perceptron, a radial-basis-function system, a memory-based learning system, or even a linear function-approximation system. A new class of algorithms, residual gradient algorithms, is proposed, which perform gradient descent on the mean squared Bellman residual, guaranteeing convergence. It is shown, however, that they may learn very slowly in some cases. A larger class of algorithms, residual algorithms, is proposed that has the guaranteed convergence of the residual gradient algorithms, yet can retain the fast learning speed of direct algorithms. In fact, both direct and residual gradient algorithms are shown to be special cases of residual algorithms, and it is shown...
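
The relationship between the three kinds of update described in the abstract can be illustrated with a small sketch. It is not taken from the listing: it assumes a linear value function, a single deterministic transition, and a mixing weight called `phi` here, where `phi = 0` corresponds to a direct update and `phi = 1` to a residual gradient update. The function name, parameter names, and default values are hypothetical.

```python
def residual_update(w, x, x_next, reward, gamma=0.9, alpha=0.1, phi=1.0):
    """One weight update for a linear value function V(s) = w . features(s).

    Assumed (illustrative) conventions:
      x, x_next : feature vectors of the current and successor state
      phi = 0.0 : direct update (ignores the gradient of the successor value)
      phi = 1.0 : residual gradient update (descends the squared Bellman residual)
      0 < phi < 1 : a mixture of the two
    """
    v = sum(wi * xi for wi, xi in zip(w, x))
    v_next = sum(wi * xi for wi, xi in zip(w, x_next))
    # Bellman residual for this single, deterministic transition.
    delta = reward + gamma * v_next - v
    # For a linear V, the gradient of V(s) with respect to w is the feature vector of s.
    return [wi - alpha * delta * (phi * gamma * xn - xi)
            for wi, xi, xn in zip(w, x, x_next)]


w0 = [0.0, 0.0]
direct = residual_update(w0, [1.0, 0.0], [0.0, 1.0], reward=1.0, gamma=0.5, phi=0.0)
resgrad = residual_update(w0, [1.0, 0.0], [0.0, 1.0], reward=1.0, gamma=0.5, phi=1.0)
```

In this toy call the direct update changes only the weight of the current state's feature, while the residual gradient update also moves the weight of the successor's feature. With stochastic transitions, an unbiased residual gradient step would need two independently sampled successor states, which this sketch does not model.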

Leemon C. Baird III


ICML 1995 | Machine Learning | Reinforcement Learning Algorithms | Residual Algorithms | Residual Gradient Algorithms


Related Content

» Convergence of synchronous reinforcement learning with linear function approximation

» TD(0) Converges Provably Faster than the Residual Gradient Algorithm

» Basis Function Construction in Reinforcement Learning Using Cascade-Correlation Learning Ar...

» Off-Policy Temporal Difference Learning with Function Approximation

» Regularized Policy Iteration

» Tracking value function dynamics to improve reinforcement learning with piecewise linear f...

» A Convergent Reinforcement Learning Algorithm in the Continuous Case: The Finite-Element Rei...

» A worst-case comparison between temporal difference and residual gradient with linear funct...

» Global Versus Local Constructive Function Approximation for On-Line Reinforcement Learning

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	1995
Where	ICML
Authors	Leemon C. Baird III
