TD(0) Converges Provably Faster than the Residual Gradient Algorithm

In reinforcement learning (RL), there is some experimental evidence that the residual gradient algorithm converges more slowly than the TD(0) algorithm. In this paper, we use the concept of asymptotic convergence rate to prove that, under certain conditions, the synchronous off-policy TD(0) algorithm converges faster than the synchronous off-policy residual gradient algorithm if the value function is represented in tabular form. This is the first theoretical result comparing the convergence behaviour of two RL algorithms. We also show that, as soon as linear function approximation is involved, no general statement concerning the superiority of one of the algorithms can be made.
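To make the two update rules in the abstract concrete, the following is a minimal sketch of one synchronous tabular step of each algorithm. It is not taken from the paper: it assumes expected (model-based) updates with every state weighted equally, and the two-state chain `P`, rewards `R`, discount `GAMMA`, step size `ALPHA`, and all function names are hypothetical choices made only for illustration.

```python
# Illustrative sketch only: a hypothetical 2-state Markov chain, not an example from the paper.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]

def bellman_residual(P, r, gamma, V):
    """Expected TD error for every state: r + gamma * P V - V."""
    PV = matvec(P, V)
    return [ri + gamma * pv - vi for ri, pv, vi in zip(r, PV, V)]

def td0_step(P, r, gamma, alpha, V):
    """Synchronous tabular TD(0): move each V(s) along its expected TD error."""
    delta = bellman_residual(P, r, gamma, V)
    return [v + alpha * d for v, d in zip(V, delta)]

def rg_step(P, r, gamma, alpha, V):
    """Synchronous tabular residual gradient: gradient descent on
    0.5 * ||r + gamma * P V - V||^2, whose gradient is -(I - gamma P)^T delta."""
    delta = bellman_residual(P, r, gamma, V)
    Pt_delta = matvec(transpose(P), delta)
    return [v + alpha * (d - gamma * ptd) for v, d, ptd in zip(V, delta, Pt_delta)]

def max_error(V, V_ref):
    """Largest absolute difference between two value vectors."""
    return max(abs(a - b) for a, b in zip(V, V_ref))

# Assumed toy problem (hypothetical values chosen for easy arithmetic).
P = [[0.5, 0.5], [0.5, 0.5]]   # transition probabilities
R = [1.0, 0.0]                 # expected one-step rewards
GAMMA = 0.9                    # discount factor
ALPHA = 0.5                    # step size shared by both algorithms
V_STAR = [5.5, 4.5]            # solves (I - GAMMA * P) V = R for this chain

V_td = [0.0, 0.0]
V_rg = [0.0, 0.0]
for _ in range(200):
    V_td = td0_step(P, R, GAMMA, ALPHA, V_td)
    V_rg = rg_step(P, R, GAMMA, ALPHA, V_rg)

td_err = max_error(V_td, V_STAR)
rg_err = max_error(V_rg, V_STAR)
print("TD(0) error after 200 steps:", td_err)
print("Residual gradient error after 200 steps:", rg_err)
```

On this particular chain the TD(0) iterate ends up much closer to the fixed point than the residual gradient iterate after the same number of steps. That is consistent with the tabular result stated in the abstract, but a single toy example illustrates the claim rather than establishing it, and the paper's exact conditions may differ from the assumptions made here.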

Ralf Schoknecht, Artur Merke

Asymptotic Convergence Rate | ICML 2003 | Machine Learning | Residual Gradient Algorithm | RL Algorithms |

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2003
Where	ICML
Authors	Ralf Schoknecht, Artur Merke
