High-accuracy value-function approximation with neural networks applied to the acrobot

Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this paper, we present experimental results obtained by using a feedforward neural network instead. The learning algorithm used was model-based continuous TD(λ). It generated an efficient controller, producing a high-accuracy state-value function. A striking feature of this value function is a very sharp 4-dimensional ridge that is extremely hard to evaluate with linear parametric approximators. From a broader point of view, this experimental success demonstrates some of the qualities of feedforward neural networks in comparison with linear approximators in reinforcement learning.
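As a rough illustration of the idea in the abstract, the sketch below trains a small feedforward network as a state-value function using TD(λ) with eligibility traces. It is not the paper's method: it uses ordinary discrete-time, semi-gradient TD(λ) rather than the model-based continuous variant, and a toy 10-step chain (reward 1 on the final step) rather than the Acrobot. All names (`HIDDEN`, `init_params`, `value_and_grad`, `train`) and all hyperparameters are hypothetical choices made for this example.

```python
import math
import random

HIDDEN = 8  # hypothetical network size, chosen only for this toy example


def init_params(seed=0):
    """Flat parameter list: input weights, hidden biases, output weights, output bias."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN)]
    b1 = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN)]
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)]
    return w1 + b1 + w2 + [0.0]


def value_and_grad(theta, s):
    """Return V(s) and dV/dtheta for a one-hidden-layer tanh network with scalar input s."""
    w1, b1 = theta[:HIDDEN], theta[HIDDEN:2 * HIDDEN]
    w2, b2 = theta[2 * HIDDEN:3 * HIDDEN], theta[3 * HIDDEN]
    h = [math.tanh(w * s + b) for w, b in zip(w1, b1)]
    v = sum(w * hi for w, hi in zip(w2, h)) + b2
    dpre = [w * (1.0 - hi * hi) for w, hi in zip(w2, h)]  # dV/d(pre-activation)
    grad = [d * s for d in dpre] + dpre + h + [1.0]  # same ordering as theta
    return v, grad


def train(episodes=5000, gamma=0.9, lam=0.5, alpha=0.01, steps=10):
    """Semi-gradient TD(lambda) on a deterministic chain s = 0, 1/steps, ..., (steps-1)/steps."""
    theta = init_params()
    for _ in range(episodes):
        trace = [0.0] * len(theta)  # eligibility trace, reset each episode
        for k in range(steps):
            s = k / steps
            v, grad = value_and_grad(theta, s)
            terminal = (k == steps - 1)
            reward = 1.0 if terminal else 0.0
            v_next = 0.0 if terminal else value_and_grad(theta, (k + 1) / steps)[0]
            delta = reward + gamma * v_next - v  # TD error
            trace = [gamma * lam * e + g for e, g in zip(trace, grad)]
            theta = [p + alpha * delta * e for p, e in zip(theta, trace)]
    return theta


if __name__ == "__main__":
    theta = train()
    for k in range(10):
        print(k / 10, round(value_and_grad(theta, k / 10)[0], 3))
```

On this chain the exact values are `gamma ** (9 - k)` at state `k / 10`, so the printed estimates can be compared against that curve. A linear approximator would do well on a target this smooth; the paper's point concerns much sharper value functions, which this toy problem does not reproduce.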

Rémi Coulom

ESANN 2004 | ESANN 2007 | Feedforward Neural Networks | Linear Function Approximators | Value Function |

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2004
Where	ESANN
Authors	Rémi Coulom
