Hierarchical reinforcement learning with subpolicies specializing for learned subgoals

This paper describes a method for hierarchical reinforcement learning in which high-level policies automatically discover subgoals, and low-level policies learn to specialize for different subgoals. Subgoals are represented as abstract observations which cluster raw input data. High-level value functions cover the state space at a coarse level; low-level value functions cover only parts of the state space at a fine-grained level. An experiment shows that this method outperforms several flat reinforcement learning methods. A second experiment shows how problems of observability due to observation abstraction can be overcome using high-level policies with memory.

Key words: Reinforcement learning, hierarchical reinforcement learning, feedforward neural networks, recurrent neural networks, MDPs, POMDPs, short-term memory
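The two ideas in the abstract, abstract observations that cluster raw input and a coarse high-level value function over them, can be illustrated with a small sketch. This is not the authors' implementation: the listing names neural networks, whereas the sketch below assumes a nearest-centroid clustering and a tabular Q-learning update purely for illustration. The names `abstract_observation`, `high_level_update`, `centroids`, and `q_high`, and the values of `alpha` and `gamma`, are all hypothetical.

```python
import numpy as np

def abstract_observation(raw_obs, centroids):
    """Map a raw observation to the index of its nearest cluster centre.

    Assumption: clusters are fixed centroids; the abstract only says that
    abstract observations cluster raw input data.
    """
    distances = np.linalg.norm(centroids - raw_obs, axis=1)
    return int(np.argmin(distances))

def high_level_update(q_high, abs_obs, subgoal, reward, next_abs_obs,
                      alpha=0.1, gamma=0.95):
    """One tabular Q-learning step over (abstract observation, subgoal) pairs.

    Assumption: a table stands in for the coarse high-level value function.
    """
    target = reward + gamma * np.max(q_high[next_abs_obs])
    q_high[abs_obs, subgoal] += alpha * (target - q_high[abs_obs, subgoal])
    return q_high

# Hypothetical cluster centres in a 2-D input space.
centroids = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

# Rows: current abstract observation. Columns: subgoal (a target abstract observation).
q_high = np.zeros((3, 3))

start = abstract_observation(np.array([0.1, -0.1]), centroids)
end = abstract_observation(np.array([0.9, 0.2]), centroids)

# The high level chose subgoal `end` from `start`, and a low-level policy reached it.
q_high = high_level_update(q_high, start, subgoal=end, reward=1.0, next_abs_obs=end)
```

In this sketch a subgoal is simply the index of a target cluster, so the high-level table stays small no matter how fine-grained the raw input is. A low-level policy specializing for one subgoal would work on the raw observations within the region it is responsible for.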

Bram Bakker, Jürgen Schmidhuber

Hierarchical Reinforcement | NCI 2004 | Neural Networks | Reinforcement Learning | Reinforcement Learning Methods |

Post Info

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2004
Where	NCI
Authors	Bram Bakker, Jürgen Schmidhuber
