NCI
2004

Hierarchical reinforcement learning with subpolicies specializing for learned subgoals

This paper describes a method for hierarchical reinforcement learning in which high-level policies automatically discover subgoals, and low-level policies learn to specialize for different subgoals. Subgoals are represented as desired abstract observations which cluster raw input data. High-level value functions cover the state space at a coarse level; low-level value functions cover only parts of the state space at a fine-grained level. An experiment shows that this method outperforms several flat reinforcement learning methods. A second experiment shows how problems of partial observability caused by observation abstraction can be overcome using high-level policies with memory.
Key words Reinforcement learning, hierarchical reinforcement learning, feedforward neural networks, recurrent neural networks, MDPs, POMDPs, short-term memory
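As a rough illustration of the subgoal representation mentioned in the abstract, the sketch below maps a raw observation to an abstract observation by nearest-centroid clustering and treats a subgoal as a desired cluster index. This is only a minimal sketch under stated assumptions, not the authors' implementation: the fixed centroids, the Euclidean distance, and the names `CENTROIDS`, `abstract_observation`, and `subgoal_reached` are all hypothetical, and the listing does not say how the paper actually learns its clusters.

```python
import math

# Hypothetical cluster centres standing in for learned abstract observations.
# In practice these would be learned from raw input data, not fixed by hand.
CENTROIDS = [(0.0, 0.0), (5.0, 0.0), (5.0, 5.0)]


def abstract_observation(raw, centroids=CENTROIDS):
    """Return the index of the centroid nearest to the raw observation."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(raw, centroids[i]))


def subgoal_reached(raw, subgoal, centroids=CENTROIDS):
    """True when the raw observation falls in the cluster chosen as subgoal."""
    return abstract_observation(raw, centroids) == subgoal
```

Under these assumptions, a high-level policy would choose a cluster index as the subgoal, and a low-level policy could be rewarded once `subgoal_reached` becomes true for the current raw observation.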
Bram Bakker, Jürgen Schmidhuber
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2004
Where NCI
Authors Bram Bakker, Jürgen Schmidhuber