Sciweavers

50 search results - page 3 / 10
» Learning Policies with External Memory
Sort
View
ICANN
2010
Springer
13 years 10 months ago
Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients
Abstract. Developing superior artificial board-game players is a widelystudied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, des...
Mandy Grüttner, Frank Sehnke, Tom Schaul, J&u...
ICANN
2007
Springer
14 years 4 months ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
Abstract. This paper presents Recurrent Policy Gradients, a modelfree reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...
NCI
2004
185views Neural Networks» more  NCI 2004»
13 years 11 months ago
Hierarchical reinforcement learning with subpolicies specializing for learned subgoals
This paper describes a method for hierarchical reinforcement learning in which high-level policies automatically discover subgoals, and low-level policies learn to specialize for ...
Bram Bakker, Jürgen Schmidhuber
ECML
2007
Springer
14 years 4 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
ECML
2005
Springer
14 years 3 months ago
Model-Based Online Learning of POMDPs
Abstract. Learning to act in an unknown partially observable domain is a difficult variant of the reinforcement learning paradigm. Research in the area has focused on model-free m...
Guy Shani, Ronen I. Brafman, Solomon Eyal Shimony