Sciweavers

133 search results - page 4 / 27
» Hierarchical Policy Gradient Algorithms
Sort
View
ICMLA
2010
13 years 7 months ago
Multimodal Parameter-exploring Policy Gradients
Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
ICRA
2005
IEEE
159views Robotics» more  ICRA 2005»
14 years 3 months ago
Learning Sensory Feedback to CPG with Policy Gradient for Biped Locomotion
— This paper proposes a learning framework for a CPG-based biped locomotion controller using a policy gradient method. Our goal in this study is to develop an efficient learning...
Takamitsu Matsubara, Jun Morimoto, Jun Nakanishi, ...
AAAI
2010
13 years 11 months ago
Multi-Agent Learning with Policy Prediction
Due to the non-stationary environment, learning in multi-agent systems is a challenging problem. This paper first introduces a new gradient-based learning algorithm, augmenting th...
Chongjie Zhang, Victor R. Lesser
IROS
2006
IEEE
113views Robotics» more  IROS 2006»
14 years 3 months ago
Policy Gradient Methods for Robotics
— The aquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-struc...
Jan Peters, Stefan Schaal
ICANN
2010
Springer
13 years 10 months ago
Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients
Abstract. Developing superior artificial board-game players is a widelystudied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, des...
Mandy Grüttner, Frank Sehnke, Tom Schaul, J&u...