Sciweavers

» Policy Gradients for Cryptanalysis
NIPS
2008
13 years 9 months ago
Particle Filter-based Policy Gradient in POMDPs
Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...
Pierre-Arnaud Coquelin, Romain Deguest, Rém...
ICRA
2005
IEEE
14 years 1 months ago
Learning Sensory Feedback to CPG with Policy Gradient for Biped Locomotion
This paper proposes a learning framework for a CPG-based biped locomotion controller using a policy gradient method. Our goal in this study is to develop an efficient learning...
Takamitsu Matsubara, Jun Morimoto, Jun Nakanishi, ...
ICANN
2010
Springer
13 years 7 months ago
Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients
Developing superior artificial board-game players is a widely studied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, des...
Mandy Grüttner, Frank Sehnke, Tom Schaul, J...
IROS
2007
IEEE
14 years 1 months ago
An extended policy gradient algorithm for robot task learning
Andrea Cherubini, Francesca Giannone, Luca Iocchi,...
AIPS
2007
13 years 10 months ago
FF + FPG: Guiding a Policy-Gradient Planner
Olivier Buffet, Douglas Aberdeen