Sciweavers

91 search results - page 6 / 19
» Parameter-exploring policy gradients
Sort
View
ECML
2007
Springer
14 years 1 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
ICML
2003
IEEE
14 years 8 months ago
Hierarchical Policy Gradient Algorithms
Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...
Mohammad Ghavamzadeh, Sridhar Mahadevan
IROS
2006
IEEE
113views Robotics» more  IROS 2006»
14 years 1 months ago
Policy Gradient Methods for Robotics
— The aquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-struc...
Jan Peters, Stefan Schaal
ICANN
2010
Springer
13 years 8 months ago
Policy Gradients for Cryptanalysis
So-called Physical Unclonable Functions are an emerging, new cryptographic and security primitive. They can potentially replace secret binary keys in vulnerable hardware systems an...
Frank Sehnke, Christian Osendorfer, Jan Sölte...
NIPS
2008
13 years 9 months ago
Particle Filter-based Policy Gradient in POMDPs
Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...
Pierre-Arnaud Coquelin, Romain Deguest, Rém...