Search Sciweavers | Sciweavers

91 search results - page 6 / 19

» Parameter-exploring policy gradients


ECML
2007
Springer


Policy Gradient Critics


Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber



ICML
2003
IEEE


Hierarchical Policy Gradient Algorithms


Download www.hpl.hp.com

Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...

Mohammad Ghavamzadeh, Sridhar Mahadevan



IROS
2006
IEEE


Policy Gradient Methods for Robotics


Download www.cs.utah.edu

The acquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-struc...

Jan Peters, Stefan Schaal



ICANN
2010
Springer


Policy Gradients for Cryptanalysis


Download www6.in.tum.de

So-called Physical Unclonable Functions are an emerging, new cryptographic and security primitive. They can potentially replace secret binary keys in vulnerable hardware systems an...

Frank Sehnke, Christian Osendorfer, Jan Sölte...



NIPS
2008


Particle Filter-based Policy Gradient in POMDPs


Download eprints.pascal-network.org

Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...

Pierre-Arnaud Coquelin, Romain Deguest, Rém...

