Search Sciweavers | Sciweavers

95 search results - page 5 / 19

» Policy Gradients for Cryptanalysis

189

click to vote

ECML
2007
Springer

192views Machine Learning» more ECML 2007»

Policy Gradient Critics

16 years 29 days ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

claim paper

Read More »

191

click to vote

NN
2010
Springer

125views Neural Networks» more NN 2010»

Parameter-exploring policy gradients

15 years 5 months ago

Download www.kyb.mpg.de

We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in paramet...

Frank Sehnke, Christian Osendorfer, Thomas Rü...

claim paper

Read More »

197

click to vote

ICMLA
2010

203views Machine Learning» more ICMLA 2010»

Multimodal Parameter-exploring Policy Gradients

15 years 4 months ago

Download www6.in.tum.de

Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...

Frank Sehnke, Alex Graves, Christian Osendorfer, J...

claim paper

Read More »

155

click to vote

ICML
2003
IEEE

151views Machine Learning» more ICML 2003»

Hierarchical Policy Gradient Algorithms

16 years 7 months ago

Download www.hpl.hp.com

Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...

Mohammad Ghavamzadeh, Sridhar Mahadevan

claim paper

Read More »

192

click to vote

IROS
2006
IEEE

113views Robotics» more IROS 2006»

Policy Gradient Methods for Robotics

16 years 25 days ago

Download www.cs.utah.edu

— The aquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-struc...

Jan Peters, Stefan Schaal

claim paper

Read More »

« Prev « First page 5 / 19 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers