Search Sciweavers | Sciweavers

95 search results - page 4 / 19

» Policy Gradients for Cryptanalysis

JMLR
2006

Policy Gradient in Continuous Time

Download hal.inria.fr

Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...

Rémi Munos

ICML
2008
IEEE

Non-parametric policy gradients: a unified treatment of propositional and relational domains

Download www-kd.iai.uni-bonn.de

Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches have focused on propositional and continuous domains on...

Kristian Kersting, Kurt Driessens

ESANN
2008

Similarities and differences between policy gradient methods and evolution strategies

Download www.dice.ucl.ac.be

Natural policy gradient methods and the covariance matrix adaptation evolution strategy, two variable metric methods proposed for solving reinforcement learning tasks, are contrast...

Verena Heidrich-Meisner, Christian Igel

IJCAI
2001

Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning

Download www.cs.colorado.edu

Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...

Gregory Z. Grudic, Lyle H. Ungar

ICANN
2007
Springer

Solving Deep Memory POMDPs with Recurrent Policy Gradients

Download www.idsia.ch

This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...

Daan Wierstra, Alexander Förster, Jan Peters,...
