Sciweavers

95 search results - page 4 / 19
» Policy Gradients for Cryptanalysis
Sort
View
JMLR
2006
124views more  JMLR 2006»
13 years 7 months ago
Policy Gradient in Continuous Time
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
Rémi Munos
ICML
2008
IEEE
14 years 8 months ago
Non-parametric policy gradients: a unified treatment of propositional and relational domains
Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches have focused on propositional and continuous domains on...
Kristian Kersting, Kurt Driessens
ESANN
2008
13 years 9 months ago
Similarities and differences between policy gradient methods and evolution strategies
Natural policy gradient methods and the covariance matrix adaptation evolution strategy, two variable metric methods proposed for solving reinforcement learning tasks, are contrast...
Verena Heidrich-Meisner, Christian Igel
IJCAI
2001
13 years 9 months ago
Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning
Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...
Gregory Z. Grudic, Lyle H. Ungar
ICANN
2007
Springer
14 years 1 months ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
Abstract. This paper presents Recurrent Policy Gradients, a modelfree reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...