Search Sciweavers | Sciweavers

147 search results - page 2 / 30

» Policy Gradient in Continuous Time

153

Voted

ICML
2003
IEEE

151views Machine Learning» more ICML 2003»

Hierarchical Policy Gradient Algorithms

16 years 7 months ago

Download www.hpl.hp.com

Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...

Mohammad Ghavamzadeh, Sridhar Mahadevan

claim paper

Read More »

148

click to vote

AIPS
2007

119views Artificial Intelligence» more AIPS 2007»

Concurrent Probabilistic Temporal Planning with Policy-Gradients

15 years 9 months ago

Download eprints.pascal-network.org

We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search t...

Douglas Aberdeen, Olivier Buffet

claim paper

Read More »

175

Voted

AAAI
2010

191views Intelligent Agents» more AAAI 2010»

Relative Entropy Policy Search

15 years 8 months ago

Download www.kyb.tuebingen.mpg.de

Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature conv...

Jan Peters, Katharina Mülling, Yasemin Altun

claim paper

Read More »

172

click to vote

NIPS
2008

116views Information Technology» more NIPS 2008»

Particle Filter-based Policy Gradient in POMDPs

15 years 8 months ago

Download eprints.pascal-network.org

Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...

Pierre-Arnaud Coquelin, Romain Deguest, Rém...

claim paper

Read More »

182

click to vote

ECML
2007
Springer

192views Machine Learning» more ECML 2007»

Policy Gradient Critics

16 years 25 days ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

claim paper

Read More »

« Prev « First page 2 / 30 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers