Sciweavers

147 search results - page 2 / 30
» Policy Gradient in Continuous Time
Sort
View
ICML
2003
IEEE
14 years 10 months ago
Hierarchical Policy Gradient Algorithms
Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...
Mohammad Ghavamzadeh, Sridhar Mahadevan
AIPS
2007
13 years 12 months ago
Concurrent Probabilistic Temporal Planning with Policy-Gradients
We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search t...
Douglas Aberdeen, Olivier Buffet
AAAI
2010
13 years 11 months ago
Relative Entropy Policy Search
Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature conv...
Jan Peters, Katharina Mülling, Yasemin Altun
NIPS
2008
13 years 11 months ago
Particle Filter-based Policy Gradient in POMDPs
Our setting is a Partially Observable Markov Decision Process with continuous state, observation and action spaces. Decisions are based on a Particle Filter for estimating the bel...
Pierre-Arnaud Coquelin, Romain Deguest, Rém...
ECML
2007
Springer
14 years 3 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber