Sciweavers

1398 search results - page 3 / 280
» Bayesian actor-critic algorithms
Sort
View
IJCNN
2006
IEEE
14 years 1 months ago
Reinforcement Learning for Parameterized Motor Primitives
Abstract— One of the major challenges in both action generation for robotics and in the understanding of human motor control is to learn the “building blocks of movement genera...
Jan Peters, Stefan Schaal
ICML
2009
IEEE
14 years 8 months ago
Predictive representations for policy gradient in POMDPs
We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...
Abdeslam Boularias, Brahim Chaib-draa
ECML
2007
Springer
14 years 1 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
ICML
2010
IEEE
13 years 5 months ago
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner
JMLR
2012
11 years 10 months ago
Bayesian Comparison of Machine Learning Algorithms on Single and Multiple Datasets
We propose a new method for comparing learning algorithms on multiple tasks which is based on a novel non-parametric test that we call the Poisson binomial test. The key aspect of...
Alexandre Lacoste, François Laviolette, Mar...