Search Sciweavers | Sciweavers

1398 search results - page 3 / 280

» Bayesian actor-critic algorithms

IJCNN
2006
IEEE

127 views | Neural Networks

Reinforcement Learning for Parameterized Motor Primitives

16 years 20 days ago

Download www-clmc.usc.edu

One of the major challenges in both action generation for robotics and in the understanding of human motor control is to learn the “building blocks of movement genera...

Jan Peters, Stefan Schaal

ICML
2009
IEEE

148 views | Machine Learning

Predictive representations for policy gradient in POMDPs

16 years 7 months ago

Download damas.ift.ulaval.ca

We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...

Abdeslam Boularias, Brahim Chaib-draa

ECML
2007
Springer

192 views | Machine Learning

Policy Gradient Critics

16 years 25 days ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

ICML
2010
IEEE

222 views | Machine Learning

Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda

15 years 4 months ago

Download www.icml2010.org

Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...

Carlton Downey, Scott Sanner

JMLR
2012

183 views | Programming Languages

Bayesian Comparison of Machine Learning Algorithms on Single and Multiple Datasets

13 years 9 months ago

Download jmlr.csail.mit.edu

We propose a new method for comparing learning algorithms on multiple tasks which is based on a novel non-parametric test that we call the Poisson binomial test. The key aspect of...

Alexandre Lacoste, François Laviolette, Mar...
