Search Sciweavers | Sciweavers

536 search results - page 28 / 108

» Residual Algorithms: Reinforcement Learning with Function Ap...

click to vote

ECML
2005
Springer

193views Machine Learning» more ECML 2005»

Natural Actor-Critic

14 years 1 months ago

Download www-clmc.usc.edu

This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari...

Jan Peters, Sethu Vijayakumar, Stefan Schaal

claim paper

Read More »

click to vote

IJCAI
2007

254views Artificial Intelligence» more IJCAI 2007»

Bayesian Inverse Reinforcement Learning

13 years 9 months ago

Download www.ijcai.org

Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an e...

Deepak Ramachandran, Eyal Amir

claim paper

Read More »

click to vote

ICML
1999
IEEE

168views Machine Learning» more ICML 1999»

Least-Squares Temporal Difference Learning

14 years 8 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

claim paper

Read More »

Voted

ICML
2009
IEEE

186views Machine Learning» more ICML 2009»

Regularization and feature selection in least-squares temporal difference learning

14 years 8 months ago

Download ai.stanford.edu

We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in particular the Least-Squares Temporal Difference (L...

J. Zico Kolter, Andrew Y. Ng

claim paper

Read More »

click to vote

ICML
2010
IEEE

282views Machine Learning» more ICML 2010»

Bayesian Multi-Task Reinforcement Learning

13 years 8 months ago

Download hal.inria.fr

We consider the problem of multi-task reinforcement learning where the learner is provided with a set of tasks, for which only a small number of samples can be generated for any g...

Alessandro Lazaric, Mohammad Ghavamzadeh

claim paper

Read More »

« Prev « First page 28 / 108 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers