Search Sciweavers | Sciweavers

1233 search results - page 174 / 247

» Feudal Reinforcement Learning

205

click to vote

ECML
2006
Springer

146views Machine Learning» more ECML 2006»

Task-Driven Discretization of the Joint Space of Visual Percepts and Continuous Actions

15 years 9 months ago

Download www.montefiore.ulg.ac.be

We target the problem of closed-loop learning of control policies that map visual percepts to continuous actions. Our algorithm, called Reinforcement Learning of Joint Classes (RLJ...

Sébastien Jodogne, Justus H. Piater

claim paper

Read More »

162

click to vote

NIPS
2007

164views Information Technology» more NIPS 2007»

Incremental Natural Actor-Critic Algorithms

15 years 7 months ago

Download books.nips.cc

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...

claim paper

Read More »

145

click to vote

ICML
2009
IEEE

131views Machine Learning» more ICML 2009»

Monte-Carlo simulation balancing

16 years 6 months ago

Download www.cs.ualberta.ca

In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation polic...

David Silver, Gerald Tesauro

claim paper

Read More »

192

click to vote

ICML
2001
IEEE

159views Machine Learning» more ICML 2001»

Direct Policy Search using Paired Statistical Tests

16 years 6 months ago

Download www.autonlab.org

Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maxi...

Malcolm J. A. Strens, Andrew W. Moore

claim paper

Read More »

161

click to vote

ECML
2007
Springer

192views Machine Learning» more ECML 2007»

Policy Gradient Critics

16 years 1 days ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

claim paper

Read More »

« Prev « First page 174 / 247 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers