Search Sciweavers | Sciweavers

473 search results - page 39 / 95

» Optimal policy switching algorithms for reinforcement learni...

ICML
2010
IEEE

282 views, Machine Learning

Bayesian Multi-Task Reinforcement Learning

15 years 7 months ago

Download hal.inria.fr

We consider the problem of multi-task reinforcement learning where the learner is provided with a set of tasks, for which only a small number of samples can be generated for any g...

Alessandro Lazaric, Mohammad Ghavamzadeh

ICML
2001
IEEE

172 views, Machine Learning

Continuous-Time Hierarchical Reinforcement Learning

16 years 6 months ago

Download www.cs.ualberta.ca

Hierarchical reinforcement learning (RL) is a general framework which studies how to exploit the structure of actions and tasks to accelerate policy learning in large domains. Pri...

Mohammad Ghavamzadeh, Sridhar Mahadevan

CORR
2010
Springer

105 views, Education

Optimism in Reinforcement Learning Based on Kullback-Leibler Divergence

15 years 4 months ago

Download hal.archives-ouvertes.fr

We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...

Sarah Filippi, Olivier Cappé, Aurelien Gari...

ICML
2001
IEEE

159 views, Machine Learning

Direct Policy Search using Paired Statistical Tests

16 years 6 months ago

Download www.autonlab.org

Direct policy search is a practical way to solve reinforcement learning problems involving continuous state and action spaces. The goal becomes finding policy parameters that maxi...

Malcolm J. A. Strens, Andrew W. Moore

UAI
2001

129 views, Artificial Intelligence

The Optimal Reward Baseline for Gradient-Based Reinforcement Learning

15 years 7 months ago

Download cs.anu.edu.au

There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...

Lex Weaver, Nigel Tao
