Search Sciweavers | Sciweavers

14 search results - page 1 / 3

ICML
1996
IEEE

Machine Learning

Sensitive Discount Optimality: Unifying Discounted and Average Reward Reinforcement Learning

16 years 6 months ago

Download reference.kfupm.edu.sa

Research in reinforcement learning (RL) has thus far concentrated on two optimality criteria: the discounted framework, which has been very well-studied, and the average reward frame...

Sridhar Mahadevan

ICML
2001
IEEE

Machine Learning

Continuous-Time Hierarchical Reinforcement Learning

16 years 6 months ago

Download www.cs.ualberta.ca

Hierarchical reinforcement learning (RL) is a general framework which studies how to exploit the structure of actions and tasks to accelerate policy learning in large domains. Pri...

Mohammad Ghavamzadeh, Sridhar Mahadevan

ICML
2002
IEEE

Machine Learning

Hierarchically Optimal Average Reward Reinforcement Learning

16 years 6 months ago

Download www.cs.ualberta.ca

Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...

Mohammad Ghavamzadeh, Sridhar Mahadevan

AI
1998
Springer

Artificial Intelligence

Model-Based Average Reward Reinforcement Learning

15 years 5 months ago

Download web.engr.oregonstate.edu

Reinforcement Learning (RL) is the study of programs that improve their performance by receiving rewards and punishments from the environment. Most RL methods optimize the discoun...

Prasad Tadepalli, DoKyeong Ok

UAI
2001

Artificial Intelligence

The Optimal Reward Baseline for Gradient-Based Reinforcement Learning

15 years 6 months ago

Download cs.anu.edu.au

There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...

Lex Weaver, Nigel Tao
