Search Sciweavers | Sciweavers

81 search results - page 1 / 17

UAI
2001

The Optimal Reward Baseline for Gradient-Based Reinforcement Learning

There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...

Lex Weaver, Nigel Tao

GECCO
2004
Springer

Gradient-Based Learning Updates Improve XCS Performance in Multistep Problems

This paper introduces a gradient-based reward prediction update mechanism to the XCS classifier system as applied in neural-network type learning and function approximation mechani...

Martin V. Butz, David E. Goldberg, Pier Luca Lanzi

NIPS
2001

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...

IJCAI
2001

Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning

Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...

Gregory Z. Grudic, Lyle H. Ungar

ICML
2002
IEEE

Hierarchically Optimal Average Reward Reinforcement Learning

Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...

Mohammad Ghavamzadeh, Sridhar Mahadevan
