Search Sciweavers | Sciweavers

47 search results - page 5 / 10

» Reinforcement learning with function approximation for coope...

FLAIRS
2003

141 views, Artificial Intelligence

Learning from Reinforcement and Advice Using Composite Reward Functions

13 years 8 months ago

Download ranger.uta.edu

Reinforcement learning has become a widely used methodology for creating intelligent agents in a wide range of applications. However, its performance deteriorates in tasks with s...

Vinay N. Papudesi, Manfred Huber

ATAL
2010
Springer

181 views, Intelligent Agents

Basis function construction for hierarchical reinforcement learning

13 years 8 months ago

Download www.cs.brown.edu

This paper introduces an approach to automatic basis function construction for Hierarchical Reinforcement Learning (HRL) tasks. We describe some considerations that arise when con...

Sarah Osentoski, Sridhar Mahadevan

IJON
2006

90 views

Reinforcement learning of a simple control task using the spike response model

13 years 7 months ago

Download www.xdr.com

In this work, we propose a variation of a direct reinforcement learning algorithm, suitable for usage with spiking neurons based on the spike response model (SRM). The SRM is a bi...

Murilo Saraiva de Queiroz, Roberto Coelho de Berr...

ATAL
2007
Springer

122 views, Intelligent Agents

Reducing the complexity of multiagent reinforcement learning

14 years 1 month ago

Download www.damas.ift.ulaval.ca

It is known that the complexity of the reinforcement learning algorithms, such as Q-learning, may be exponential in the number of environment’s states. It was shown, however, th...

Andriy Burkov, Brahim Chaib-draa

ICML
2001
IEEE

185 views, Machine Learning

Off-Policy Temporal Difference Learning with Function Approximation

14 years 7 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
