Search Sciweavers | Sciweavers

226 search results - page 28 / 46

» A Convergent Reinforcement Learning Algorithm in the Continu...

click to vote

ICML
2003
IEEE

105views Machine Learning» more ICML 2003»

Principled Methods for Advising Reinforcement Learning Agents

14 years 8 months ago

Download www.hpl.hp.com

An important issue in reinforcement learning is how to incorporate expert knowledge in a principled manner, especially as we scale up to real-world tasks. In this paper, we presen...

Eric Wiewiora, Garrison W. Cottrell, Charles Elkan

claim paper

Read More »

click to vote

NIPS
2008

165views Information Technology» more NIPS 2008»

Regularized Policy Iteration

13 years 8 months ago

Download webdocs.cs.ualberta.ca

In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...

Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...

claim paper

Read More »

click to vote

ATAL
2006
Springer

135views Intelligent Agents» more ATAL 2006»

Learning the required number of agents for complex tasks

13 years 11 months ago

Download www.damas.ift.ulaval.ca

Coordinating agents in a complex environment is a hard problem, but it can become even harder when certain characteristics of the tasks, like the required number of agents, are un...

Sébastien Paquet, Brahim Chaib-draa

claim paper

Read More »

click to vote

ROBOCUP
2000
Springer

130views Robotics» more ROBOCUP 2000»

Improvement Continuous Valued Q-learning and Its Application to Vision Guided Behavior Acquisition

13 years 11 months ago

Download www.er.ams.eng.osaka-u.ac.jp

Q-learning, a most widely used reinforcement learning method, normally needs well-defined quantized state and action spaces to converge. This makes it difficult to be applied to re...

Yasutake Takahashi, Masanori Takeda, Minoru Asada

claim paper

Read More »

click to vote

ICML
2001
IEEE

185views Machine Learning» more ICML 2001»

Off-Policy Temporal Difference Learning with Function Approximation

14 years 8 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta

claim paper

Read More »

« Prev « First page 28 / 46 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers