Search Sciweavers | Sciweavers

272 search results - page 14 / 55

» Parallel Reinforcement Learning with Linear Function Approxi...

161

click to vote

ICML
2005
IEEE

145views Machine Learning» more ICML 2005»

Proto-value functions: developmental reinforcement learning

16 years 7 months ago

Download www.cs.umass.edu

This paper presents a novel framework called proto-reinforcement learning (PRL), based on a mathematical model of a proto-value function: these are task-independent basis function...

Sridhar Mahadevan

claim paper

Read More »

188

click to vote

ICML
2008
IEEE

117views Machine Learning» more ICML 2008»

Sample-based learning and search with permanent and transient memories

16 years 7 months ago

Download www.cs.ualberta.ca

We present a reinforcement learning architecture, Dyna-2, that encompasses both samplebased learning and sample-based search, and that generalises across states during both learni...

David Silver, Martin Müller 0003, Richard S. ...

claim paper

Read More »

163

Voted

PKDD
2009
Springer

144views Data Mining» more PKDD 2009»

Compositional Models for Reinforcement Learning

16 years 1 months ago

Download userweb.cs.utexas.edu

Abstract. Innovations such as optimistic exploration, function approximation, and hierarchical decomposition have helped scale reinforcement learning to more complex environments, ...

Nicholas K. Jong, Peter Stone

claim paper

Read More »

167

click to vote

ICML
2008
IEEE

165views Machine Learning» more ICML 2008»

A worst-case comparison between temporal difference and residual gradient with linear function approximation

16 years 7 months ago

Download www.research.rutgers.edu

Residual gradient (RG) was proposed as an alternative to TD(0) for policy evaluation when function approximation is used, but there exists little formal analysis comparing them ex...

Lihong Li

claim paper

Read More »

149

click to vote

ICML
2003
IEEE

146views Machine Learning» more ICML 2003»

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

16 years 7 months ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke

claim paper

Read More »

« Prev « First page 14 / 55 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers