Search Sciweavers | Sciweavers

45 search results - page 6 / 9

» Efficient exploration through active learning for value func...

click to vote

RSS
2007

176views Robotics» more RSS 2007»

Active Policy Learning for Robot Planning and Exploration under Uncertainty

13 years 9 months ago

Download www.roboticsproceedings.org

Abstract— This paper proposes a simulation-based active policy learning algorithm for ﬁnite-horizon, partially-observed sequential decision processes. The algorithm is tested i...

Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...

claim paper

Read More »

click to vote

NIPS
2007

207views Information Technology» more NIPS 2007»

Bayes-Adaptive POMDPs

13 years 9 months ago

Download books.nips.cc

Bayesian Reinforcement Learning has generated substantial interest recently, as it provides an elegant solution to the exploration-exploitation trade-off in reinforcement learning...

Stéphane Ross, Brahim Chaib-draa, Joelle Pi...

claim paper

Read More »

click to vote

CORR
2010
Springer

152views Education» more CORR 2010»

Neuroevolutionary optimization

13 years 7 months ago

Download jmlr.csail.mit.edu

Temporal difference methods are theoretically grounded and empirically effective methods for addressing reinforcement learning problems. In most real-world reinforcement learning ...

Eva Volná

claim paper

Read More »

click to vote

UAI
2008

242views Artificial Intelligence» more UAI 2008»

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping

13 years 9 months ago

Download uai2008.cs.helsinki.fi

We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...

Richard S. Sutton, Csaba Szepesvári, Alborz...

claim paper

Read More »

click to vote

ATAL
2008
Springer

123views Intelligent Agents» more ATAL 2008»

Sigma point policy iteration

13 years 9 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

claim paper

Read More »

« Prev « First page 6 / 9 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers