Search Sciweavers | Sciweavers

46

RSS
2007

176views Robotics» more RSS 2007»

Active Policy Learning for Robot Planning and Exploration under Uncertainty

14 years 9 days ago

Abstract— This paper proposes a simulation-based active policy learning algorithm for ﬁnite-horizon, partially-observed sequential decision processes. The algorithm is tested i...

Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...

claim paper

Read More »

41

click to vote

NIPS
2001

206views Information Technology» more NIPS 2001»

Model-Free Least-Squares Policy Iteration

14 years 8 days ago

Download www.cs.duke.edu

We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...

Michail G. Lagoudakis, Ronald Parr

claim paper

Read More »

30

click to vote

UM
2005
Springer

160views Programming Languages» more UM 2005»

Bayesphone: Precomputation of Context-Sensitive Policies for Inquiry and Action in Mobile Devices

14 years 4 months ago

Download research.microsoft.com

Inference and decision making with probabilistic user models may be infeasible on portable devices such as cell phones. We highlight the opportunity for storing and using precomput...

Eric Horvitz, Paul Koch, Raman Sarin, Johnson Apac...

claim paper

Read More »

53

click to vote

IWLCS
2005
Springer

161views Machine Learning» more IWLCS 2005»

Counter Example for Q-Bucket-Brigade Under Prediction Problem

14 years 4 months ago

Download www.cs.bham.ac.uk

Aiming to clarify the convergence or divergence conditions for Learning Classiﬁer System (LCS), this paper explores: (1) an extreme condition where the reinforcement process of ...

Atsushi Wada, Keiki Takadama, Katsunori Shimohara

claim paper

Read More »

42

click to vote

ATAL
2010
Springer

171views Intelligent Agents» more ATAL 2010»

Closing the learning-planning loop with predictive state representations

14 years 8 hour ago

Download www.cs.cmu.edu

A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate model of ...

Byron Boots, Sajid M. Siddiqi, Geoffrey J. Gordon

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers