Byron Boots, Sajid M. Siddiqi, Geoffrey J. Gordon

A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate model of our environment, and then plan to maximize reward. Unfortunately, learning algorithms often recover a model that is too inaccurate to support planning, or too large and complex for planning to be feasible; or they require large amounts of prior domain knowledge, or fail to provide important guarantees such as statistical consistency. To begin to fill this gap, we propose a novel algorithm that provably learns a compact, accurate model directly from sequences of action-observation pairs. To evaluate the learner, we then close the loop from observations to actions: we plan in the learned model and recover a policy that is near-optimal in the original environment (not the model). In more detail, we present a spectral algorithm for learning a Predictive State Representation (PSR). We demonstrate the algorithm by...
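The abstract only names the spectral PSR learner; as a rough illustration of what such an algorithm can look like, the sketch below implements the standard transformed-PSR spectral estimates: a subspace U from a thin SVD of an empirical test/history probability matrix, an initial state b1, a normalization vector binf, and one observable operator B_ao per action-observation pair, plus the filtering update that tracks the predictive state through a sequence of actions and observations. This is a minimal sketch under assumed input conventions (the estimator arguments P_T, P_H, P_TH, P_TaoH and the function names are illustrative), not the paper's algorithm verbatim.

    import numpy as np

    def learn_tpsr(P_T, P_H, P_TH, P_TaoH, k):
        """Spectral learning of a transformed PSR (illustrative sketch).

        P_T    : (n_tests,) probabilities of tests from the initial history
        P_H    : (n_hists,) probabilities of indicative histories
        P_TH   : (n_tests, n_hists) joint test/history probabilities
        P_TaoH : dict {(a, o): (n_tests, n_hists) matrix} of test/history
                 probabilities after executing action a and observing o
        k      : number of latent dimensions to keep
        """
        # Thin SVD of the test/history matrix; U spans the predictive subspace.
        U, _, _ = np.linalg.svd(P_TH, full_matrices=False)
        U = U[:, :k]
        pinv_UtPTH = np.linalg.pinv(U.T @ P_TH)

        b1 = U.T @ P_T                              # initial predictive state
        binf = np.linalg.pinv(P_TH.T @ U) @ P_H     # normalization vector
        # One linear operator per action-observation pair.
        B = {ao: U.T @ M @ pinv_UtPTH for ao, M in P_TaoH.items()}
        return b1, binf, B

    def filter_state(b, binf, B, a, o):
        """One filtering step: update the state after doing a, seeing o."""
        bn = B[(a, o)] @ b
        return bn / (binf @ bn)   # renormalize so predictions stay consistent

Because the updated state b remains a sufficient statistic for future predictions, a planner can treat it like a belief state, which is what makes closing the learning-planning loop possible in this style of model.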