Compact, Convex Upper Bound Iteration for Approximate POMDP Planning

Partially observable Markov decision processes (POMDPs) are an intuitive and general way to model sequential decision-making problems under uncertainty. Unfortunately, even approximate planning in POMDPs is known to be hard, and developing heuristic planners that deliver reasonable results in practice has proved to be a significant challenge. In this paper, we present a new approach to approximate value iteration for POMDP planning that is based on quadratic rather than piecewise linear function approximators. Specifically, we approximate the optimal value function by a convex upper bound composed of a fixed number of quadratics, and optimize it at each stage by semidefinite programming. We demonstrate that our approach can achieve approximation quality competitive with current techniques while still maintaining a bounded-size representation of the function approximator. Moreover, an upper bound on the optimal value function can be preserved if required. Overall, the technique requi...
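To make the contrast in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It compares a piecewise linear value function (a maximum over alpha-vectors) with a single convex quadratic that upper-bounds it on a two-state belief simplex. The function names, the two-state example, and the specific quadratic q(b) = 0.5 * ||b||^2 + 0.5 are all assumptions chosen for illustration; the semidefinite programming step the abstract mentions is not shown.

```python
import numpy as np

def pwl_value(alphas, b):
    """Piecewise linear convex value: max over alpha-vectors of alpha . b."""
    return float(np.max(alphas @ b))

def quadratic_value(W, w, c, b):
    """Quadratic value: b^T W b + w^T b + c (hypothetical parameterization)."""
    return float(b @ W @ b + w @ b + c)

def is_convex_quadratic(W, tol=1e-9):
    """A quadratic is convex when the symmetric part of W is positive semidefinite."""
    sym = 0.5 * (W + W.T)
    return bool(np.all(np.linalg.eigvalsh(sym) >= -tol))

# Illustrative two-state problem (assumed, not taken from the paper).
# Piecewise linear function: V(b) = max(b[0], b[1]).
alphas = np.array([[1.0, 0.0],
                   [0.0, 1.0]])

# Hand-picked convex quadratic: q(b) = 0.5 * ||b||^2 + 0.5.
# On the simplex b = (p, 1 - p) this equals 1 - p + p^2, which is
# at least max(p, 1 - p) and touches it at the two corner beliefs.
W = 0.5 * np.eye(2)
w = np.zeros(2)
c = 0.5

# Check the upper-bound property on a grid of beliefs.
grid = [np.array([p, 1.0 - p]) for p in np.linspace(0.0, 1.0, 101)]
gaps = [quadratic_value(W, w, c, b) - pwl_value(alphas, b) for b in grid]
min_gap = min(gaps)
max_gap = max(gaps)
```

The piecewise linear representation needs one vector per linear piece, so its size can grow as value iteration proceeds, whereas the quadratic here always has the same number of parameters (W, w, c). That fixed size is the property the abstract refers to as a bounded-size representation; how the paper chooses and combines its quadratics is not specified in this excerpt.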

Tao Wang, Pascal Poupart, Michael H. Bowling, Dale Schuurmans

AAAI 2006 | Function Approximator | Intelligent Agents | Optimal Value Function | Upper Bound

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2006
Where	AAAI
Authors	Tao Wang, Pascal Poupart, Michael H. Bowling, Dale Schuurmans
