
ECSQARU
2001
Springer

Space-Progressive Value Iteration: An Anytime Algorithm for a Class of POMDPs

Abstract. Finding optimal policies for general partially observable Markov decision processes (POMDPs) is computationally difficult, primarily due to the need to perform dynamic-programming (DP) updates over the entire belief space. In this paper, we first study a somewhat restrictive class of POMDPs called almost-discernible POMDPs and propose an anytime algorithm called space-progressive value iteration (SPVI). SPVI does not perform DP updates over the entire belief space. Rather, it restricts DP updates to a belief subspace that grows over time. It is argued that, given sufficient time, SPVI can find near-optimal policies for almost-discernible POMDPs. We then show how SPVI can be applied to a more general class of POMDPs. Empirical results are presented to show the effectiveness of SPVI.
Nevin Lianwen Zhang, Weihong Zhang
Added 28 Jul 2010
Updated 28 Jul 2010
Type Conference
Year 2001
Where ECSQARU
Authors Nevin Lianwen Zhang, Weihong Zhang
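
The abstract's central idea, restricting DP updates to a part of the belief space that grows over time, can be illustrated with a small sketch. This is not the SPVI algorithm from the paper: it swaps in a generic point-based backup over a finite, growing set of belief points, which is only a rough stand-in for a belief subspace. The two-state model, all of its probabilities and rewards, and the names `belief_update`, `backup`, `expand`, and `anytime_solve` are assumptions made up for the illustration.

```python
# Hypothetical two-state POMDP; every number below is illustrative only.
STATES = [0, 1]
ACTIONS = [0, 1]        # 0 = listen, 1 = act
OBSERVATIONS = [0, 1]
GAMMA = 0.9

# T[a][s][s2] = P(s2 | s, a)
T = {0: [[1.0, 0.0], [0.0, 1.0]],      # listening leaves the state unchanged
     1: [[0.5, 0.5], [0.5, 0.5]]}      # acting resets the state uniformly
# O[a][s2][o] = P(o | s2, a)
O = {0: [[0.85, 0.15], [0.15, 0.85]],
     1: [[0.5, 0.5], [0.5, 0.5]]}
# R[a][s] = immediate reward
R = {0: [-1.0, -1.0],
     1: [10.0, -100.0]}


def dot(alpha, b):
    return sum(x * y for x, y in zip(alpha, b))


def value(b, alphas):
    """Value of belief b under a set of alpha vectors."""
    return max(dot(alpha, b) for alpha in alphas)


def belief_update(b, a, o):
    """Bayes update of belief b after action a and observation o.

    Returns (new_belief, P(o | b, a)); new_belief is None if P is zero.
    """
    unnorm = [O[a][s2][o] * sum(b[s] * T[a][s][s2] for s in STATES)
              for s2 in STATES]
    p = sum(unnorm)
    if p == 0.0:
        return None, 0.0
    return tuple(x / p for x in unnorm), p


def backup(b, alphas):
    """Point-based DP backup at a single belief b (generic, not SPVI)."""
    best = None
    for a in ACTIONS:
        vec = list(R[a])
        for o in OBSERVATIONS:
            # Project every alpha vector back through (a, o), keep the best at b.
            cands = [[sum(T[a][s][s2] * O[a][s2][o] * alpha[s2] for s2 in STATES)
                      for s in STATES] for alpha in alphas]
            g = max(cands, key=lambda v: dot(v, b))
            vec = [vec[s] + GAMMA * g[s] for s in STATES]
        if best is None or dot(vec, b) > dot(best, b):
            best = vec
    return tuple(best)


def expand(beliefs):
    """Grow the belief set by adding all one-step successor beliefs."""
    grown = set(beliefs)
    for b in beliefs:
        for a in ACTIONS:
            for o in OBSERVATIONS:
                nb, p = belief_update(b, a, o)
                if p > 0.0:
                    grown.add(tuple(round(x, 6) for x in nb))
    return grown


def anytime_solve(rounds, sweeps=20):
    """Alternate backups on the current belief set with growing that set."""
    beliefs = {(0.5, 0.5)}
    r_min = min(min(R[a]) for a in ACTIONS)
    alphas = [tuple(r_min / (1.0 - GAMMA) for _ in STATES)]  # crude lower bound
    for _ in range(rounds):
        for _ in range(sweeps):
            alphas = list({backup(b, alphas) for b in beliefs})
        beliefs = expand(beliefs)
    return beliefs, alphas
```

Calling `anytime_solve(rounds)` with a larger `rounds` spends more time and covers more beliefs, which is the anytime flavour the abstract describes. The policy at a belief can then be read off by comparing actions with `value` and `belief_update`.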