The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact s...
Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun
We describe a point-based policy iteration (PBPI) algorithm for infinite-horizon POMDPs. PBPI replaces the exact policy improvement step of Hansen’s policy iteration with point...
Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, Lawre...
Recent scaling up of POMDP solvers towards realistic applications is largely due to point-based methods such as PBVI, Perseus, and HSVI, which quickly converge to an approximate so...
This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning. PBVI approximates an exact value iteration solution by selecting a small set of represen...
Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun
We present a simple randomized POMDP algorithm for planning with continuous actions in partially observable environments. Our algorithm operates on a set of reachable b...