In this paper, we describe the partially observable Markov decision process (POMDP) approach to finding optimal or near-optimal control strategies for partially observable stochastic environments, given a complete model of the environment. The POMDP approach was originally developed in the operations research community and provides a formal basis for planning problems that have been of interest to the AI community. We found the existing algorithms for computing optimal control strategies to be highly computationally inefficient and have developed a new algorithm that is empirically more efficient. We sketch this algorithm and present preliminary results on several small problems that illustrate important properties of the POMDP approach.
Anthony R. Cassandra, Leslie Pack Kaelbling, Micha