Eric A. Hansen, Rong Zhou

We develop a hierarchical approach to planning for partially observable Markov decision processes (POMDPs) in which a policy is represented as a hierarchical finite-state controller. To provide a foundation for this approach, we discuss extensions of the POMDP framework that allow us to formalize the process of abstraction by which a hierarchical controller is constructed. We describe a planning algorithm that uses a programmer-defined task hierarchy to constrain the search space of finite-state controllers, and we prove that this algorithm converges to a hierarchical finite-state controller that is ε-optimal in a limited but well-defined sense related to the concept of recursive optimality.
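To make the controller representation concrete, the following is a minimal Python sketch of a hierarchical finite-state controller, written against an assumed environment interface (env_step). The names HFSC and execute are hypothetical, and the execution model, in which an abstract action runs a child controller until it reaches a designated terminal node and then returns control to the parent, is one simple reading of the abstract, not the paper's actual algorithm or implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union

@dataclass
class HFSC:
    """One controller in the hierarchy (names are illustrative). Each node
    is labeled with either a primitive action or a child controller (an
    abstract action), and each (node, observation) pair maps to a
    successor node."""
    actions: Dict[int, Union[str, "HFSC"]]
    transitions: Dict[Tuple[int, str], int]
    start: int = 0
    terminal: int = -1  # a subtask ends when control reaches this node

def execute(ctrl: HFSC, env_step: Callable[[str], str], obs: str,
            max_steps: int = 100) -> str:
    """Execute a controller. Primitive actions are sent to the environment;
    a child controller is run recursively until it reaches its terminal
    node, after which control returns to the parent."""
    node = ctrl.start
    for _ in range(max_steps):
        act = ctrl.actions[node]
        if isinstance(act, HFSC):
            obs = execute(act, env_step, obs, max_steps)  # abstract action
        else:
            obs = env_step(act)  # primitive action: act, then observe
        node = ctrl.transitions[(node, obs)]
        if node == ctrl.terminal:
            break
    return obs
```

Under this reading, a programmer-defined task hierarchy constrains the search space in the sense that the planner searches only over controllers of this nested form, rather than over arbitrary flat finite-state controllers.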