Abstract— We propose a planning algorithm that allows user-supplied domain knowledge to be exploited in the synthesis of information feedback policies for systems modeled as partially observable Markov decision processes (POMDPs). POMDP models, which are increasingly popular in the robotics literature, permit a planner to consider future uncertainty in both the application of actions and sensing of observations. With our approach, domain experts can inject specialized knowledge into the planning process by providing a set of local policies that are used as primitives by the planner. If the local policies are chosen appropriately, the planner can evaluate further into the future, even for large problems, which can lead to better overall policies at decreased computational cost. We use a structured approach to encode the provided domain knowledge into the value function approximation. We demonstrate our approach on a multi-robot fire fighting problem, in which a team of robots coopera...
Salvatore Candido, James C. Davidson, Seth Hutchin