In sequential decision making under uncertainty, as in many other modeling endeavors, researchers observe a dynamical system and collect data measuring its behavior over time. These data are often used to build models that explain relationships between the measured variables and are eventually used for planning and control. However, measurements are not always exact, systems can change over time, and detecting or correcting these problems is not always feasible. Therefore, it is important to formally describe the degree to which a model can tolerate noise while retaining near-optimal behavior. The problem of finding tolerance bounds has been the focus of many studies for Markov Decision Processes (MDPs) due to their usefulness in practical applications. In this paper, we consider Partially Observable MDPs (POMDPs), which are a more realistic extension of MDPs with a wider scope of applications. We address two types of perturbations in POMDP model paramete...
Stéphane Ross, Masoumeh T. Izadi, Mark Merc