Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...
We consider the fundamental problem of monitoring (i.e. tracking) the belief state in a dynamic system, when the model is only approximately correct and when the initial belief st...
We address the problem of optimally controlling stochastic environments that are partially observable. The standard method for tackling such problems is to define and solve a Part...
A general and expressive model of sequential decision making under uncertainty is provided by the Markov decision processes (MDPs) framework. Complex applications with very large ...
We consider sensor scheduling as the optimal observability problem for partially observable Markov decision processes (POMDP). This model fits to the cases where a Markov process ...