We propose a new approach to value-directed belief state approximationfor POMDPs. The valuedirected model allows one to choose approximation methods for belief state monitoringtha...
We present a new method for estimating the expected return of a POMDP from experience. The estimator does not assume any knowledge of the POMDP, can estimate the returns for finit...
There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...
The sensitivities revealed by a sensitivity analysis of a probabilistic network typically depend on the entered evidence. For a real-life network therefore, the analysis is perfor...
Exponential models of distributions are widely used in machine learning for classification and modelling. It is well known that they can be interpreted as maximum entropy models u...