There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...
We study the problem of projecting a distribution onto (or finding a maximum likelihood distribution among) Markov networks of bounded tree-width. By casting it as the combinatori...
We present a new method for estimating the expected return of a POMDP from experience. The estimator does not assume any knowledge of the POMDP, can estimate the returns for finit...
We propose a new approach to value-directed belief state approximationfor POMDPs. The valuedirected model allows one to choose approximation methods for belief state monitoringtha...
There is increasing interest within the research community in the design and use of recursive probability models. There remains concern about computational complexity costs and th...
This paper presents a new deterministic approximation technique in Bayesian networks. This method, "Expectation Propagation," unifies two previous techniques: assumed-de...