Sciweavers

UAI
2001
14 years 4 days ago
Vector-space Analysis of Belief-state Approximation for POMDPs
We propose a new approach to value-directed belief state approximation for POMDPs. The value-directed model allows one to choose approximation methods for belief state monitoring tha...
Pascal Poupart, Craig Boutilier
UAI
2001
14 years 4 days ago
Policy Improvement for POMDPs Using Normalized Importance Sampling
We present a new method for estimating the expected return of a POMDP from experience. The estimator does not assume any knowledge of the POMDP, can estimate the returns for finit...
Christian R. Shelton
UAI
2001
14 years 4 days ago
The Optimal Reward Baseline for Gradient-Based Reinforcement Learning
There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...
Lex Weaver, Nigel Tao
UAI
2004
14 years 4 days ago
Evidence-invariant Sensitivity Bounds
The sensitivities revealed by a sensitivity analysis of a probabilistic network typically depend on the entered evidence. For a real-life network therefore, the analysis is perfor...
Silja Renooij, Linda C. van der Gaag
UAI
2004
14 years 4 days ago
The Minimum Information Principle for Discriminative Learning
Exponential models of distributions are widely used in machine learning for classification and modelling. It is well known that they can be interpreted as maximum entropy models u...
Amir Globerson, Naftali Tishby