Sciweavers

32 search results - page 4 / 7
» Optimal and Approximate Q-value Functions for Decentralized ...
Sort
View
AAAI
2006
13 years 9 months ago
Point-based Dynamic Programming for DEC-POMDPs
We introduce point-based dynamic programming (DP) for decentralized partially observable Markov decision processes (DEC-POMDPs), a new discrete DP algorithm for planning strategie...
Daniel Szer, François Charpillet
TIT
2008
90views more  TIT 2008»
13 years 7 months ago
On Optimal Quantization Rules for Some Problems in Sequential Decentralized Detection
We consider the design of systems for sequential decentralized detection, a problem that entails several interdependent choices: the choice of a stopping rule (specifying the samp...
XuanLong Nguyen, Martin J. Wainwright, Michael I. ...
AAAI
2010
13 years 9 months ago
Symbolic Dynamic Programming for First-order POMDPs
Partially-observable Markov decision processes (POMDPs) provide a powerful model for sequential decision-making problems with partially-observed state and are known to have (appro...
Scott Sanner, Kristian Kersting
NIPS
2007
13 years 9 months ago
Bayes-Adaptive POMDPs
Bayesian Reinforcement Learning has generated substantial interest recently, as it provides an elegant solution to the exploration-exploitation trade-off in reinforcement learning...
Stéphane Ross, Brahim Chaib-draa, Joelle Pi...
ICML
2000
IEEE
14 years 8 months ago
Reinforcement Learning in POMDP's via Direct Gradient Ascent
This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled ??? ?s. We introduce ??? ?, a...
Jonathan Baxter, Peter L. Bartlett