Q-value functions for decentralized POMDPs



Planning in single-agent models like MDPs and POMDPs can be carried out by resorting to Q-value functions: a (near-)optimal Q-value function is computed in a recursive manner by dynamic programming, and then a policy is extracted from this value function. In this paper we study whether similar Q-value functions can be defined in decentralized POMDP models (Dec-POMDPs), what the cost of computing such value functions is, and how policies can be extracted from such value functions. Using the framework of Bayesian games, we argue that searching for the optimal Q-value function may be as costly as exhaustive policy search. Then we analyze various approximate Q-value functions that allow efficient computation. Finally, we describe a family of algorithms for extracting policies from such Q-value functions.

Categories and Subject Descriptors: I.2.11 [Artificial Intelligence]: Distributed Artificial Intelligence—Multiagent systems

General Terms: Algorithms, Performance, Experimentation, Th...
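The single-agent procedure mentioned in the abstract (compute a Q-value function recursively by dynamic programming, then extract a policy from it) can be illustrated with a minimal sketch. It covers only the fully observable, finite-horizon MDP case, not the Dec-POMDP methods of the paper. The toy model and the names `T`, `R`, `q_value_iteration` and `extract_policy` are hypothetical and chosen for illustration only.

```python
# Hypothetical toy MDP with 2 states and 2 actions (illustration only).
# T[s][a] is a list of (next_state, probability); R[s][a] is the immediate reward.
T = {
    0: {0: [(0, 1.0)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 1.0)]},
}
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 0.0, 1: 2.0},
}


def q_value_iteration(T, R, horizon):
    """Finite-horizon backward induction; returns one Q-table per stage."""
    states = list(T)
    q_per_stage = []
    v_next = {s: 0.0 for s in states}  # value after the final stage is zero
    for _ in range(horizon):
        # Q(s, a) = R(s, a) + sum over s' of P(s' | s, a) * V_next(s')
        q = {
            s: {
                a: R[s][a] + sum(p * v_next[s2] for s2, p in T[s][a])
                for a in T[s]
            }
            for s in states
        }
        q_per_stage.append(q)
        v_next = {s: max(q[s].values()) for s in states}
    q_per_stage.reverse()  # index 0 now corresponds to the first stage
    return q_per_stage


def extract_policy(q_per_stage):
    """Greedy policy: at every stage and state, pick an action maximizing Q."""
    return [{s: max(qs, key=qs.get) for s, qs in q.items()} for q in q_per_stage]


q_tables = q_value_iteration(T, R, horizon=2)
policy = extract_policy(q_tables)
print(policy)
```

In this toy model the greedy policy picks action 1 in both states at both stages. In the decentralized setting studied in the paper, each agent acts on its own observations only, so this simple per-state maximization does not carry over directly.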

Frans A. Oliehoek, Nikos A. Vlassis


ATAL 2007 | Optimal Q-value Function | Q-value Functions | Value Functions |


» Taming Decentralized POMDPs: Towards Efficient Policy Computation for Multiagent Settings

» Exploiting locality of interaction in factored Dec-POMDPs

» Multiagent Planning Under Uncertainty with Stochastic Communication Delays

» Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs

Post Info

Added	07 Jun 2010
Updated	07 Jun 2010
Type	Conference
Year	2007
Where	ATAL
Authors	Frans A. Oliehoek, Nikos A. Vlassis

