Sciweavers

29 search results - page 5 / 6
» Value-Directed Belief State Approximation for POMDPs
NIPS
2007
13 years 10 months ago
Bayes-Adaptive POMDPs
Bayesian Reinforcement Learning has generated substantial interest recently, as it provides an elegant solution to the exploration-exploitation trade-off in reinforcement learning...
Stéphane Ross, Brahim Chaib-draa, Joelle Pi...
IAT
2008
IEEE
14 years 3 months ago
Introducing Communication in Dis-POMDPs with Locality of Interaction
Networked Distributed POMDPs (ND-POMDPs) can model multiagent systems in uncertain domains and have begun to scale up the number of agents. However, prior work in ND-POMDPs has ...
Makoto Tasaki, Yuichi Yabu, Yuki Iwanari, Makoto Y...
ATAL
2010
Springer
13 years 9 months ago
Point-based policy generation for decentralized POMDPs
Memory-bounded techniques have shown great promise in solving complex multi-agent planning problems modeled as DEC-POMDPs. Many of the performance gains can be attributed to pruni...
Feng Wu, Shlomo Zilberstein, Xiaoping Chen
FOCS
2007
IEEE
14 years 2 months ago
Approximation Algorithms for Partial-Information Based Stochastic Control with Markovian Rewards
We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...
Sudipto Guha, Kamesh Munagala
AAAI
2007
13 years 11 months ago
Point-Based Policy Iteration
We describe a point-based policy iteration (PBPI) algorithm for infinite-horizon POMDPs. PBPI replaces the exact policy improvement step of Hansen’s policy iteration with point...
Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, Lawre...