Search Sciweavers | Sciweavers

29 search results - page 5 / 6

» Value-Directed Belief State Approximation for POMDPs

NIPS
2007

207views Information Technology» more NIPS 2007»

Bayes-Adaptive POMDPs

15 years 7 months ago

Download books.nips.cc

Bayesian Reinforcement Learning has generated substantial interest recently, as it provides an elegant solution to the exploration-exploitation trade-off in reinforcement learning...

Stéphane Ross, Brahim Chaib-draa, Joelle Pi...

IAT
2008
IEEE

151views Intelligent Agents» more IAT 2008»

Introducing Communication in Dis-POMDPs with Locality of Interaction

16 years 5 days ago

Download teamcore.usc.edu

Networked Distributed POMDPs (ND-POMDPs) can model multiagent systems in uncertain domains and have begun to scale up the number of agents. However, prior work in ND-POMDPs has ...

Makoto Tasaki, Yuichi Yabu, Yuki Iwanari, Makoto Y...

ATAL
2010
Springer

164views Intelligent Agents» more ATAL 2010»

Point-based policy generation for decentralized POMDPs

15 years 6 months ago

Download anytime.cs.umass.edu

Memory-bounded techniques have shown great promise in solving complex multi-agent planning problems modeled as DEC-POMDPs. Much of the performance gains can be attributed to pruni...

Feng Wu, Shlomo Zilberstein, Xiaoping Chen

FOCS
2007
IEEE

157views Theoretical Computer Science» more FOCS 2007»

Approximation Algorithms for Partial-Information Based Stochastic Control with Markovian Rewards

16 years 1 days ago

Download www.cis.upenn.edu

We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...

Sudipto Guha, Kamesh Munagala

AAAI
2007

126views Intelligent Agents» more AAAI 2007»

Point-Based Policy Iteration

15 years 8 months ago

Download www.cs.duke.edu

We describe a point-based policy iteration (PBPI) algorithm for infinite-horizon POMDPs. PBPI replaces the exact policy improvement step of Hansen’s policy iteration with point...

Shihao Ji, Ronald Parr, Hui Li, Xuejun Liao, Lawre...
