Sciweavers

83 search results (page 14 of 17) » Planning and Acting in Partially Observable Stochastic Domains

NIPS 2008
Multi-Agent Filtering with Infinitely Nested Beliefs
In partially observable worlds with many agents, nested beliefs are formed when agents simultaneously reason about the unknown state of the world and the beliefs of the other agents...
Luke S. Zettlemoyer, Brian Milch, Leslie Pack Kaelbling

ECML 2007 (Springer)
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov Decision Processes...
Daan Wierstra, Jürgen Schmidhuber

HICSS 2003 (IEEE)
Formalizing Multi-Agent POMDP's in the context of network routing
This paper uses partially observable Markov decision processes (POMDPs) as a basic framework for multi-agent planning. We distinguish three perspectives: the first is that of a...
Bharaneedharan Rathnasabapathy, Piotr J. Gmytrasiewicz

AAAI 2010
Reinforcement Learning via AIXI Approximation
This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. This approach is based on a direct approximation of AIXI, a Bayesian...
Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver

ECML 2005 (Springer)
Model-Based Online Learning of POMDPs
Learning to act in an unknown partially observable domain is a difficult variant of the reinforcement learning paradigm. Research in the area has focused on model-free methods...
Guy Shani, Ronen I. Brafman, Solomon Eyal Shimony
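
All of the papers listed here build on the POMDP framework, in which an agent maintains a belief (a probability distribution over the hidden state) and updates it after each action and observation. Purely as an illustration, and not code from any of the papers above, the following is a minimal sketch of that discrete belief update in Python; the transition array T, observation array O, and the two-state example values are assumptions chosen for illustration.

# Minimal POMDP belief-update sketch (illustrative; not from any paper above).
# Discrete model: T[a, s, s'] = P(s' | s, a), O[a, s', o] = P(o | s', a).
import numpy as np

def belief_update(b, a, o, T, O):
    # Bayes filter step: b'(s') is proportional to O(o | s', a) * sum_s T(s' | s, a) * b(s)
    predicted = T[a].T @ b                  # predict the next-state distribution
    unnormalized = O[a, :, o] * predicted   # weight by the observation likelihood
    return unnormalized / unnormalized.sum()

# Hypothetical two-state example with a single "listen" action.
T = np.array([[[1.0, 0.0],
               [0.0, 1.0]]])                # listening leaves the hidden state unchanged
O = np.array([[[0.85, 0.15],
               [0.15, 0.85]]])              # observation matches the true state 85% of the time

b = np.array([0.5, 0.5])                    # start from a uniform belief
for obs in (0, 0, 1):                       # a short sequence of observations
    b = belief_update(b, a=0, o=obs, T=T, O=O)
    print(b)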