Sciweavers

169 search results - page 20 / 34
» Planning with Continuous Actions in Partially Observable Env...
ATAL
2006
Springer
13 years 11 months ago
Decentralized planning under uncertainty for teams of communicating agents
Decentralized partially observable Markov decision processes (DEC-POMDPs) form a general framework for planning for groups of cooperating agents that inhabit a stochastic and part...
Matthijs T. J. Spaan, Geoffrey J. Gordon, Nikos A....
IJCAI
2003
13 years 9 months ago
Logical Filtering
Filtering denotes any method whereby an agent updates its belief state—its knowledge of the state of the world—from a sequence of actions and observations. In logical filterin...
Eyal Amir, Stuart J. Russell
ATAL
2007
Springer
14 years 1 month ago
Real-time agent characterization and prediction
Reasoning about agents that we observe in the world is challenging. Our available information is often limited to observations of the agent’s external behavior in the past and p...
H. Van Dyke Parunak, Sven Brueckner, Robert S. Mat...
ECML
2007
Springer
14 years 1 month ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
IPPS
2009
IEEE
14 years 2 months ago
Crash fault detection in celerating environments
Failure detectors are a service that provides (approximate) information about process crashes in a distributed system. The well-known “eventually perfect” failure detector, ◇P...
Srikanth Sastry, Scott M. Pike, Jennifer L. Welch