Sciweavers

97 search results - page 18 / 20
» An epsilon-Optimal Grid-Based Algorithm for Partially Observ...
HICSS
2003
IEEE
14 years 23 days ago
Formalizing Multi-Agent POMDP's in the context of network routing
This paper uses partially observable Markov decision processes (POMDPs) as a basic framework for multi-agent planning. We distinguish three perspectives: the first is that of a...
Bharaneedharan Rathnasabapathy, Piotr J. Gmytrasie...
ECML
2007
Springer
14 years 1 month ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber
ICML
2006
ACM
14 years 8 months ago
An analytic solution to discrete Bayesian reinforcement learning
Reinforcement learning (RL) was originally proposed as a framework to allow agents to learn in an online fashion as they interact with their environment. Existing RL algorithms co...
Pascal Poupart, Nikos A. Vlassis, Jesse Hoey, Kevi...
NECO
2007
13 years 7 months ago
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is ...
Dorit Baras, Ron Meir
FSR
2003
Springer
14 years 21 days ago
Planning under Uncertainty for Reliable Health Care Robotics
We describe a mobile robot system designed to assist residents of a retirement facility. This system is being developed to respond to an aging population and a predicted shortage...
Nicholas Roy, Geoffrey J. Gordon, Sebastian Thrun