Sciweavers

231 search results - page 33 / 47
» Active Learning in Partially Observable Markov Decision Proc...
NECO 2007
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is ...
Dorit Baras, Ron Meir
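Purely as an illustration of how a scalar reinforcement signal can gate a BCM/STDP-style synaptic update (this is not the derivation in the paper; the input, reward, learning rates, and all variable names below are made-up placeholders):

```python
# Illustrative sketch only: a reward-modulated Hebbian-style weight update,
# in the spirit of relating reinforcement learning signals to synaptic
# plasticity rules such as STDP/BCM. All quantities are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n_pre, n_post = 10, 3
w = rng.normal(0.0, 0.1, size=(n_post, n_pre))    # synaptic weights
theta = np.zeros(n_post)                          # BCM-style sliding threshold
eta, tau_theta = 0.01, 0.99

for step in range(1000):
    pre = rng.random(n_pre)              # presynaptic activity (placeholder input)
    post = w @ pre                       # postsynaptic activity (linear neurons)
    reward = float(post.sum() > 1.0)     # placeholder scalar reward signal

    # BCM term: potentiate when activity exceeds the sliding threshold,
    # depress otherwise; the reward gates the whole update (RL modulation).
    bcm = post * (post - theta)
    w += eta * reward * np.outer(bcm, pre)

    # Slowly adapt the threshold toward the recent mean squared activity.
    theta = tau_theta * theta + (1 - tau_theta) * post ** 2
```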
RSS 2007 (Robotics)
Active Policy Learning for Robot Planning and Exploration under Uncertainty
This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially observed sequential decision processes. The algorithm is tested i...
Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...
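A rough sketch of the general pattern of simulation-based policy search on a finite-horizon problem: score candidate policy parameters by Monte Carlo rollouts and keep the best. The toy dynamics, reward, and random-search loop are stand-in assumptions; the paper itself selects candidates with Bayesian optimization rather than random search.

```python
# Minimal sketch of simulation-based policy search: candidate policy
# parameters are scored by rollouts of a finite-horizon simulator and the
# best candidate is kept. Environment, policy form, and search loop are
# placeholder choices, not the paper's method.
import numpy as np

rng = np.random.default_rng(1)
HORIZON, ROLLOUTS = 20, 30

def simulate(theta):
    """Average return of a linear policy u = theta . [x, 1] on a toy 1-D task."""
    total = 0.0
    for _ in range(ROLLOUTS):
        x, ret = rng.normal(), 0.0
        for _ in range(HORIZON):
            u = float(np.clip(theta @ np.array([x, 1.0]), -1.0, 1.0))
            x = 0.9 * x + u + 0.1 * rng.normal()   # noisy toy dynamics
            ret += -x ** 2                          # reward: keep the state near zero
        total += ret
    return total / ROLLOUTS

best_theta, best_value = None, -np.inf
for _ in range(50):                                 # random search over policy parameters
    theta = rng.normal(0.0, 1.0, size=2)
    value = simulate(theta)
    if value > best_value:
        best_theta, best_value = theta, value
print(best_theta, best_value)
```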
JAIR 2008
Online Planning Algorithms for POMDPs
Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP i...
Stéphane Ross, Joelle Pineau, Sébast...
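As a hedged sketch of the basic ingredient shared by online POMDP planners, the snippet below performs a depth-limited lookahead from the current belief of a toy two-state POMDP; the model, depth, and all numbers are illustrative and not taken from the survey.

```python
# Sketch of online, depth-limited lookahead planning from the current
# belief state of a small tabular POMDP. The toy model below is a
# placeholder, not one of the benchmarks evaluated in the paper.
import numpy as np

T = np.array([[[0.9, 0.1], [0.5, 0.5]],     # T[a][s][s']: transition probabilities
              [[0.5, 0.5], [0.1, 0.9]]])
Z = np.array([[0.8, 0.2], [0.2, 0.8]])      # Z[s'][o]: observation probabilities
R = np.array([[ 1.0, -1.0],                 # R[a][s]: immediate rewards
              [-1.0,  1.0]])
GAMMA = 0.95

def belief_update(b, a, o):
    """Bayes-filter update of the belief; also returns P(o | b, a)."""
    b_next = Z[:, o] * (T[a].T @ b)
    norm = b_next.sum()
    return (b_next / norm, norm) if norm > 0 else (b, 0.0)

def lookahead(b, depth):
    """Value of the best action from belief b under a depth-limited search."""
    if depth == 0:
        return 0.0
    best = -np.inf
    for a in range(T.shape[0]):
        value = float(R[a] @ b)
        for o in range(Z.shape[1]):
            b_next, p_o = belief_update(b, a, o)
            if p_o > 0:
                value += GAMMA * p_o * lookahead(b_next, depth - 1)
        best = max(best, value)
    return best

print(lookahead(np.array([0.5, 0.5]), depth=3))
```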
NN 2010, Springer (Neural Networks)
Parameter-exploring policy gradients
We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in paramet...
Frank Sehnke, Christian Osendorfer, Thomas Rü...
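A minimal sketch of the core idea of parameter-exploring policy gradients: sample whole policy parameter vectors from a Gaussian, evaluate each sample with one episode, and adapt the Gaussian's mean and standard deviation from the sampled returns. The toy objective standing in for an episodic return, the learning rates, and the baseline are assumptions, not values from the paper.

```python
# Hedged sketch of parameter-exploring policy gradients (PGPE): the
# exploration happens in parameter space, not action space. The episodic
# return below is a synthetic placeholder.
import numpy as np

rng = np.random.default_rng(2)
DIM = 5
target = np.array([1.0, -2.0, 0.5, 0.0, 3.0])    # hypothetical optimum of the toy task

def episode_return(theta):
    """Placeholder episodic return: larger when theta is close to `target`."""
    return -float(np.sum((theta - target) ** 2))

mu, sigma = np.zeros(DIM), np.ones(DIM)
alpha_mu, alpha_sigma = 0.01, 0.005
baseline = episode_return(mu)                    # moving-average baseline

for _ in range(2000):
    theta = mu + sigma * rng.normal(size=DIM)    # sample parameters, not actions
    r = episode_return(theta)
    adv = r - baseline                           # advantage w.r.t. the baseline
    diff = theta - mu
    mu += alpha_mu * adv * diff                                    # update the mean
    sigma += alpha_sigma * adv * (diff ** 2 - sigma ** 2) / sigma  # and the std dev
    sigma = np.clip(sigma, 1e-3, None)
    baseline = 0.95 * baseline + 0.05 * r
print(mu)
```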
ATAL 2007, Springer
Subjective approximate solutions for decentralized POMDPs
Planning for cooperative teams under uncertainty is a crucial problem in multiagent systems. Decentralized partially observable Markov decision processes (DEC-POMDPs) prov...
Anton Chechetka, Katia P. Sycara
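To make the DEC-POMDP ingredients concrete, here is a plain container following the standard tuple ⟨I, S, {A_i}, T, R, {Ω_i}, O⟩; the field names are generic placeholders and are not taken from the paper.

```python
# Hedged sketch of the components of a decentralized POMDP (DEC-POMDP)
# as a plain data container; purely definitional, not the paper's code.
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class DecPOMDP:
    agents: List[str]                              # I: the set of agents
    states: List[str]                              # S: global states
    actions: Dict[str, List[str]]                  # A_i: per-agent action sets
    observations: Dict[str, List[str]]             # Omega_i: per-agent observation sets
    transition: Callable[[str, Tuple[str, ...]], Dict[str, float]]           # T(s' | s, joint action)
    observe: Callable[[str, Tuple[str, ...]], Dict[Tuple[str, ...], float]]  # O(joint obs | s', joint action)
    reward: Callable[[str, Tuple[str, ...]], float]                          # R(s, joint action): shared team reward
```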