Sciweavers

231 search results - page 33 / 47
» Active Learning in Partially Observable Markov Decision Proc...
Sort
View
136
Voted
NECO
2007
150views more  NECO 2007»
15 years 3 months ago
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is ...
Dorit Baras, Ron Meir
158
Voted
RSS
2007
176views Robotics» more  RSS 2007»
15 years 5 months ago
Active Policy Learning for Robot Planning and Exploration under Uncertainty
Abstract— This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested i...
Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...
126
Voted
JAIR
2008
130views more  JAIR 2008»
15 years 3 months ago
Online Planning Algorithms for POMDPs
Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP i...
Stéphane Ross, Joelle Pineau, Sébast...
131
Voted
NN
2010
Springer
125views Neural Networks» more  NN 2010»
15 years 2 months ago
Parameter-exploring policy gradients
We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in paramet...
Frank Sehnke, Christian Osendorfer, Thomas Rü...
131
Voted
ATAL
2007
Springer
15 years 10 months ago
Subjective approximate solutions for decentralized POMDPs
A problem of planning for cooperative teams under uncertainty is a crucial one in multiagent systems. Decentralized partially observable Markov decision processes (DECPOMDPs) prov...
Anton Chechetka, Katia P. Sycara