Sciweavers

» Active Learning in Partially Observable Markov Decision Processes
ICMLA 2009
Sensitivity Analysis of POMDP Value Functions
In sequential decision making under uncertainty, as in many other modeling endeavors, researchers observe a dynamical system and collect data measuring its behavior over time. The...
Stéphane Ross, Masoumeh T. Izadi, Mark Merc...
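
Below is a minimal sketch, not the paper's method, of one simple notion of sensitivity for a POMDP value function: since V(b) = max_alpha <alpha, b> is piecewise linear and convex, the maximizing alpha-vector is a subgradient of V at the belief b, so it gives a first-order estimate of how the value responds to a belief perturbation. The alpha-vectors and beliefs are toy numbers.

    import numpy as np

    alphas = np.array([  # each row: one alpha-vector over 3 hidden states
        [1.0, 0.0, 0.5],
        [0.2, 0.9, 0.1],
        [0.4, 0.4, 0.8],
    ])

    def value_and_sensitivity(b):
        """Return V(b) and a subgradient dV/db (the maximizing alpha-vector)."""
        scores = alphas @ b
        best = np.argmax(scores)
        return scores[best], alphas[best]

    b = np.array([0.6, 0.3, 0.1])        # current belief
    v, grad = value_and_sensitivity(b)
    db = np.array([-0.05, 0.05, 0.0])    # small shift of probability mass
    print(v, grad, v + grad @ db)        # first-order estimate of V(b + db)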
CORR 2010
Adaptive Submodularity: A New Approach to Active Learning and Stochastic Optimization
Solving stochastic optimization problems under partial observability, where one needs to adaptively make decisions with uncertain outcomes, is a fundamental but notoriously diffic...
Daniel Golovin, Andreas Krause
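
A minimal sketch of the greedy adaptive policy that adaptive submodularity is used to analyze: repeatedly run the test whose expected marginal gain, conditioned on the observations seen so far, is largest. The toy problem (identifying a hidden bit-string by single-bit tests, with gain measured as expected shrinkage of the version space) and all names here are illustrative assumptions, not taken from the paper.

    import itertools

    hypotheses = list(itertools.product([0, 1], repeat=3))  # 8 candidate truths
    tests = [0, 1, 2]                                       # test i reveals bit i
    truth = (1, 0, 1)                                       # hidden realization

    def expected_gain(test, consistent):
        # expected number of hypotheses eliminated by running `test`,
        # averaging over outcomes under a uniform prior on `consistent`
        n = len(consistent)
        gain = 0.0
        for outcome in (0, 1):
            survivors = [h for h in consistent if h[test] == outcome]
            gain += (len(survivors) / n) * (n - len(survivors))
        return gain

    consistent = hypotheses[:]
    remaining = set(tests)
    while len(consistent) > 1 and remaining:
        best = max(remaining, key=lambda t: expected_gain(t, consistent))
        remaining.discard(best)
        outcome = truth[best]                               # observe the result
        consistent = [h for h in consistent if h[best] == outcome]
        print(f"ran test {best}, observed {outcome}, {len(consistent)} left")

Version-space reduction is one of the standard examples of an adaptive submodular objective, which is what gives this greedy loop its near-optimality guarantee.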
CORR 2010
Optimism in Reinforcement Learning Based on Kullback-Leibler Divergence
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...
Sarah Filippi, Olivier Cappé, Aurélien Garivier
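
A minimal sketch of KL-based optimism for a single Bernoulli parameter, in the spirit of KL-UCB-style indices: the optimistic estimate is the largest q whose KL divergence from the empirical mean p_hat stays within a confidence radius, found by bisection (valid because KL(p_hat, q) is increasing in q for q >= p_hat). The radius `delta` is an assumed input, not the paper's exact confidence schedule.

    import math

    def bernoulli_kl(p, q):
        eps = 1e-12
        p = min(max(p, eps), 1 - eps)
        q = min(max(q, eps), 1 - eps)
        return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

    def kl_upper_bound(p_hat, delta, iters=50):
        """Largest q >= p_hat with KL(p_hat, q) <= delta, by bisection."""
        lo, hi = p_hat, 1.0
        for _ in range(iters):
            mid = (lo + hi) / 2
            if bernoulli_kl(p_hat, mid) <= delta:
                lo = mid
            else:
                hi = mid
        return lo

    print(kl_upper_bound(0.3, delta=0.05))  # optimistic value above 0.3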
AAAI 2004
Stochastic Local Search for POMDP Controllers
The search for finite-state controllers for partially observable Markov decision processes (POMDPs) is often based on approaches like gradient ascent, attractive because of their ...
Darius Braziunas, Craig Boutilier
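
A minimal sketch of stochastic local search over finite-state controllers: a controller is a node-to-action map plus a (node, observation)-to-node map, and hill-climbing mutates one entry at a time, keeping a change when the Monte Carlo estimate of return improves. The two-state toy POMDP and all parameters are illustrative assumptions, not the paper's algorithm or benchmarks.

    import random

    N_NODES, HORIZON, ROLLOUTS = 2, 20, 200

    def step(state, action):
        reward = 1.0 if action == state else 0.0                # guess the state
        state = state if random.random() < 0.9 else 1 - state   # sticky dynamics
        obs = state if random.random() < 0.8 else 1 - state     # noisy observation
        return state, obs, reward

    def evaluate(act, trans):
        """Average return of the controller over Monte Carlo rollouts."""
        total = 0.0
        for _ in range(ROLLOUTS):
            state, node = random.randint(0, 1), 0
            for _ in range(HORIZON):
                state, obs, r = step(state, act[node])
                total += r
                node = trans[(node, obs)]
        return total / ROLLOUTS

    def mutate(act, trans):
        """Random neighbor: flip one action entry or one transition entry."""
        act2, trans2 = dict(act), dict(trans)
        if random.random() < 0.5:
            act2[random.randrange(N_NODES)] = random.randint(0, 1)
        else:
            trans2[random.choice(list(trans2))] = random.randrange(N_NODES)
        return act2, trans2

    act = {n: random.randint(0, 1) for n in range(N_NODES)}
    trans = {(n, o): random.randrange(N_NODES) for n in range(N_NODES) for o in (0, 1)}
    best = evaluate(act, trans)
    for _ in range(300):                        # greedy-accept local search
        cand_act, cand_trans = mutate(act, trans)
        score = evaluate(cand_act, cand_trans)
        if score > best:
            act, trans, best = cand_act, cand_trans, score
    print("estimated controller value:", best)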
NIPS 2001
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxter
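
A minimal sketch of the baseline trick for REINFORCE-style gradient estimates, on a two-armed bandit with a sigmoid (two-arm softmax) policy: the single-sample estimate g = (r - b) * d/dtheta log pi(a) stays unbiased for any constant baseline b, since E[d/dtheta log pi(a)] = 0, but a well-chosen b can shrink its variance. All numbers are toy assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    theta = 0.3                              # preference parameter for arm 1
    means = np.array([0.2, 1.0])             # true mean rewards of the arms

    def grad_samples(baseline, n=20000):
        p1 = 1.0 / (1.0 + np.exp(-theta))    # probability of pulling arm 1
        actions = rng.random(n) < p1
        rewards = rng.normal(np.where(actions, means[1], means[0]), 1.0)
        # d/dtheta log pi(a): (1 - p1) for arm 1, -p1 for arm 0
        score = np.where(actions, 1.0 - p1, -p1)
        return (rewards - baseline) * score

    for b in (0.0, means.mean()):
        g = grad_samples(b)
        print(f"baseline={b:.2f}  mean={g.mean():+.4f}  variance={g.var():.4f}")

On this toy problem the two mean estimates agree up to sampling noise, while the baselined estimator shows a lower empirical variance.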