Search Sciweavers | Sciweavers

» Active Learning in Partially Observable Markov Decision Proc...

ICMLA
2009

181 views, Machine Learning

Sensitivity Analysis of POMDP Value Functions

13 years 6 months ago

In sequential decision making under uncertainty, as in many other modeling endeavors, researchers observe a dynamical system and collect data measuring its behavior over time. The...

Stéphane Ross, Masoumeh T. Izadi, Mark Merc...

CORR
2010
Springer

146 views, Education

Adaptive Submodularity: A New Approach to Active Learning and Stochastic Optimization

13 years 8 months ago

Solving stochastic optimization problems under partial observability, where one needs to adaptively make decisions with uncertain outcomes, is a fundamental but notoriously diffic...

Daniel Golovin, Andreas Krause

CORR
2010
Springer

105 views, Education

Optimism in Reinforcement Learning Based on Kullback-Leibler Divergence

13 years 7 months ago

We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...

Sarah Filippi, Olivier Cappé, Aurelien Gari...

AAAI
2004

103 views, Intelligent Agents

Stochastic Local Search for POMDP Controllers

13 years 10 months ago

The search for finite-state controllers for partially observable Markov decision processes (POMDPs) is often based on approaches like gradient ascent, attractive because of their ...

Darius Braziunas, Craig Boutilier

NIPS
2001

144 views, Information Technology

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

13 years 10 months ago

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
