Search Sciweavers | Sciweavers

97 search results - page 16 / 20

» An epsilon-Optimal Grid-Based Algorithm for Partially Observ...

GECCO
2009
Springer

Optimization

Uncertainty handling CMA-ES for reinforcement learning

15 years 1 month ago

Download www.neuroinformatik.ruhr-uni-bochum.de

The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...

Verena Heidrich-Meisner, Christian Igel

SIGECOM
2009
ACM

ECommerce

Policy teaching through reward function learning

15 years 10 months ago

Download www.eecs.harvard.edu

Policy teaching considers a Markov Decision Process setting in which an interested party aims to influence an agent’s decisions by providing limited incentives. In this paper, ...

Haoqi Zhang, David C. Parkes, Yiling Chen

NIPS
2001

Information Technology

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

15 years 5 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...

ATAL
2008
Springer

Intelligent Agents

Exploiting locality of interaction in factored Dec-POMDPs

15 years 5 months ago

Download www.aamas-conference.org

Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute an expressive framework for multiagent planning under uncertainty, but solving them is provabl...

Frans A. Oliehoek, Matthijs T. J. Spaan, Shimon Wh...

AAAI
2006

Intelligent Agents

Incremental Least Squares Policy Iteration for POMDPs

15 years 5 months ago

Download www.aaai.org

We present a new algorithm, called incremental least squares policy iteration (ILSPI), for finding the infinite-horizon stationary policy for partially observable Markov decision ...

Hui Li, Xuejun Liao, Lawrence Carin
