Search Sciweavers | Sciweavers

473 search results - page 28 / 95

» Optimal policy switching algorithms for reinforcement learni...

145

click to vote

ICML
2009
IEEE

131views Machine Learning» more ICML 2009»

Monte-Carlo simulation balancing

16 years 6 months ago

Download www.cs.ualberta.ca

In this paper we introduce the first algorithms for efficiently learning a simulation policy for Monte-Carlo search. Our main idea is to optimise the balance of a simulation polic...

David Silver, Gerald Tesauro

claim paper

Read More »

159

click to vote

ATAL
2008
Springer

123views Intelligent Agents» more ATAL 2008»

Sigma point policy iteration

15 years 8 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

claim paper

Read More »

173

click to vote

ICMLA
2010

203views Machine Learning» more ICMLA 2010»

Multimodal Parameter-exploring Policy Gradients

15 years 3 months ago

Download www6.in.tum.de

Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...

Frank Sehnke, Alex Graves, Christian Osendorfer, J...

claim paper

Read More »

174

click to vote

ICRA
2010
IEEE

133views Robotics» more ICRA 2010»

Generalized model learning for Reinforcement Learning on a humanoid robot

15 years 4 months ago

Download www.cs.utexas.edu

— Reinforcement learning (RL) algorithms have long been promising methods for enabling an autonomous robot to improve its behavior on sequential decision-making tasks. The obviou...

Todd Hester, Michael Quinlan, Peter Stone

claim paper

Read More »

148

click to vote

ECAI
2008
Springer

124views Artificial Intelligence» more ECAI 2008»

Exploiting locality of interactions using a policy-gradient approach in multiagent learning

15 years 7 months ago

Download gaips.inesc-id.pt

In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...

Francisco S. Melo

claim paper

Read More »

« Prev « First page 28 / 95 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers