Search Sciweavers | Sciweavers

51 search results - page 3 / 11

» Exponentiated Gradient Methods for Reinforcement Learning

202

click to vote

ICMLA
2010

203views Machine Learning» more ICMLA 2010»

Multimodal Parameter-exploring Policy Gradients

15 years 4 months ago

Download www6.in.tum.de

Abstract-- Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...

Frank Sehnke, Alex Graves, Christian Osendorfer, J...

claim paper

Read More »

156

click to vote

ICML
2003
IEEE

151views Machine Learning» more ICML 2003»

Hierarchical Policy Gradient Algorithms

16 years 7 months ago

Download www.hpl.hp.com

Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...

Mohammad Ghavamzadeh, Sridhar Mahadevan

claim paper

Read More »

179

Voted

PKDD
2009
Springer

181views Data Mining» more PKDD 2009»

Active Learning for Reward Estimation in Inverse Reinforcement Learning

16 years 1 months ago

Download users.isr.ist.utl.pt

Abstract. Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, w...

Manuel Lopes, Francisco S. Melo, Luis Montesano

claim paper

Read More »

218

click to vote

JMLR
2010

189views more JMLR 2010»

Adaptive Step-size Policy Gradients with Average Reward Metric

15 years 1 months ago

Download jmlr.csail.mit.edu

In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...

Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...

claim paper

Read More »

208

click to vote

ECML
2005
Springer

193views Machine Learning» more ECML 2005»

Natural Actor-Critic

16 years 15 days ago

Download www-clmc.usc.edu

This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari...

Jan Peters, Sethu Vijayakumar, Stefan Schaal

claim paper

Read More »

« Prev « First page 3 / 11 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers