Search Sciweavers | Sciweavers

In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...

Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...

CORR
2006
Springer

A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD

15 years 6 months ago

Download hal.inria.fr

This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...

Manuel Loth, Philippe Preux

IDEAL
2004
Springer

Policy Gradient Method for Team Markov Games

15 years 11 months ago

Download www.cis.hut.fi

The main aim of this paper is to extend the single-agent policy gradient method for multiagent domains where all agents share the same utility function. We formulate these team pro...

Ville Könönen

AAAI
2011

Differential Eligibility Vectors for Advantage Updating and Gradient Methods

14 years 6 months ago

Download gaips.inesc-id.pt

In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...

Francisco S. Melo
