Sciweavers

91 search results for "Parameter-exploring policy gradients" (page 2 / 19)
IGPL
2010
13 years 6 months ago
Recurrent policy gradients
Daan Wierstra, Alexander Förster, Jan Peters,...
JMLR
2010
13 years 2 months ago
Adaptive Step-size Policy Gradients with Average Reward Metric
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...
CORR
2006
Springer
13 years 7 months ago
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...
Manuel Loth, Philippe Preux
IDEAL
2004
Springer
14 years 1 month ago
Policy Gradient Method for Team Markov Games
The main aim of this paper is to extend the single-agent policy gradient method for multiagent domains where all agents share the same utility function. We formulate these team pro...
Ville Könönen
AAAI
2011
12 years 7 months ago
Differential Eligibility Vectors for Advantage Updating and Gradient Methods
In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...
Francisco S. Melo