Sciweavers

27 search results - page 1 / 6
IDEAL
2004
Springer
15 years 7 months ago
Policy Gradient Method for Team Markov Games
The main aim of this paper is to extend the single-agent policy gradient method to multiagent domains where all agents share the same utility function. We formulate these team pro...
Ville Könönen
JMLR
2010
14 years 9 months ago
Adaptive Step-size Policy Gradients with Average Reward Metric
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...
NIPS
2003
15 years 3 months ago
Distributed Optimization in Adaptive Networks
We develop a protocol for optimizing dynamic behavior of a network of simple electronic components, such as a sensor network, an ad hoc network of mobile devices, or a network of ...
Ciamac Cyrus Moallemi, Benjamin Van Roy
ATAL
2008
Springer
15 years 4 months ago
Emerging coordination in infinite team Markov games
In this paper we address the problem of coordination in multi-agent sequential decision problems with infinite state spaces. We adopt a game-theoretic formalism to describe the int...
Francisco S. Melo, M. Isabel Ribeiro
JMLR
2006
15 years 2 months ago
Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...
Rémi Munos