Search Sciweavers | Sciweavers

27 search results - page 1 / 6

IDEAL
2004
Springer

94 views, Intelligent Agents

Policy Gradient Method for Team Markov Games

16 years 12 days ago

Download www.cis.hut.fi

The main aim of this paper is to extend the single-agent policy gradient method for multiagent domains where all agents share the same utility function. We formulate these team pro...

Ville Könönen

JMLR
2010

189 views

Adaptive Step-size Policy Gradients with Average Reward Metric

15 years 1 month ago

Download jmlr.csail.mit.edu

In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...

Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...

NIPS
2003

128 views, Information Technology

Distributed Optimization in Adaptive Networks

15 years 8 months ago

Download books.nips.cc

We develop a protocol for optimizing dynamic behavior of a network of simple electronic components, such as a sensor network, an ad hoc network of mobile devices, or a network of ...

Ciamac Cyrus Moallemi, Benjamin Van Roy

ATAL
2008
Springer

134 views, Intelligent Agents

Emerging coordination in infinite team Markov games

15 years 9 months ago

Download gaips.inesc-id.pt

In this paper we address the problem of coordination in multi-agent sequential decision problems with infinite state spaces. We adopt a game-theoretic formalism to describe the int...

Francisco S. Melo, M. Isabel Ribeiro

JMLR
2006

143 views

Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation

15 years 7 months ago

Download www.aaai.org

We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...

Rémi Munos
