Sciweavers

UAI
2000

Learning to Cooperate via Policy Search

14 years 26 days ago
Learning to Cooperate via Policy Search
Cooperative games are those in which both agents share the same payoff structure. Valuebased reinforcement-learning algorithms, such as variants of Q-learning, have been applied to learning cooperative games, but they only apply when the game state is completely observable to both agents. Policy search methods are a reasonable alternative to value-based methods for partially observable environments. In this paper, we provide a gradient-based distributed policysearch method for cooperative games and compare the notion of local optimum to that of Nash equilibrium. We demonstrate the effectiveness of this method experimentally in a small, partially observable simulated soccer domain.
Leonid Peshkin, Kee-Eung Kim, Nicolas Meuleau, Les
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2000
Where UAI
Authors Leonid Peshkin, Kee-Eung Kim, Nicolas Meuleau, Leslie Pack Kaelbling
Comments (0)