ICML
2000
IEEE

Convergence Problems of General-Sum Multiagent Reinforcement Learning

Stochastic games generalize Markov decision processes (MDPs) to multiple agents and can serve as a framework for investigating multiagent learning. Hu and Wellman (1998) recently proposed a multiagent Q-learning method for general-sum stochastic games. In addition to describing the algorithm, they provide a proof that the method converges to a Nash equilibrium of the game under specified conditions. The convergence result depends on a lemma stating that the iteration used by the method is a contraction mapping. Unfortunately, the proof is incomplete. In this paper we present a counterexample to the lemma and identify the flaw in its proof. We also introduce strengthened assumptions under which the lemma holds, and examine how this affects the classes of games to which the theoretical result can be applied.
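For readers unfamiliar with the term, the following is a minimal sketch of what "contraction mapping" means, using the familiar single-agent case rather than the multiagent iteration the abstract discusses. It numerically checks that the standard Bellman optimality backup on a small random MDP shrinks the sup-norm distance between any two Q-tables by at least the discount factor. The MDP, the discount factor of 0.9, and all function names are assumptions made for illustration; this is not the paper's counterexample and not Hu and Wellman's algorithm.

```python
import random

GAMMA = 0.9  # discount factor, assumed for illustration


def bellman_backup(q, rewards, transitions, gamma=GAMMA):
    """Apply the single-agent Bellman optimality operator to a Q-table.

    q[s][a]           : current Q-value estimate
    rewards[s][a]     : immediate reward
    transitions[s][a] : list of next-state probabilities
    """
    values = [max(row) for row in q]  # greedy state values
    return [
        [
            rewards[s][a]
            + gamma * sum(p * values[s2] for s2, p in enumerate(transitions[s][a]))
            for a in range(len(q[s]))
        ]
        for s in range(len(q))
    ]


def sup_distance(q1, q2):
    """Sup-norm (largest absolute entrywise difference) between two Q-tables."""
    return max(abs(x - y) for r1, r2 in zip(q1, q2) for x, y in zip(r1, r2))


def random_mdp(n_states, n_actions, rng):
    """Build a hypothetical random MDP: rewards in [-1, 1], normalized transitions."""
    rewards = [[rng.uniform(-1, 1) for _ in range(n_actions)] for _ in range(n_states)]
    transitions = []
    for _ in range(n_states):
        row = []
        for _ in range(n_actions):
            weights = [rng.random() + 1e-9 for _ in range(n_states)]
            total = sum(weights)
            row.append([w / total for w in weights])
        transitions.append(row)
    return rewards, transitions


def random_q(n_states, n_actions, rng):
    """Draw an arbitrary Q-table to use as a test point."""
    return [[rng.uniform(-10, 10) for _ in range(n_actions)] for _ in range(n_states)]


def contraction_holds(trials=200, seed=0):
    """Check ||T(q1) - T(q2)|| <= GAMMA * ||q1 - q2|| on random pairs of Q-tables."""
    rng = random.Random(seed)
    rewards, transitions = random_mdp(3, 2, rng)
    for _ in range(trials):
        q1, q2 = random_q(3, 2, rng), random_q(3, 2, rng)
        before = sup_distance(q1, q2)
        after = sup_distance(
            bellman_backup(q1, rewards, transitions),
            bellman_backup(q2, rewards, transitions),
        )
        if after > GAMMA * before + 1e-9:  # small tolerance for rounding
            return False
    return True
```

Calling `contraction_holds()` runs the check on 200 random pairs of Q-tables. A property of this kind is what guarantees a unique fixed point and convergence of repeated backups in the single-agent setting; the abstract's point is that the analogous lemma for the general-sum multiagent iteration requires stronger assumptions than originally stated.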
Michael H. Bowling
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2000
Where ICML
Authors Michael H. Bowling