
Markov Games as a Framework for Multi-Agent Reinforcement Learning

In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function. In this solipsistic view, secondary agents can only be part of the environment and are therefore fixed in their behavior. The framework of Markov games allows us to widen this view to include multiple adaptive agents with interacting or competing goals. This paper considers a step in this direction in which exactly two agents with diametrically opposed goals share an environment. It describes a Q-learning-like algorithm for finding optimal policies and demonstrates its application to a simple two-player game in which the optimal policy is probabilistic.
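
The Q-learning-like algorithm the abstract refers to is commonly known as minimax-Q. The sketch below is a minimal illustration, not the paper's own pseudocode: it assumes states, own actions, and opponent actions are small integer indices into a dense Q table, and it uses scipy.optimize.linprog to solve the maximin linear program that defines each stage-game value. All function names and parameters here are illustrative.

import numpy as np
from scipy.optimize import linprog

def matrix_game_value(Q_s):
    """Game value and maximin mixed policy for one stage game.

    Solves max over mixed policies pi of min over opponent actions o of
    sum_a pi[a] * Q_s[a, o], via linear programming.
    Q_s: payoff matrix of shape (n_actions, n_opponent_actions).
    """
    n_a, n_o = Q_s.shape
    # Decision variables: pi[0..n_a-1] plus the game value v.
    # linprog minimizes, so the objective is -v.
    c = np.zeros(n_a + 1)
    c[-1] = -1.0
    # For every opponent action o: v - sum_a pi[a] * Q_s[a, o] <= 0.
    A_ub = np.hstack([-Q_s.T, np.ones((n_o, 1))])
    b_ub = np.zeros(n_o)
    # The policy is a probability distribution: sum_a pi[a] = 1.
    A_eq = np.ones((1, n_a + 1))
    A_eq[0, -1] = 0.0
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * n_a + [(None, None)]  # v is unbounded
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:n_a]

def minimax_q_update(Q, s, a, o, r, s_next, alpha=0.1, gamma=0.9):
    """One Q-learning-like step for a zero-sum Markov game:
    Q[s, a, o] moves toward r + gamma * V(s_next), where V is the
    stage-game value at the successor state."""
    v_next, _ = matrix_game_value(Q[s_next])
    Q[s, a, o] += alpha * (r + gamma * v_next - Q[s, a, o])

# Matching pennies as a one-state example: the game value is 0 and the
# maximin policy is (0.5, 0.5) -- a case where the optimal policy is
# probabilistic, echoing the abstract's point.
if __name__ == "__main__":
    Q_pennies = np.array([[1.0, -1.0], [-1.0, 1.0]])
    v, pi = matrix_game_value(Q_pennies)
    print(v, pi)  # approximately 0.0 and [0.5, 0.5]

Note that the inner linear program is what separates this from ordinary Q-learning: because the opponent adapts, the backed-up value is the stage game's maximin value rather than a simple max over the agent's own actions.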
Michael L. Littman
Type: Conference
Year: 1994
Where: ICML
Authors: Michael L. Littman