Theoretical advantages of lenient Q-learners: an evolutionary game theoretic perspective

Download www.aamas-conference.org

This paper presents the dynamics of multiple reinforcement learning agents from an Evolutionary Game Theoretic (EGT) perspective. We provide a Replicator Dynamics model for traditional multiagent Q-learning, and we extend these differential equations to account for lenient learners: agents that forgive possible mistakes of their teammates that resulted in lower rewards. We use this extended formal model to visualize the basins of attraction of both traditional and lenient multiagent Q-learners in two benchmark coordination problems. The results indicate that lenience gives learners more accurate estimates of the utility of their actions, resulting in a higher likelihood of convergence to the globally optimal solution. In addition, our research supports the strength of EGT as a backbone for multiagent reinforcement learning.
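The intuition behind lenience can be illustrated with a small sketch. This is not the paper's Replicator Dynamics model; it assumes, purely for illustration, that a lenient learner collects `kappa` rewards for an action and keeps only the highest one, ignoring the lower rewards caused by a teammate's poor choices. The function name `lenient_utility`, the parameter `kappa`, and the payoff values are all hypothetical.

```python
from typing import Sequence


def lenient_utility(rewards: Sequence[float],
                    partner_policy: Sequence[float],
                    kappa: int) -> float:
    """Expected maximum of `kappa` independent rewards for one action.

    rewards[j] is the (hypothetical) payoff when the teammate plays action j,
    and partner_policy[j] is the probability that the teammate does so.
    kappa = 1 reduces to the ordinary expected reward (no lenience).
    """
    # Pool probability mass by distinct reward value.
    mass = {}
    for r, p in zip(rewards, partner_policy):
        mass[r] = mass.get(r, 0.0) + p

    expected = 0.0
    cdf_below = 0.0  # P(reward < value)
    for value in sorted(mass):
        cdf_at = cdf_below + mass[value]  # P(reward <= value)
        # P(max of kappa draws == value) = cdf_at^kappa - cdf_below^kappa
        expected += value * (cdf_at ** kappa - cdf_below ** kappa)
        cdf_below = cdf_at
    return expected


# Hypothetical coordination game: a risky action that pays well only when
# the teammate coordinates, and a safe action with a small fixed payoff.
risky = [10.0, -10.0]
safe = [2.0, 2.0]
uniform_partner = [0.5, 0.5]

for k in (1, 3):
    print(k,
          lenient_utility(risky, uniform_partner, k),
          lenient_utility(safe, uniform_partner, k))
```

With these assumed payoffs, a non-lenient learner (`kappa = 1`) values the risky action below the safe one, while a lenient learner (`kappa = 3`) values it above, which is the kind of shift toward the higher-payoff joint action that the abstract describes.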

Liviu Panait, Karl Tuyls

ATAL 2007 | Lenient Multiagent Q-learners | Multiagent | Reinforcement Learning

Post Info

Added	07 Jun 2010
Updated	07 Jun 2010
Type	Conference
Year	2007
Where	ATAL
Authors	Liviu Panait, Karl Tuyls
