Reaching Pareto-optimality in Prisoner's Dilemma using conditional joint action learning

We consider a repeated Prisoner’s Dilemma game in which two independent learning agents play against each other. We assume that each player can observe the other’s actions but not the payoff the other player receives. The multiagent learning literature has provided mechanisms that allow agents to converge to a Nash Equilibrium. In this paper we define a special class of learner, the conditional joint action learner (CJAL), which attempts to learn the conditional probability of the other agent’s action given its own action and uses that estimate to decide its next action. We prove that in self-play, if the payoff structure of the Prisoner’s Dilemma game satisfies certain conditions, these agents can, using a limited exploration technique, learn to converge to the Pareto-optimal solution that dominates the Nash Equilibrium, while maintaining individual rationality. We analytically derive the conditions for which such a phenomenon can occur and have sh...
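The learner described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name, the payoff values (a conventional Prisoner's Dilemma matrix), the uniform-random exploration phase, the 0.5 prior for unseen actions, and the greedy expected-payoff rule are all assumptions made for illustration. The sketch only shows the core idea of estimating the conditional probability of the other agent's action given one's own action, using only one's own payoffs.

```python
import random

C, D = 0, 1  # cooperate, defect

# The learner's OWN payoffs only, indexed PAYOFF[own][other].
# Illustrative values (assumed, not taken from the paper).
PAYOFF = [[3, 0],
          [5, 1]]


class ConditionalJointActionLearner:
    """Hypothetical sketch of a conditional joint action learner."""

    def __init__(self, explore_rounds=100, rng=None):
        self.explore_rounds = explore_rounds  # assumed length of the exploration phase
        self.rng = rng or random.Random()
        self.joint = [[0, 0], [0, 0]]  # joint[own][other] observation counts
        self.rounds = 0

    def cond_prob(self, other, own):
        """Estimate P(other agent plays `other` | this agent plays `own`)."""
        total = sum(self.joint[own])
        if total == 0:
            return 0.5  # assumed uninformed prior before any data
        return self.joint[own][other] / total

    def expected_payoff(self, own):
        """Expected own payoff of `own` under the learned conditional estimates."""
        return sum(self.cond_prob(o, own) * PAYOFF[own][o] for o in (C, D))

    def act(self):
        """Explore uniformly at first, then pick the best expected payoff."""
        if self.rounds < self.explore_rounds:
            return self.rng.choice((C, D))
        return max((C, D), key=self.expected_payoff)

    def observe(self, own, other):
        """Record one round: both actions are visible, the other's payoff is not."""
        self.joint[own][other] += 1
        self.rounds += 1
```

For example, if an agent built with `explore_rounds=0` has only ever seen cooperation answered with cooperation and defection answered with defection, its estimated expected payoff for cooperating (3) exceeds that for defecting (1), so `act()` returns `C`.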

Dipyaman Banerjee, Sandip Sen

AAMAS 2007 | Conditional Joint Action | Intelligent Agents | Nash Equilibrium | Prisoner’s Dilemma Game

Post Info

Added	08 Dec 2010
Updated	08 Dec 2010
Type	Journal
Year	2007
Where	AAMAS
Authors	Dipyaman Banerjee, Sandip Sen
