We develop an algorithm for opponent modeling in large extensive-form games of imperfect information. It works by observing the opponent’s action frequencies and building an opp...
Reinforcement techniques have been successfully used to maximise the expected cumulative reward of statistical dialogue systems. Typically, reinforcement learning is used to estim...
Markov decision processes (MDPs) are widely used for modeling decision-making problems in robotics, automated control, and economics. Traditional MDPs assume that the decision mak...
—Inspired by the biological entities’ ability to achieve reciprocity in the course of evolution, this paper considers a conjecture-based distributed learning approach that enab...
Use of tags to limit partner selection for playing has been shown to produce stable cooperation in agent populations playing the Prisoner’s Dilemma game. There is, however, a lac...