Table 1 shows the payoff to player one. The same matrix also holds for player two. Player one can gain the maximum of 5 points (T = 5) by defecting if player two cooperates. However,...
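As a rough illustration of the symmetric payoff matrix described above, here is a minimal Python sketch. Only T = 5 comes from the text; the values R = 3, P = 1, and S = 0 are the conventional Prisoner's Dilemma payoffs and are assumed here, since Table 1 itself is not reproduced. The names `PAYOFF`, `payoff`, and `payoffs` are hypothetical.

```python
# Payoffs to the player making the first move in each pair.
# T = 5 is stated in the text; R, P, S are ASSUMED conventional values.
T, R, P, S = 5, 3, 1, 0

# Keys are (own_move, other_move); "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): R,  # mutual cooperation
    ("C", "D"): S,  # cooperating against a defector
    ("D", "C"): T,  # defecting against a cooperator
    ("D", "D"): P,  # mutual defection
}


def payoff(own_move, other_move):
    """Return the payoff to the player who played own_move."""
    return PAYOFF[(own_move, other_move)]


def payoffs(move_one, move_two):
    """Return (player one payoff, player two payoff).

    The same matrix holds for both players, so player two's payoff
    is read from the matrix with the moves swapped.
    """
    return payoff(move_one, move_two), payoff(move_two, move_one)


if __name__ == "__main__":
    # Player one defects while player two cooperates.
    print(payoffs("D", "C"))
```

Because the matrix is symmetric, a single lookup table serves both players; only the order of the moves changes.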
Reward shaping is a well-known technique applied to help reinforcement-learning agents converge more quickly to near-optimal behavior. In this paper, we introduce social reward sha...
Monica Babes, Enrique Munoz de Cote, Michael L. Li...
We discuss how the interaction between spam senders and e-mail users can be modelled as a two-player adversary game. We show how the resulting model can be used to predict the str...
Ion Androutsopoulos, Evangelos F. Magirou, Dimitri...
The emergence of Grim Trigger as the dominant strategy in the Iterated Prisoner's Dilemma (IPD) on a square lattice is investigated for players with finite memory, using three differ...
There is a growing research interest in the design of competitive and adaptive Game AI for complex computer strategy games. In this paper, we present a novel approach for...