Agents engaged in noncooperative interaction may seek to achieve a Nash equilibrium; this requires that agents be aware of others’ rewards. Misinformation about rewards leads to...
The values of a two-player zero-sum binary discounted game are characterized by a P-matrix linear complementarity problem (LCP). Simple formulas are given to describe the data of t...
We introduce the ALeRT (Action-dependent Learning Rates with Trends) algorithm that makes two modifications to the learning rate and one change to the exploration rate of traditio...
Maria Cutumisu, Duane Szafron, Michael H. Bowling,...
Modeling the behavior of imperfect agents from a small number of observations is a difficult, but important task. In the single-agent decision-theoretic setting, inverse optimal co...
Stochastic games generalize Markov decision processes (MDPs) to a multiagent setting by allowing the state transitions to depend jointly on all player actions, and having rewards de...
Michael J. Kearns, Yishay Mansour, Satinder P. Sin...