Motivated by a machine learning perspective, namely that game-theoretic equilibria constraints should serve as guidelines for predicting agents’ strategies, we introduce maximum causal...
We study how to learn to play a Pareto-optimal strict Nash equilibrium when there exist multiple equilibria and agents may have different preferences among the equilibria. We focu...
In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function....
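As a rough illustration of the MDP setting described above, the following sketch shows one agent interacting with an environment whose dynamics are given by a probabilistic transition function. The two-state layout, the probabilities, the rewards, and the names `TRANSITIONS`, `REWARDS`, `step`, and `run_episode` are all assumptions made for this example; none of them come from the abstract.

```python
import random

# Hypothetical two-state, two-action MDP used only for illustration.
# TRANSITIONS[state][action] is a list of (next_state, probability) pairs.
TRANSITIONS = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(0, 0.2), (1, 0.8)]},
    1: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
}
# REWARDS[state][action] is the immediate reward (also an assumption).
REWARDS = {0: {0: 0.0, 1: 0.0}, 1: {0: 1.0, 1: 2.0}}


def step(state, action, rng):
    """Sample the next state from the transition function; return (next_state, reward)."""
    next_states, probs = zip(*TRANSITIONS[state][action])
    next_state = rng.choices(next_states, weights=probs, k=1)[0]
    return next_state, REWARDS[state][action]


def run_episode(policy, steps, seed=0):
    """Let a fixed policy (a function state -> action) interact with the environment."""
    rng = random.Random(seed)
    state, total = 0, 0.0
    trajectory = [state]
    for _ in range(steps):
        action = policy(state)
        state, reward = step(state, action, rng)
        total += reward
        trajectory.append(state)
    return trajectory, total
```

Calling `run_episode(lambda s: 1, 20)` runs a policy that always picks action 1 for 20 steps and returns the visited states together with the accumulated reward.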
Reinforcement learning has been used for training game-playing agents. The value function for a complex game must be approximated with a continuous function because the number of ...
Cooperative games are those in which both agents share the same payoff structure. Value-based reinforcement-learning algorithms, such as variants of Q-learning, have been applied t...
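To make the cooperative setting above concrete, here is a minimal sketch of two independent, stateless Q-learners playing a repeated matrix game in which both receive the same payoff. The payoff matrix, the learning rate, the exploration rate, and the names `PAYOFF`, `greedy`, and `train` are assumptions chosen for illustration; this is not claimed to be the specific Q-learning variant the abstract refers to.

```python
import random

# Shared payoff (hypothetical values): both agents receive PAYOFF[a0][a1].
PAYOFF = [[1.0, 0.0], [0.0, 0.0]]


def greedy(q_values):
    """Return the action with the highest Q-value (lowest index wins ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def train(episodes=500, alpha=0.1, epsilon=0.2, seed=0):
    """Run two independent epsilon-greedy Q-learners on the shared-payoff game."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[agent][action]
    for _ in range(episodes):
        actions = []
        for agent in range(2):
            if rng.random() < epsilon:
                actions.append(rng.randrange(2))      # explore
            else:
                actions.append(greedy(q[agent]))      # exploit
        reward = PAYOFF[actions[0]][actions[1]]        # same reward for both
        for agent, action in enumerate(actions):
            # Stateless Q-learning update: move the estimate toward the reward.
            q[agent][action] += alpha * (reward - q[agent][action])
    return q
```

Each learner treats the other agent as part of the environment and updates only its own table. In this deliberately simple game only the joint action (0, 0) pays off, so both greedy policies select action 0 after training.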
Leonid Peshkin, Kee-Eung Kim, Nicolas Meuleau, Les...