We present BL-WoLF, a framework for learnability in repeated zero-sum games where the cost of learning is measured by the losses the learning agent accrues (rather than the number...
Quite a bit is known about minimizing different kinds of regret in experts problems, and how these regret types relate to types of equilibria in the multiagent setting of repeated...
How do humans infer probable information from limited observed data? How are they able to build on little knowledge about the context at hand? Is the human memory repeatedly const...
R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...
We study how to learn to play a Pareto-optimal strict Nash equilibrium when there exist multiple equilibria and agents may have different preferences among the equilibria. We focu...