Given a set of observed economic choices, can one infer preferences and/or utility functions for the players that are consistent with the data? Questions of this type are called r...
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Coevolution has often been based on averaged outcomes, resulting in unstable evaluation. Several theoretical approaches have used archives to provide stable evaluation. However, t...
We address the problem of learning the parameters in graphical models when inference is intractable. A common strategy in this case is to replace the partition function with its B...
Finding approximate Nash equilibria in n × n bimatrix games is currently one of the main open problems in algorithmic game theory. Motivated in part by the lack of progress on wo...