In this paper, we investigate multi-agent learning (MAL) in a multi-agent resource selection problem (MARS) in which a large group of agents compete for common resources. Since agents in such a setting are self-interested, MAL in MARS domains typically focuses on convergence to a set of non-cooperative equilibria. As the prisoner's dilemma shows, however, selfish equilibria are not necessarily optimal with respect to the natural objective function of a target problem, e.g., resource utilization in the case of MARS. Conversely, centrally administered optimization of physically distributed agents is infeasible in many real-life applications such as transportation traffic problems. To explore the possibility of a middle-ground solution, we analyze two types of costs for evaluating MAL algorithms in this context. The quality loss of a selfish algorithm can be quantitatively measured by the price of anarchy, i.e., the ratio of the objective function v...
Jean Oh, Stephen F. Smith