In some environments, a learning agent must learn to balance competing objectives. For example, a Q-learner agent may need to learn which choices expose the agent to risk and whic...
: The heterogeneity of device capabilities, network conditions and user contexts that is associated with mobile computing has emphasized the need for more advanced forms of adaptat...
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Simi...
Online mechanism design (OMD) addresses the problem of sequential decision making in a stochastic environment with multiple self-interested agents. The goal in OMD is to make valu...
David C. Parkes, Satinder P. Singh, Dimah Yanovsky
Wafers in a 300-mm semiconductor fabrication facility are transported throughout the factory in carriers called front opening unified pods (FOUPs). Two standard capacities of FOUP...