Abstract In this paper we address the problem of simultaneous learning and coordination in multiagent Markov decision problems (MMDPs) with infinite state-spaces. We separate this ...
In this paper, we discuss the use of Targeted Trajectory Distribution Markov Decision Processes (TTD-MDPs)—a variant of MDPs in which the goal is to realize a specified distrib...
Sooraj Bhat, David L. Roberts, Mark J. Nelson, Cha...
We consider the setting of a device that obtains it energy from a battery and some regenerative source such as a solar cell. We consider the speed scaling problem of scheduling a c...
We present two user interfaces for the interactive control of dynamically-simulated characters. The first interface uses an ‘action palette’ and targets sports prototyping ap...
Reinforcement learning problems are commonly tackled with temporal difference methods, which use dynamic programming and statistical sampling to estimate the long-term value of ta...