We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
—Large-scale agent-based systems are required to self-optimize towards multiple, potentially conflicting, policies of varying spatial and temporal scope. As a result, not all ag...
In Multi-Agent learning, agents must learn to select actions that maximize their utility given the action choices of the other agents. Cooperative Coevolution offers a way to evol...
— The goal of this research is to enable mobile robots to navigate through crowded environments such as indoor shopping malls, airports, or downtown side walks. The key research ...
Peter Henry, Christian Vollmer, Brian Ferris, Diet...
Recommender systems are widely used to cope with the problem of information overload and, consequently, many recommendation methods have been developed. However, no one technique i...