In this paper, we address the tradeoff between exploration and exploitation for agents which need to learn more about the structure of their environment in order to perform more effe...
Shlomo Argamon-Engelson, Sarit Kraus, Sigalit Sina
This paper describes an algorithm, called CQ-learning, which learns to adapt the state representation for multi-agent systems in order to coordinate with other agents. We propose ...
In this paper, we present a novel multi-agent learning paradigm called team-partitioned, opaque-transition reinforcement learning (TPOT-RL). TPOT-RL introduces the concept of usin...
Learning, planning, and representing knowledge in large state spaces at multiple levels of temporal abstraction are key, long-standing challenges for building flexible autonomous agents. ...
Temporal difference methods are theoretically grounded and empirically effective methods for addressing reinforcement learning problems. In most real-world reinforcement learning ...