Temporal difference reinforcement learning algorithms are perfectly suited to autonomous agents because they learn directly from an agent’s experience based on sequential actio...
Abstract. The goal of the APOSDLE (Advanced Process-Oriented SelfDirected Learning environment) project is to support work-integrated learning of knowledge workers. We argue that w...
Stefanie N. Lindstaedt, Peter Scheir, Armin Ulbric...
In this paper, we investigate the use of hierarchical reinforcement learning (HRL) to speed up the acquisition of cooperative multi-agent tasks. We introduce a hierarchical multi-a...
Rajbala Makar, Sridhar Mahadevan, Mohammad Ghavamz...
Classically, an approach to the multiagent policy learning supposed that the agents, via interactions and/or by using preliminary knowledge about the reward functions of all playe...
Physical domains are notoriously hard to model completely and correctly, especially to capture the dynamics of the environment. Moreover, since environments change, it is even mor...