Abstract. In order to establish autonomous behavior for technical systems, the well known trade-off between reactive control and deliberative planning has to be considered. Within ...
In this paper, we investigate the use of hierarchical reinforcement learning (HRL) to speed up the acquisition of cooperative multi-agent tasks. We introduce a hierarchical multi-a...
Rajbala Makar, Sridhar Mahadevan, Mohammad Ghavamz...
We present the first real-world benchmark for sequentiallyoptimal team formation, working within the framework of a class of online football prediction games known as Fantasy Foo...
Tim Matthews, Sarvapali D. Ramchurn, Georgios Chal...
This paper describes Icarus, an agent architecture that embeds a hierarchical reinforcement learning algorithm within a language for specifying agent behavior. An Icarus program e...
— Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model to handle realworld sequential decision processes but require a known model to be solv...