In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function....
Abstract. Infinite-horizon multi-agent control processes with nondeterminism and partial state knowledge have particularly interesting properties with respect to adaptive control, ...
We describe a generalized Q-learning type algorithm for reinforcement learning in competitive multi-agent games. We make the observation that in a competitive setting with adaptive...
Pieter Jan't Hoen, Sander M. Bohte, Han La Poutr&e...
In this paper, we describe how certain aspects of the biological phenomena of stigmergy can be imported into multiagent reinforcement learning (MARL), with the purpose of better e...