In this paper we apply the recent notion of anytime universal intelligence tests to the evaluation of a popular reinforcement learning algorithm, Q-learning. We show that a general...
R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...
The paper presents a new reinforcement learning mechanism for spiking neural networks. The algorithm is derived for networks of stochastic integrate-and-fire neurons, but it can ...
Several multiagent reinforcement learning (MARL) algorithms have been proposed to optimize agents' decisions. Only a subset of these MARL algorithms both do not require agent...
A key component of any reinforcement learning algorithm is the underlying representation used by the agent. While reinforcement learning (RL) agents have typically relied on hand-...