A new training algorithm is presented for delayed reinforcement learning problems that does not assume the existence of a critic model and employs the polytope optimization algorit...
We focus on neuro-dynamic programming methods to learn state-action value functions and outline some of the inherent problems to be faced, when performing reinforcement learning in...
This paper demonstrates the applicability of reinforcement learning for first person shooter bot artificial intelligence. Reinforcement learning is a machine learning technique wh...
We propose a modular reinforcement learning architecture for non-linear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic i...
A gradient-based method for both symmetric and asymmetric multiagent reinforcement learning is introduced in this paper. Symmetric multiagent reinforcement learning addresses the ...