This site uses cookies to deliver our services and to ensure you get the best experience. By continuing to use this site, you consent to our use of cookies and acknowledge that you have read and understand our Privacy Policy, Cookie Policy, and Terms
This paper analyzes the complexity of on-line reinforcement learning algorithms, namely asynchronous realtime versions of Q-learning and value-iteration, applied to the problem of...
We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geom...
Successful application of reinforcement learning algorithms often involves considerable hand-crafting of the necessary non-linear features to reduce the complexity of the value fu...
In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...
Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...
Due to the unavoidable fact that a robot’s sensors will be limited in some manner, it is entirely possible that it can find itself unable to distinguish between differing state...
This paper is concerned with how multi-agent reinforcement learning algorithms can practically be applied to real-life problems. Recently, a new coordinated multi-agent exploratio...
Abstract. In this paper we compare state-of-the-art multi-agent reinforcement learning algorithms in a wide variety of games. We consider two types of algorithms: value iteration a...
H. Jaap van den Herik, Daniel Hennes, Michael Kais...
It is known that the complexity of the reinforcement learning algorithms, such as Q-learning, may be exponential in the number of environment’s states. It was shown, however, th...
Temporal difference reinforcement learning algorithms are perfectly suited to autonomous agents because they learn directly from an agent’s experience based on sequential actio...
Reinforcement learning is a popular and successful framework for many agent-related problems because only limited environmental feedback is necessary for learning. While many algo...