We present metric?? , a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...
CBR is one of the techniques that can be applied to the task of approximating a function over high-dimensional, continuous spaces. In Reinforcement Learning systems a learning agen...
This paper introduces a multiagent reinforcement learning algorithm that converges with a given accuracy to stationary Nash equilibria in general-sum discounted stochastic games. ...
This review considers the theoretical problems facing agents that must learn and choose on the basis of reward or reinforcement that is uncertain or delayed, in implicit or proced...