Learning algorithms often obtain relatively low average payoffs in repeated general-sum games between other learning agents due to a focus on myopic best-response and one-shot Nas...
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...
In this paper, we propose a model named Logical Markov Decision Processes with Negation for Relational Reinforcement Learning for applying Reinforcement Learning algorithms on the ...
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
XCS is a learning classifier system that combines a reinforcement learning scheme with evolutionary algorithms to evolve rule sets on-line by means of the interaction with an envi...
Sergio Morales-Ortigosa, Albert Orriols-Puig, Este...