We apply XCS with computed prediction (XCSF) to tackle multistep reinforcement learning problems involving continuous inputs. In essence we use XCSF as a method of generalized rein...
Pier Luca Lanzi, Daniele Loiacono, Stewart W. Wils...
How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...
Procedural representations of control policies have two advantages when facing the scale-up problem in learning tasks. First they are implicit, with potential for inductive genera...
Abstract. In this paper, we present the results of a pilot study of a human robot interaction experiment where the rhythm of the interaction is used as a reinforcement signal to le...
Abstract— Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...