The execution order of a block of computer instructions on a pipelined machine can make a difference in running time by a factor of two or more. Compilers use heuristic schedulers...
In this work, we propose a variation of a direct reinforcement learning algorithm, suitable for usage with spiking neurons based on the spike response model (SRM). The SRM is a bi...
Murilo Saraiva de Queiroz, Roberto Coelho de Berr&...
Abstract— This paper describes experiments using reinforcement learning techniques to compute pattern urgencies used during simulations performed in a Monte-Carlo Go architecture...
In many practical reinforcement learning problems, the state space is too large to permit an exact representation of the value function, much less the time required to compute it. ...
Closed-loop control relies on sensory feedback that is usually assumed to be free. But if sensing incurs a cost, it may be coste ective to take sequences of actions in open-loop m...
Eric A. Hansen, Andrew G. Barto, Shlomo Zilberstei...