The persistent modification of synaptic efficacy as a function of the relative timing of pre- and postsynaptic spikes is a phenomenon known as spiketiming-dependent plasticity (...
The success ofreinforcement learninginpractical problems depends on the ability to combine function approximation with temporal di erence methods such as value iteration. Experime...
We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...
Many interesting problems, such as power grids, network switches, and tra c ow, that are candidates for solving with reinforcement learningRL, alsohave properties that make distri...
Jeff G. Schneider, Weng-Keen Wong, Andrew W. Moore...
This paper investigates how to make improved action selection for online policy learning in robotic scenarios using reinforcement learning (RL) algorithms. Since finding control po...
Reinaldo A. C. Bianchi, Carlos H. C. Ribeiro, Anna...