This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...
In this work, we propose a variation of a direct reinforcement learning algorithm, suitable for usage with spiking neurons based on the spike response model (SRM). The SRM is a bi...
Murilo Saraiva de Queiroz, Roberto Coelho de Berr&...
In this paper, we describe methods for efficiently computing better solutions to control problems in continuous state spaces. We provide algorithms that exploit online search to boo...
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...
In recent years, matrix approximation for missing value prediction has emerged as an important problem in a variety of domains such as recommendation systems, e-commerce and onlin...