This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially observed sequential decision processes. The algorithm is tested i...
Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
While it is well known that models can enhance control performance in terms of precision or energy efficiency, their practical application has often been limited by the complexiti...
Duy Nguyen-Tuong, Jan Peters, Matthias Seeger, Ber...
Cognitive control, the ability to produce appropriate behavior in complex situations, is a fundamental aspect of intelligence. It is increasingly evident that this co...
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...