Abstract— This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested i...
Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do...
Abstract—We analyze the performance limits of data dissemination with multi-channel, single radio sensors. We formulate the problem of minimizing the average delay of data dissem...
David Starobinski, Weiyao Xiao, Xiangping Qin, Ari...
Reinforcement Learning is a commonly used technique in robotics, however, traditional algorithms are unable to handle large amounts of data coming from the robot’s sensors, requi...
Learning to act in a multiagent environment is a difficult problem since the normal definition of an optimal policy no longer applies. The optimal policy at any moment depends on ...
Reinforcement learning algorithms that employ neural networks as function approximators have proven to be powerful tools for solving optimal control problems. However, their traini...