The covariance matrix adaptation evolution strategy (CMAES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Policy teaching considers a Markov Decision Process setting in which an interested party aims to influence an agent’s decisions by providing limited incentives. In this paper, ...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute an expressive framework for multiagent planning under uncertainty, but solving them is provabl...
Frans A. Oliehoek, Matthijs T. J. Spaan, Shimon Wh...
We present a new algorithm, called incremental least squares policy iteration (ILSPI), for finding the infinite-horizon stationary policy for partially observable Markov decision ...