We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
We present several new algorithms for multiagent reinforcement learning. A common feature of these algorithms is a parameterized, structured representation of a policy or value fu...
Carlos Guestrin, Michail G. Lagoudakis, Ronald Par...
In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function tha...
S. R. K. Branavan, Harr Chen, Luke S. Zettlemoyer,...
Abstract— Continuous action sets are used in many reinforcement learning (RL) applications in robot control since the control input is continuous. However, discrete action sets a...
Akihiko Yamaguchi, Jun Takamatsu, Tsukasa Ogasawar...