Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geom...
We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state action ...
In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...