We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state action ...
Due to the non-stationary environment, learning in multi-agent systems is a challenging problem. This paper first introduces a new gradient-based learning algorithm, augmenting th...