Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
A simultaneous perturbation stochastic approximation (SPSA) method has been developed in this paper, using the operators of perturbation with the Lipschitz density function. This ...
—In this paper, we present two methods for accurate gradient estimation from scalar field data sampled on regular lattices. The first method is based on the multidimensional Tayl...
This paper introduces a new method to maximize kurtosisbased contrast functions. Such contrast functions appear in the problem of blind source separation of convolutively mixed so...