Sciweavers

NIPS
2001

Rates of Convergence of Performance Gradient Estimates Using Function Approximation and Bias in Reinforcement Learning

We address two open theoretical questions in Policy Gradient Reinforcement Learning. The first concerns the efficacy of using function approximation to represent the state-action value function, Q. Theory is presented showing that linear function approximation representations of Q can degrade the rate of convergence of performance gradient estimates by a factor of
Gregory Z. Grudic, Lyle H. Ungar
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2001
Where NIPS