Sciweavers

ESANN
2006
Reducing policy degradation in neuro-dynamic programming
We focus on neuro-dynamic programming methods to learn state-action value functions and outline some of the inherent problems to be faced when performing reinforcement learning in...
Thomas Gabel, Martin Riedmiller