Sciweavers

NIPS 2003
Policy Search by Dynamic Programming
We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to v...
J. Andrew Bagnell, Sham Kakade, Andrew Y. Ng, Jeff...