An anytime algorithm can return an answer to its task at essentially any point during its execution; typically, the quality of that answer improves as more computation time is allowed. Here we consider the challenge of learning when to terminate such an algorithm on each of a sequence of i.i.d. tasks so as to maximize the expected average reward per unit time. We present a system for addressing this challenge that combines the Cross-Entropy method, a global optimizer, with local gradient ascent. This paper theoretically investigates how far the estimated gradient is from the true gradient, and then empirically demonstrates that the system is effective on a toy problem as well as on a real-world face detection task.
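As a point of reference, the objective can be formalized as follows (the notation below is ours, not necessarily the paper's): if a stopping policy with parameter $\theta$ earns reward $r(\theta)$ and consumes time $t(\theta)$ on each i.i.d. task, then the renewal reward theorem gives the long-run average reward per unit time as
\[
  \rho(\theta) \;=\; \frac{\mathbb{E}[r(\theta)]}{\mathbb{E}[t(\theta)]},
\]
a natural formalization of the quantity the learned stopping times aim to maximize.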