Learning when to stop thinking and do something!



An anytime algorithm is capable of returning a response to the given task at essentially any time; typically the quality of the response improves as the time increases. Here, we consider the challenge of learning when we should terminate such algorithms on each of a sequence of i.i.d. tasks, to optimize the expected average reward per unit time. We provide a system for addressing this challenge, which combines the Cross-Entropy method, a global optimizer, with local gradient ascent. This paper theoretically investigates how far the estimated gradient is from the true gradient, then empirically demonstrates that this system is effective by applying it to a toy problem, as well as to a real-world face detection task.
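The idea of a global Cross-Entropy search followed by local gradient ascent can be illustrated on a made-up toy problem. The sketch below is not the authors' system. It assumes a hypothetical quality curve `1 - exp(-rate * tau)`, a fixed per-task overhead, and the simplest possible stopping policy (the same time budget `tau` for every task). It also evaluates the objective on one fixed sample of tasks, so the gradient is a plain finite-difference estimate rather than the noisy estimate the paper analyzes. All function and parameter names are illustrative.

```python
import math
import random


def average_reward_rate(tau, rates, overhead=1.0):
    """Empirical reward per unit time when every task is stopped after tau."""
    total_reward = sum(1.0 - math.exp(-r * tau) for r in rates)
    total_time = len(rates) * (tau + overhead)
    return total_reward / total_time


def cross_entropy_search(objective, rng, mean=0.0, std=2.0,
                         n_samples=50, n_elite=10, n_iters=30):
    """Cross-entropy method over log(tau) with a Gaussian sampling distribution."""
    for _ in range(n_iters):
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # Rank candidates by objective value and keep the best ones.
        samples.sort(key=lambda s: objective(math.exp(s)), reverse=True)
        elite = samples[:n_elite]
        # Refit the sampling distribution to the elite set.
        mean = sum(elite) / n_elite
        var = sum((s - mean) ** 2 for s in elite) / n_elite
        std = math.sqrt(var) + 1e-3  # small floor so the search does not collapse
    return math.exp(mean)


def gradient_ascent(objective, tau, step=0.5, eps=1e-4, n_steps=200):
    """Local refinement using a central finite-difference gradient estimate."""
    for _ in range(n_steps):
        grad = (objective(tau + eps) - objective(tau - eps)) / (2 * eps)
        tau = max(2 * eps, tau + step * grad)  # keep tau strictly positive
    return tau


rng = random.Random(0)
# Hypothetical i.i.d. task difficulties: how fast quality improves per task.
rates = [rng.uniform(0.5, 2.0) for _ in range(200)]


def objective(tau):
    return average_reward_rate(tau, rates)


tau_ce = cross_entropy_search(objective, rng)   # global search
tau_star = gradient_ascent(objective, tau_ce)   # local refinement
print(f"CE estimate: {tau_ce:.3f}, refined: {tau_star:.3f}, "
      f"reward rate: {objective(tau_star):.3f}")
```

Dividing total reward by total elapsed time is what makes this an "average reward per unit time" objective: spending longer on a task raises its reward but also delays every later task, so the best budget is finite even though quality never stops improving.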

Barnabás Póczos, Csaba Szepesvári, Nathan R. Sturtevant, Russell Greiner, Yasin Abbasi-Yadkori


Expected Average Reward | Face Detection Task | ICML 2009 | Local Gradient Ascent | Machine Learning


Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2009
Where	ICML
Authors	Barnabás Póczos, Csaba Szepesvári, Nathan R. Sturtevant, Russell Greiner, Yasin Abbasi-Yadkori


Sciweavers
