cumulative regret | Sciweavers

CORR
2008
Springer

14 years 11 months ago

We consider bandit problems involving a large (possibly infinite) collection of arms, in which the expected reward of each arm is a linear function of an r-dimensional random vect...

Paat Rusmevichientong, John N. Tsitsiklis

COLT
2007
Springer

Regret to the Best vs. Regret to the Average

15 years 5 months ago

Download www.math.tau.ac.il

Abstract. We study online regret minimization algorithms in a bicriteria setting, examining not only the standard notion of regret to the best expert, but also the regret to the av...

Eyal Even-Dar, Michael J. Kearns, Yishay Mansour, ...

ALT
2009
Springer

Pure Exploration in Multi-armed Bandits Problems

15 years 8 months ago

Download sequel.futurs.inria.fr

Abstract. We consider the framework of stochastic multi-armed bandit problems and study the possibilities and limitations of strategies that explore sequentially the arms. The stra...

Sébastien Bubeck, Rémi Munos, Gilles...
