Sciweavers

66 search results - page 1 / 14
» The Nonstochastic Multiarmed Bandit Problem
COLT
2008
Springer
13 years 9 months ago
Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization
We introduce an efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal O*(√T) regret. The setting is a natural general...
Jacob Abernethy, Elad Hazan, Alexander Rakhlin
SIAMCOMP
2002
13 years 7 months ago
The Nonstochastic Multiarmed Bandit Problem
Abstract. In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This class...
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freun...
JMLR
2012
11 years 10 months ago
PAC-Bayes-Bernstein Inequality for Martingales and its Application to Multiarmed Bandits
We develop a new tool for data-dependent analysis of the exploration-exploitation trade-off in learning under limited feedback. Our tool is based on two main ingredients. The fi...
Yevgeny Seldin, Nicolò Cesa-Bianchi, Peter ...
CORR
2012
Springer
12 years 3 months ago
The best of both worlds: stochastic and adversarial bandits
We present a bandit algorithm, SAO (Stochastic and Adversarial Optimal), whose regret is, essentially, optimal both for adversarial rewards and for stochastic rewards. Specifical...
Sébastien Bubeck, Aleksandrs Slivkins
ECML
2005
Springer
14 years 1 month ago
Multi-armed Bandit Algorithms and Empirical Evaluation
The multi-armed bandit problem for a gambler is to decide which arm of a K-slot machine to pull to maximize his total reward in a series of trials. Many real-world learning and opt...
Joannès Vermorel, Mehryar Mohri