Sciweavers

CORR
2012
Springer
210views Education» more  CORR 2012»
12 years 8 months ago
Towards minimax policies for online linear optimization with bandit feedback
We address the online linear optimization problem with bandit feedback. Our contribution is twofold. First, we provide an algorithm (based on exponential weights) with a regret of...
Sébastien Bubeck, Nicolò Cesa-Bianch...