Sciweavers

97 search results - page 11 / 20
» Logarithmic Regret Algorithms for Online Convex Optimization
ICML
2003
IEEE
16 years 6 months ago
Online Convex Programming and Generalized Infinitesimal Gradient Ascent
Convex programming involves a convex set F ⊆ R^n and a convex cost function c : F → R. The goal of convex programming is to find a point in F which minimizes c. In online convex prog...
Martin Zinkevich
COLT
2010
Springer
15 years 3 months ago
Open Loop Optimistic Planning
We consider the problem of planning in a stochastic and discounted environment with a limited numerical budget. More precisely, we investigate strategies exploring the set of poss...
Sébastien Bubeck, Rémi Munos
COLT
2010
Springer
15 years 3 months ago
Best Arm Identification in Multi-Armed Bandits
We consider the problem of finding the best arm in a stochastic multi-armed bandit game. The regret of a forecaster is here defined by the gap between the mean reward of the optim...
Jean-Yves Audibert, Sébastien Bubeck, Ré...
LION
2010
Springer
15 years 9 months ago
Algorithm Selection as a Bandit Problem with Unbounded Losses
Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In r...
Matteo Gagliolo, Jürgen Schmidhuber
COLT
2004
Springer
15 years 11 months ago
Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary
We give an algorithm for the bandit version of a very general online optimization problem considered by Kalai and Vempala [1], for the case of an adaptive adversary. In this proble...
H. Brendan McMahan, Avrim Blum