Sciweavers

97 search results - page 11 / 20
» Logarithmic Regret Algorithms for Online Convex Optimization
ICML
2003
IEEE
14 years 8 months ago
Online Convex Programming and Generalized Infinitesimal Gradient Ascent
Convex programming involves a convex set F ⊆ R^n and a convex cost function c : F → R. The goal of convex programming is to find a point in F which minimizes c. In online convex prog...
Martin Zinkevich
COLT
2010
Springer
13 years 5 months ago
Open Loop Optimistic Planning
We consider the problem of planning in a stochastic and discounted environment with a limited numerical budget. More precisely, we investigate strategies exploring the set of poss...
Sébastien Bubeck, Rémi Munos
COLT
2010
Springer
13 years 5 months ago
Best Arm Identification in Multi-Armed Bandits
We consider the problem of finding the best arm in a stochastic multi-armed bandit game. The regret of a forecaster is here defined by the gap between the mean reward of the optim...
Jean-Yves Audibert, Sébastien Bubeck, Ré...
LION
2010
Springer
13 years 11 months ago
Algorithm Selection as a Bandit Problem with Unbounded Losses
Abstract. Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In r...
Matteo Gagliolo, Jürgen Schmidhuber
COLT
2004
Springer
14 years 1 month ago
Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary
We give an algorithm for the bandit version of a very general online optimization problem considered by Kalai and Vempala [1], for the case of an adaptive adversary. In this proble...
H. Brendan McMahan, Avrim Blum