Sciweavers

22 search results - page 4 / 5
» High-Probability Regret Bounds for Bandit Online Linear Opti...
LION
2010
Springer
13 years 11 months ago
Algorithm Selection as a Bandit Problem with Unbounded Losses
Abstract. Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In r...
Matteo Gagliolo, Jürgen Schmidhuber
COLT
2004
Springer
14 years 27 days ago
Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary
We give an algorithm for the bandit version of a very general online optimization problem considered by Kalai and Vempala [1], for the case of an adaptive adversary. In this proble...
H. Brendan McMahan, Avrim Blum
CIMCA
2008
IEEE
14 years 2 months ago
Tree Exploration for Bayesian RL Exploration
Research in reinforcement learning has produced algorithms for optimal decision making under uncertainty that fall within two main types. The first employs a Bayesian framework, ...
Christos Dimitrakakis
COLT
2010
Springer
13 years 5 months ago
Convex Games in Banach Spaces
We study the regret of an online learner playing a multi-round game in a Banach space B against an adversary that plays a convex function at each round. We characterize the minima...
Karthik Sridharan, Ambuj Tewari
NIPS
2007
13 years 9 months ago
Adaptive Online Gradient Descent
We study the rates of growth of the regret in online convex optimization. First, we show that a simple extension of the algorithm of Hazan et al. eliminates the need for a priori k...
Peter L. Bartlett, Elad Hazan, Alexander Rakhlin