Abstract. Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In r...
We give an algorithm for the bandit version of a very general online optimization problem considered by Kalai and Vempala [1], for the case of an adaptive adversary. In this proble...
Research in reinforcement learning has produced algorithms for optimal decision making under uncertainty that fall within two main types. The first employs a Bayesian framework, ...
We study the regret of an online learner playing a multi-round game in a Banach space B against an adversary that plays a convex function at each round. We characterize the minima...
We study the rates of growth of the regret in online convex optimization. First, we show that a simple extension of the algorithm of Hazan et al eliminates the need for a priori k...