Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary

15 years 7 months ago


We give an algorithm for the bandit version of a very general online optimization problem considered by Kalai and Vempala [1], for the case of an adaptive adversary. In this problem we are given a bounded set $S \subseteq \mathbb{R}^n$ of feasible points. At each time step $t$, the online algorithm must select a point $x_t \in S$ while simultaneously an adversary selects a cost vector $c_t \in \mathbb{R}^n$. The algorithm then incurs cost $c_t \cdot x_t$. Kalai and Vempala show that even if $S$ is exponentially large (or infinite), so long as we have an efficient algorithm for the offline problem (given $c \in \mathbb{R}^n$, find $x \in S$ to minimize $c \cdot x$) and so long as the cost vectors are bounded, one can efficiently solve the online problem of performing nearly as well as the best fixed $x \in S$ in hindsight. The Kalai-Vempala algorithm assumes that the cost vectors $c_t$ are given to the algorithm after each time step. In the “bandit” version of the problem, the algorithm only observes its cost, $c_t \cdot x_t$. Awerbuch and Kleinberg ...
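To make the setting concrete, here is a minimal Python sketch of the full-information problem described above, not the bandit algorithm of the paper. It assumes a small finite S so that the offline problem can be solved by brute force, and it uses a follow-the-perturbed-leader style rule in the spirit of Kalai and Vempala. The function names, the uniform perturbation on [0, 1/epsilon]^n, and the parameter `epsilon` are illustrative assumptions, not details taken from the abstract.

```python
import random


def dot(c, x):
    """Linear cost c . x for two equal-length sequences."""
    return sum(ci * xi for ci, xi in zip(c, x))


def offline_oracle(S, c):
    """Offline problem: return the x in S minimizing c . x.

    Brute force over a finite S (an assumption of this sketch); the
    abstract only requires that some efficient offline algorithm exists.
    """
    return min(S, key=lambda x: dot(c, x))


def perturbed_leader_cost(S, costs, epsilon, rng):
    """Total cost of a follow-the-perturbed-leader style rule.

    Full-information setting: each cost vector c_t is revealed after the
    point x_t has been chosen. The uniform perturbation is an assumption.
    """
    n = len(costs[0])
    cumulative = [0.0] * n  # sum of the cost vectors seen so far
    total = 0.0
    for c in costs:
        # Perturb the cumulative cost, then ask the offline oracle.
        perturbed = [cumulative[i] + rng.uniform(0.0, 1.0 / epsilon)
                     for i in range(n)]
        x = offline_oracle(S, perturbed)
        total += dot(c, x)  # incur cost c_t . x_t
        cumulative = [a + b for a, b in zip(cumulative, c)]
    return total


def best_fixed_cost(S, costs):
    """Cost of the best single x in S in hindsight."""
    n = len(costs[0])
    summed = [sum(c[i] for c in costs) for i in range(n)]
    return dot(summed, offline_oracle(S, summed))


if __name__ == "__main__":
    S = [(1, 0), (0, 1)]  # two feasible points (hypothetical example)
    costs = [(1, 0), (1, 0), (0, 1)]
    rng = random.Random(0)
    online = perturbed_leader_cost(S, costs, epsilon=0.5, rng=rng)
    print("regret:", online - best_fixed_cost(S, costs))
```

In the bandit version, the loop would see only the scalar `dot(c, x)` and could not update `cumulative` with the full vector `c`; closing that gap is the subject of the paper.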

H. Brendan McMahan, Avrim Blum
