We study the notion of learning in an oblivious changing environment. Existing online learning algorithms which minimize regret are shown to converge to the average of all locally optimal solutions. We propose a new performance metric, strengthening the standard metric of regret, to capture convergence to locally optimal solutions, and propose efficient algorithms which provably converge at the optimal rate. One application is the portfolio management problem, for which we show that all previous algorithms behave suboptimally under dynamic market conditions. Another application is online routing, for which our adaptive algorithm exploits local congestion patterns and runs in near-linear time. We also give an algorithm for the tree update problem that is statically optimal for every sufficiently long contiguous subsequence of accesses. Our algorithm combines techniques from data streaming algorithms, composition of learning algorithms, and a twist on the standard experts framework.
Elad Hazan, C. Seshadhri