We consider a bandit problem which involves sequential sampling from two populations (arms). Each arm produces a noisy reward realization which depends on an observable random cov...
We study the regret of an online learner playing a multi-round game in a Banach space B against an adversary that plays a convex function at each round. We characterize the minima...
In the present paper, we introduce a variant of Gold-style learners that is not required to infer precise descriptions of the languages in a class, but that must find descriptive ...
We analyze the regret, measured in terms of log loss, of the maximum likelihood (ML) sequential prediction strategy. This "follow the leader" strategy also defines one o...
The multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
We present a new family of subgradient methods that dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradie...
One of the most widely used techniques for data clustering is agglomerative clustering. Such algorithms have long been used across many different fields ranging from computational...
A significant Fourier transform (SFT) algorithm, given a threshold τ and oracle access to a function f, outputs (the frequencies and approximate values of) all the τ-significant Fou...
We present a new method for regularized convex optimization and analyze it under both online and stochastic optimization settings. In addition to unifying previously known firstor...
John Duchi, Shai Shalev-Shwartz, Yoram Singer, Amb...