Sciweavers

97 search results - page 9 / 20
» Logarithmic Regret Algorithms for Online Convex Optimization
ECCC
2007
13 years 7 months ago
Adaptive Algorithms for Online Decision Problems
We study the notion of learning in an oblivious changing environment. Existing online learning algorithms which minimize regret are shown to converge to the average of all locally...
Elad Hazan, C. Seshadhri
COLT
2008
Springer
13 years 9 months ago
Regret Bounds for Sleeping Experts and Bandits
We study on-line decision problems where the set of actions that are available to the decision algorithm varies over time. With a few notable exceptions, such problems remained larg...
Robert D. Kleinberg, Alexandru Niculescu-Mizil, Yo...
NIPS
2007
13 years 9 months ago
The Price of Bandit Information for Online Optimization
In the online linear optimization problem, a learner must choose, in each round, a decision from a set D ⊂ R^n in order to minimize an (unknown and changing) linear cost function...
Varsha Dani, Thomas P. Hayes, Sham Kakade
CORR
2011
Springer
13 years 2 months ago
Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems
The analysis of online least squares estimation is at the heart of many stochastic sequential decision-making problems. We employ tools from the self-normalized processes to provi...
Yasin Abbasi-Yadkori, Dávid Pál, Csa...
JMLR
2010
13 years 2 months ago
Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization
We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning...
Lin Xiao