Sciweavers

13 search results - page 1 / 3
JMLR
2012
12 years 24 days ago
Beyond Logarithmic Bounds in Online Learning
We prove logarithmic regret bounds that depend on the loss L*_T of the competitor rather than on the number T of time steps. In the general online convex optimization setting, o...
Francesco Orabona, Nicolò Cesa-Bianchi, Cla...
ALT
2008
Springer
14 years 7 months ago
Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
We consider an upper confidence bound algorithm for Markov decision processes (MDPs) with deterministic transitions. For this algorithm we derive upper bounds on the onl...
Ronald Ortner
COLT
2006
Springer
14 years 2 months ago
Logarithmic Regret Algorithms for Online Convex Optimization
In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After ea...
Elad Hazan, Adam Kalai, Satyen Kale, Amit Agarwal
CORR
2011
Springer
13 years 5 months ago
Online Learning of Rested and Restless Bandits
In this paper we study the online learning problem involving rested and restless multiarmed bandits with multiple plays. The system consists of a single player/user and a set of K...
Cem Tekin, Mingyan Liu
ICML
2007
IEEE
14 years 11 months ago
Online kernel PCA with entropic matrix updates
A number of updates for density matrices have been developed recently that are motivated by relative entropy minimization problems. The updates involve a softmin calculation based...
Dima Kuzmin, Manfred K. Warmuth