Search Sciweavers | Sciweavers

22 search results - page 4 / 5

» High-Probability Regret Bounds for Bandit Online Linear Opti...

LION
2010
Springer

190 views, Optimization

Algorithm Selection as a Bandit Problem with Unbounded Losses

13 years 11 months ago

Download como.vub.ac.be

Abstract. Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In r...

Matteo Gagliolo, Jürgen Schmidhuber

COLT
2004
Springer

78 views, Machine Learning

Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary

14 years 27 days ago

Download www.cs.cmu.edu

We give an algorithm for the bandit version of a very general online optimization problem considered by Kalai and Vempala [1], for the case of an adaptive adversary. In this proble...

H. Brendan McMahan, Avrim Blum

CIMCA
2008
IEEE

125 views, Intelligent Agents

Tree Exploration for Bayesian RL Exploration

14 years 2 months ago

Download arxiv.org

Research in reinforcement learning has produced algorithms for optimal decision making under uncertainty that fall within two main types. The first employs a Bayesian framework, ...

Christos Dimitrakakis

posted by olethros

COLT
2010
Springer

205 views, Machine Learning

Convex Games in Banach Spaces

13 years 5 months ago

Download www.cs.utexas.edu

We study the regret of an online learner playing a multi-round game in a Banach space B against an adversary that plays a convex function at each round. We characterize the minima...

Karthik Sridharan, Ambuj Tewari

NIPS
2007

127 views, Information Technology

Adaptive Online Gradient Descent

13 years 9 months ago

Download books.nips.cc

We study the rates of growth of the regret in online convex optimization. First, we show that a simple extension of the algorithm of Hazan et al. eliminates the need for a priori k...

Peter L. Bartlett, Elad Hazan, Alexander Rakhlin
