Search Sciweavers | Sciweavers

97 search results - page 11 / 20

» Logarithmic Regret Algorithms for Online Convex Optimization

click to vote

ICML
2003
IEEE

201views Machine Learning» more ICML 2003»

Online Convex Programming and Generalized Infinitesimal Gradient Ascent

14 years 8 months ago

Download www.cs.ualberta.ca

Convex programming involves a convex set F Rn and a convex cost function c : F R. The goal of convex programming is to find a point in F which minimizes c. In online convex prog...

Martin Zinkevich

claim paper

Read More »

click to vote

COLT
2010
Springer

149views Machine Learning» more COLT 2010»

Open Loop Optimistic Planning

13 years 5 months ago

Download www.colt2010.org

We consider the problem of planning in a stochastic and discounted environment with a limited numerical budget. More precisely, we investigate strategies exploring the set of poss...

Sébastien Bubeck, Rémi Munos

claim paper

Read More »

click to vote

COLT
2010
Springer

191views Machine Learning» more COLT 2010»

Best Arm Identification in Multi-Armed Bandits

13 years 5 months ago

Download www.di.ens.fr

We consider the problem of finding the best arm in a stochastic multi-armed bandit game. The regret of a forecaster is here defined by the gap between the mean reward of the optim...

Jean-Yves Audibert, Sébastien Bubeck, R&eac...

claim paper

Read More »

click to vote

LION
2010
Springer

190views Optimization» more LION 2010»

Algorithm Selection as a Bandit Problem with Unbounded Losses

13 years 11 months ago

Download como.vub.ac.be

Abstract. Algorithm selection is typically based on models of algorithm performance learned during a separate ofﬂine training sequence, which can be prohibitively expensive. In r...

Matteo Gagliolo, Jürgen Schmidhuber

claim paper

Read More »

click to vote

COLT
2004
Springer

78views Machine Learning» more COLT 2004»

Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary

14 years 1 months ago

Download www.cs.cmu.edu

We give an algorithm for the bandit version of a very general online optimization problem considered by Kalai and Vempala [1], for the case of an adaptive adversary. In this proble...

H. Brendan McMahan, Avrim Blum

claim paper

Read More »

« Prev « First page 11 / 20 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers