Search Sciweavers | Sciweavers

5 search results - page 1 / 1

» Regret Bounds for Sleeping Experts and Bandits

COLT
2008
Springer

140 views, Machine Learning

Regret Bounds for Sleeping Experts and Bandits

13 years 9 months ago

Download colt2008.cs.helsinki.fi

We study on-line decision problems where the set of actions available to the decision algorithm varies over time. With a few notable exceptions, such problems remained larg...

Robert D. Kleinberg, Alexandru Niculescu-Mizil, Yo...

COLT
2005
Springer

128 views, Machine Learning

From External to Internal Regret

13 years 9 months ago

Download www.cs.cmu.edu

External regret compares the performance of an online algorithm, selecting among N actions, to the performance of the best of those actions in hindsight. Internal regret compares ...

Avrim Blum, Yishay Mansour

CORR
2011
Springer

202 views, Education

Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems

13 years 2 months ago

Download www.ualberta.ca

The analysis of online least squares estimation is at the heart of many stochastic sequential decision-making problems. We employ tools from the self-normalized processes to provi...

Yasin Abbasi-Yadkori, Dávid Pál, Csa...

JMLR
2010

103 views

Regret Bounds and Minimax Policies under Partial Monitoring

13 years 2 months ago

Download jmlr.csail.mit.edu

This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient as well as four different notions of regret: p...

Jean-Yves Audibert, Sébastien Bubeck

LION
2010
Springer

190 views, Optimization

Algorithm Selection as a Bandit Problem with Unbounded Losses

13 years 11 months ago

Download como.vub.ac.be

Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In r...

Matteo Gagliolo, Jürgen Schmidhuber
