Search Sciweavers | Sciweavers

263 search results - page 9 / 53

» Regret Bounds for Prediction Problems

ICASSP
2011
IEEE

177 views, Signal Processing

Logarithmic weak regret of non-Bayesian restless multi-armed bandit

14 years 11 months ago

Download www.ece.ucdavis.edu

Abstract—We consider the restless multi-armed bandit (RMAB) problem with unknown dynamics. At each time, a player chooses K out of N (N > K) arms to play. The state of each ar...

Haoyang Liu, Keqin Liu, Qing Zhao

COLT
2007
Springer

98 views, Machine Learning

Learning Permutations with Exponential Weights

16 years 1 months ago

Download users.soe.ucsc.edu

We give an algorithm for the on-line learning of permutations. The algorithm maintains its uncertainty about the target permutation as a doubly stochastic weight matrix, and makes...

David P. Helmbold, Manfred K. Warmuth

CORR
2011
Springer

210 views, Education

Online Learning of Rested and Restless Bandits

15 years 2 months ago

Download www.eecs.umich.edu

In this paper we study the online learning problem involving rested and restless multiarmed bandits with multiple plays. The system consists of a single player/user and a set of K...

Cem Tekin, Mingyan Liu

LION
2010
Springer

190 views, Optimization

Algorithm Selection as a Bandit Problem with Unbounded Losses

15 years 11 months ago

Download como.vub.ac.be

Abstract. Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In r...

Matteo Gagliolo, Jürgen Schmidhuber

ICML
2009
IEEE

189 views, Machine Learning

A simpler unified analysis of budget perceptrons

16 years 8 months ago

Download www.cs.utoronto.ca

The kernel Perceptron is an appealing online learning algorithm that has a drawback: whenever it makes an error it must increase its support set, which slows training and testing ...

Ilya Sutskever
