Sciweavers

74 search results - page 7 / 15
» Regret Bounds for Gaussian Process Bandit Problems
ICASSP
2010
IEEE
13 years 7 months ago
Distributed learning in cognitive radio networks: Multi-armed bandit with distributed multiple players
We consider a cognitive radio network with distributed multiple secondary users, where each user independently searches for spectrum opportunities in multiple channels without e...
Keqin Liu, Qing Zhao
COLT
2004
Springer
14 years 28 days ago
Online Geometric Optimization in the Bandit Setting Against an Adaptive Adversary
We give an algorithm for the bandit version of a very general online optimization problem considered by Kalai and Vempala [1], for the case of an adaptive adversary. In this proble...
H. Brendan McMahan, Avrim Blum
COLT
2010
Springer
13 years 5 months ago
Open Loop Optimistic Planning
We consider the problem of planning in a stochastic and discounted environment with a limited numerical budget. More precisely, we investigate strategies exploring the set of poss...
Sébastien Bubeck, Rémi Munos
COLT
2004
Springer
13 years 11 months ago
Regret Bounds for Hierarchical Classification with Linear-Threshold Functions
We study the problem of classifying data in a given taxonomy when classifications associated with multiple and/or partial paths are allowed. We introduce an incremental algorithm u...
Nicolò Cesa-Bianchi, Alex Conconi, Claudio ...
SIAMCOMP
2002
13 years 7 months ago
The Nonstochastic Multiarmed Bandit Problem
In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This class...
Peter Auer, Nicolò Cesa-Bianchi, Yoav Freun...