Search Sciweavers | Sciweavers

44 search results - page 4 / 9

» A structured multiarmed bandit problem and the greedy policy

click to vote

CORR
2010
Springer

187views Education» more CORR 2010»

Learning in A Changing World: Non-Bayesian Restless Multi-Armed Bandit

13 years 7 months ago

Download www.ece.ucdavis.edu

We consider the restless multi-armed bandit (RMAB) problem with unknown dynamics. In this problem, at each time, a player chooses K out of N (N > K) arms to play. The state of ...

Haoyang Liu, Keqin Liu, Qing Zhao

claim paper

Read More »

click to vote

CORR
2010
Springer

152views Education» more CORR 2010»

Combinatorial Network Optimization with Unknown Variables: Multi-Armed Bandits with Linear Rewards

13 years 2 months ago

Download ceng.usc.edu

In the classic multi-armed bandits problem, the goal is to have a policy for dynamically operating arms that each yield stochastic rewards with unknown means. The key metric of int...

Yi Gai, Bhaskar Krishnamachari, Rahul Jain

claim paper

Read More »

click to vote

ICASSP
2011
IEEE

177views Signal Processing» more ICASSP 2011»

Logarithmic weak regret of non-Bayesian restless multi-armed bandit

12 years 11 months ago

Download www.ece.ucdavis.edu

Abstract—We consider the restless multi-armed bandit (RMAB) problem with unknown dynamics. At each time, a player chooses K out of N (N > K) arms to play. The state of each ar...

Haoyang Liu, Keqin Liu, Qing Zhao

claim paper

Read More »

click to vote

COLT
2010
Springer

191views Machine Learning» more COLT 2010»

Best Arm Identification in Multi-Armed Bandits

13 years 5 months ago

Download www.di.ens.fr

We consider the problem of finding the best arm in a stochastic multi-armed bandit game. The regret of a forecaster is here defined by the gap between the mean reward of the optim...

Jean-Yves Audibert, Sébastien Bubeck, R&eac...

claim paper

Read More »

click to vote

JMLR
2012

165views Programming Languages» more JMLR 2012»

PAC-Bayes-Bernstein Inequality for Martingales and its Application to Multiarmed Bandits

11 years 9 months ago

Download homes.di.unimi.it

We develop a new tool for data-dependent analysis of the exploration-exploitation trade-oﬀ in learning under limited feedback. Our tool is based on two main ingredients. The ﬁ...

Yevgeny Seldin, Nicolò Cesa-Bianchi, Peter ...

claim paper

Read More »

« Prev « First page 4 / 9 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers