multi-armed bandit problem


CORR
2006
Springer


Nearly optimal exploration-exploitation decision thresholds

15 years 6 months ago

While in general trading off exploration and exploitation in reinforcement learning is hard, under some formulations relatively simple solutions exist. Optimal decision thresholds ...

Christos Dimitrakakis

posted by olethros


NIPS
2004


Nearly Tight Bounds for the Continuum-Armed Bandit Problem

15 years 8 months ago

Download books.nips.cc

In the multi-armed bandit problem, an online algorithm must choose from a set of strategies in a sequence of n trials so as to minimize the total cost of the chosen strategies. Wh...

Robert D. Kleinberg


SDM
2007
SIAM


Bandits for Taxonomies: A Model-based Approach

15 years 8 months ago

Download www.cs.cmu.edu

We consider a novel problem of learning an optimal matching, in an online fashion, between two feature spaces that are organized as taxonomies. We formulate this as a multi-armed ...

Sandeep Pandey, Deepak Agarwal, Deepayan Chakrabar...


COLT
2008
Springer


Competing in the Dark: An Efficient Algorithm for Bandit Linear Optimization

15 years 8 months ago

Download www-stat.wharton.upenn.edu

We introduce an efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal O*(√T) regret. The setting is a natural general...

Jacob Abernethy, Elad Hazan, Alexander Rakhlin


COLT
2003
Springer


Lower Bounds on the Sample Complexity of Exploration in the Multi-armed Bandit Problem

15 years 12 months ago

Download www.ece.mcgill.ca

We consider the Multi-armed bandit problem under the PAC (“probably approximately correct”) model. It was shown by Even-Dar et al. [5] that given n arms, it suffices to play th...

Shie Mannor, John N. Tsitsiklis


ECML
2005
Springer


Multi-armed Bandit Algorithms and Empirical Evaluation

16 years 4 days ago

Download www.cs.nyu.edu

The multi-armed bandit problem for a gambler is to decide which arm of a K-slot machine to pull to maximize his total reward in a series of trials. Many real-world learning and opt...

Joannès Vermorel, Mehryar Mohri
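
For readers new to the problem stated in the abstract above (choosing which arm of a K-slot machine to pull to maximize total reward), the following is a minimal sketch of one simple strategy, epsilon-greedy, on a simulated machine. It is an illustration only and is not taken from the listed paper: the function name `epsilon_greedy`, the Bernoulli (0/1) reward model, and the parameters `arm_means`, `epsilon`, and `seed` are all assumptions made for this example.

```python
import random


def epsilon_greedy(arm_means, n_trials, epsilon=0.1, seed=0):
    """Play a simulated K-armed bandit with epsilon-greedy.

    arm_means: hypothetical success probability of each arm (assumed
               Bernoulli rewards; unknown to the player in a real setting).
    Returns (total_reward, pull_counts).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k          # number of pulls per arm
    estimates = [0.0] * k     # running mean reward per arm
    total = 0

    for _ in range(n_trials):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(k)
        else:
            # Exploit: pick the arm with the highest estimated mean
            # (ties go to the lowest index).
            arm = max(range(k), key=lambda i: estimates[i])

        # Simulated 0/1 reward drawn from the chosen arm.
        reward = 1 if rng.random() < arm_means[arm] else 0

        # Incremental update of the running mean for this arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward

    return total, counts
```

Calling `epsilon_greedy([0.2, 0.5, 0.8], n_trials=1000)` returns the total reward collected and how often each arm was pulled. A fixed `epsilon` keeps exploring forever, which is simple but not optimal; it is shown here only to make the exploration-exploitation trade-off concrete.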

