Sciweavers

COLT
2003
Springer

Lower Bounds on the Sample Complexity of Exploration in the Multi-armed Bandit Problem

We consider the multi-armed bandit problem under the PAC (“probably approximately correct”) model. It was shown by Even-Dar et al. [5] that given n arms, it suffices to play the arms a total of O((n/ε²) log(1/δ)) times to find an ε-optimal arm with probability at least 1−δ. Our contribution is a matching lower bound that holds for any sampling policy. We also generalize the lower bound to a Bayesian setting, and to the case where the statistics of the arms are known but the identities of the arms are not.
Shie Mannor, John N. Tsitsiklis
Added 06 Jul 2010
Updated 06 Jul 2010
Type Conference
Year 2003
Where COLT
Authors Shie Mannor, John N. Tsitsiklis
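
Illustration (not part of the original listing): the upper bound cited from [5] in the abstract is usually associated with a median elimination procedure. The sketch below follows that idea as commonly described: sample every surviving arm, discard the worse half, tighten the accuracy and confidence parameters, and repeat. The per-round constants, the `pull` callback, the tie handling (keeping the top half by empirical mean) and the `bernoulli_arms` helper are assumptions made for this example and are not taken from the abstract.

```python
import math
import random


def median_elimination(pull, n_arms, eps, delta):
    """Return (arm index, total pulls) for an arm intended to be eps-optimal.

    `pull(i)` is a hypothetical callback returning a reward in [0, 1] for arm i.
    """
    arms = list(range(n_arms))
    eps_l, delta_l = eps / 4.0, delta / 2.0  # first-round accuracy and confidence
    total_pulls = 0
    while len(arms) > 1:
        # Per-arm sample count for this round (assumed constants).
        m = math.ceil(4.0 / eps_l**2 * math.log(3.0 / delta_l))
        means = {}
        for a in arms:
            means[a] = sum(pull(a) for _ in range(m)) / m
            total_pulls += m
        # Keep the better half by empirical mean, so the arm set always shrinks.
        arms.sort(key=lambda a: means[a], reverse=True)
        arms = arms[: math.ceil(len(arms) / 2)]
        eps_l *= 0.75
        delta_l /= 2.0
    return arms[0], total_pulls


def bernoulli_arms(probs, seed=0):
    """Build a `pull` callback for Bernoulli arms with the given success rates."""
    rng = random.Random(seed)
    return lambda i: 1.0 if rng.random() < probs[i] else 0.0


if __name__ == "__main__":
    probs = [0.9, 0.5, 0.4, 0.3]
    arm, pulls = median_elimination(bernoulli_arms(probs), len(probs), eps=0.2, delta=0.05)
    print(arm, pulls)
```

Because the number of surviving arms halves each round while the per-arm sample count grows only geometrically, the total number of pulls stays on the order of (n/ε²) log(1/δ), which is the quantity the paper's lower bound matches.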