Sciweavers

44 search results - page 2 / 9
» A structured multiarmed bandit problem and the greedy policy
Sort
View
CDC
2009
IEEE
123views Control Systems» more  CDC 2009»
14 years 2 days ago
On the myopic policy for a class of restless bandit problems with applications in dynamic multichannel access
We consider a class of restless multi-armed bandit problems that arises in multi-channel opportunistic communications, where channels are modeled as independent and stochastically...
Keqin Liu, Qing Zhao
TSP
2010
13 years 2 months ago
Distributed learning in multi-armed bandit with multiple players
We formulate and study a decentralized multi-armed bandit (MAB) problem. There are distributed players competing for independent arms. Each arm, when played, offers i.i.d. reward a...
Keqin Liu, Qing Zhao
ICASSP
2010
IEEE
13 years 7 months ago
Distributed learning in cognitive radio networks: Multi-armed bandit with distributed multiple players
—We consider a cognitive radio network with distributed multiple secondary users, where each user independently searches for spectrum opportunities in multiple channels without e...
Keqin Liu, Qing Zhao
AGI
2011
12 years 11 months ago
Reinforcement Learning and the Bayesian Control Rule
We present an actor-critic scheme for reinforcement learning in complex domains. The main contribution is to show that planning and I/O dynamics can be separated such that an intra...
Pedro Alejandro Ortega, Daniel Alexander Braun, Si...
COLT
2003
Springer
14 years 17 days ago
Lower Bounds on the Sample Complexity of Exploration in the Multi-armed Bandit Problem
We consider the Multi-armed bandit problem under the PAC (“probably approximately correct”) model. It was shown by Even-Dar et al. [5] that given n arms, it suffices to play th...
Shie Mannor, John N. Tsitsiklis