Sciweavers

66 search results - page 4 / 14
» The Nonstochastic Multiarmed Bandit Problem
CORR
2006
Springer
13 years 7 months ago
Nearly optimal exploration-exploitation decision thresholds
While in general trading off exploration and exploitation in reinforcement learning is hard, under some formulations relatively simple solutions exist. Optimal decision thresholds ...
Christos Dimitrakakis
CDC
2008
IEEE
14 years 2 months ago
A structured multiarmed bandit problem and the greedy policy
We consider a multiarmed bandit problem where the expected reward of each arm is a linear function of an unknown scalar with a prior distribution. The objective is to choose a s...
Adam J. Mersereau, Paat Rusmevichientong, John N. ...
CORR
2010
Springer
13 years 7 months ago
Online Algorithms for the Multi-Armed Bandit Problem with Markovian Rewards
We consider the classical multi-armed bandit problem with Markovian rewards. When played, an arm changes its state in a Markovian fashion, while it remains frozen when not played. Th...
Cem Tekin, Mingyan Liu
ALT
2009
Springer
14 years 4 months ago
Pure Exploration in Multi-armed Bandits Problems
Abstract. We consider the framework of stochastic multi-armed bandit problems and study the possibilities and limitations of strategies that explore sequentially the arms. The stra...
Sébastien Bubeck, Rémi Munos, Gilles...
CORR
2010
Springer
13 years 2 months ago
On the Combinatorial Multi-Armed Bandit Problem with Markovian Rewards
We consider a combinatorial generalization of the classical multi-armed bandit problem that is defined as follows. There is a given bipartite graph of M users and N ≥ M resources. F...
Yi Gai, Bhaskar Krishnamachari, Mingyan Liu