CDC 2008, IEEE
A structured multiarmed bandit problem and the greedy policy
We consider a multiarmed bandit problem where the expected reward of each arm is a linear function of an unknown scalar with a prior distribution. The objective is to choose a s...
Adam J. Mersereau, Paat Rusmevichientong, John N. ...