Sciweavers

2 search results - page 1 / 1
ICML
2009
IEEE
14 years 8 months ago
Piecewise-stationary bandit problems with side observations
We consider a sequential decision problem where the rewards are generated by a piecewise-stationary distribution. However, the different reward distributions are unknown and may c...
Jia Yuan Yu, Shie Mannor
ICML
2008
IEEE
14 years 8 months ago
Exploration scavenging
We examine the problem of evaluating a policy in the contextual bandit setting using only observations collected during the execution of another policy. We show that policy evalua...
John Langford, Alexander L. Strehl, Jennifer Wortm...