Sciweavers

AMAI
2011
Springer
Multi-armed bandits with episode context
A multi-armed bandit episode consists of n trials, each allowing selection of one of K arms, resulting in payoff from a distribution over [0, 1] associated with that arm. We assum...
Christopher D. Rosin
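
The episode structure described in the abstract (n trials, each selecting one of K arms, with a payoff in [0, 1]) can be illustrated with a minimal simulation sketch. This is not the paper's algorithm: the Bernoulli payoff distributions, the round-robin selection policy, and the names run_episode and round_robin are all assumptions made for illustration only.

```python
import random


def run_episode(arm_means, n_trials, policy, seed=0):
    """Simulate one bandit episode and return a list of (arm, payoff) pairs.

    arm_means: hypothetical success probabilities, one per arm (K = len(arm_means)).
    policy:    function (t, k, history) -> index of the arm to pull on trial t.
    """
    rng = random.Random(seed)
    k = len(arm_means)
    history = []
    for t in range(n_trials):
        arm = policy(t, k, history)
        # Assumption: Bernoulli payoffs, one simple distribution over [0, 1].
        payoff = 1.0 if rng.random() < arm_means[arm] else 0.0
        history.append((arm, payoff))
    return history


def round_robin(t, k, history):
    """Placeholder policy: cycle through the arms, ignoring past payoffs."""
    return t % k


history = run_episode([0.2, 0.5, 0.8], n_trials=12, policy=round_robin)
total_payoff = sum(payoff for _, payoff in history)
print(f"pulled {len(history)} arms, total payoff {total_payoff}")
```

A real bandit policy would use the history argument to favor arms that have paid off well so far; round-robin is used here only to keep the sketch short.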