bandits | Sciweavers

JMLR
2012

PAC-Bayes-Bernstein Inequality for Martingales and its Application to Multiarmed Bandits

12 years 4 months ago

We develop a new tool for data-dependent analysis of the exploration-exploitation trade-off in learning under limited feedback. Our tool is based on two main ingredients. The fi...

Yevgeny Seldin, Nicolò Cesa-Bianchi, Peter ...

AMAI
2011
Springer

Multi-armed bandits with episode context

13 years 1 month ago

Download gauss.ececs.uc.edu

A multi-armed bandit episode consists of n trials, each allowing selection of one of K arms, resulting in payoff from a distribution over [0, 1] associated with that arm. We assum...

Christopher D. Rosin

CORR
2011
Springer

Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems

13 years 8 months ago

Download www.ualberta.ca

The analysis of online least squares estimation is at the heart of many stochastic sequential decision-making problems. We employ tools from the self-normalized processes to provi...

Yasin Abbasi-Yadkori, Dávid Pál, Csa...
