Search Sciweavers | Sciweavers

CORR
2010
Springer

Online Algorithms for the Multi-Armed Bandit Problem with Markovian Rewards

13 years 7 months ago

We consider the classical multi-armed bandit problem with Markovian rewards. When played, an arm changes its state in a Markovian fashion; when not played, it remains frozen. Th...

Cem Tekin, Mingyan Liu

COLT
2006
Springer

Online Learning with Constraints

13 years 11 months ago

Download isaim2008.unl.edu

In this paper, we study a sequential decision-making problem. The objective is to maximize the total reward while satisfying constraints, which are defined at every time step. The...

Shie Mannor, John N. Tsitsiklis

AAAI
2008

Online Learning with Expert Advice and Finite-Horizon Constraints

13 years 10 months ago

Download www.aaai.org

In this paper, we study a sequential decision-making problem. The objective is to maximize the average reward accumulated over time subject to temporal cost constraints. The novel...

Branislav Kveton, Jia Yuan Yu, Georgios Theocharou...

ICML
2009
IEEE

Online feature elicitation in interactive optimization

14 years 8 months ago

Download www.cs.toronto.edu

Most models of utility elicitation in decision support and interactive optimization assume a predefined set of "catalog" features over which user preferences are express...

Craig Boutilier, Kevin Regan, Paolo Viappiani

CORR
2010
Springer

Switching between Hidden Markov Models using Fixed Share

13 years 2 months ago

Download eprints.pascal-network.org

In prediction with expert advice, the goal is to design online prediction algorithms that achieve small regret (additional loss on the whole data) compared to a reference scheme. I...

Wouter M. Koolen, Tim van Erven
