total reward | Sciweavers

CORR
2010
Springer

On the Combinatorial Multi-Armed Bandit Problem with Markovian Rewards

13 years 9 months ago

We consider a combinatorial generalization of the classical multi-armed bandit problem that is defined as follows. There is a given bipartite graph of M users and N ≥ M resources. F...

Yi Gai, Bhaskar Krishnamachari, Mingyan Liu

CORR
2010
Springer

Combinatorial Network Optimization with Unknown Variables: Multi-Armed Bandits with Linear Rewards

13 years 10 months ago

Download ceng.usc.edu

In the classic multi-armed bandits problem, the goal is to have a policy for dynamically operating arms that each yield stochastic rewards with unknown means. The key metric of int...

Yi Gai, Bhaskar Krishnamachari, Rahul Jain

AIPS
2006

Probabilistic Planning with Nonlinear Utility Functions

14 years 4 months ago

Download www.aaai.org

Researchers often express probabilistic planning problems as Markov decision process models and then maximize the expected total reward. However, it is often rational to maximize ...

Yaxin Liu, Sven Koenig

COLT
2006
Springer

Online Learning with Constraints

14 years 6 months ago

Download isaim2008.unl.edu

In this paper, we study a sequential decision making problem. The objective is to maximize the total reward while satisfying constraints, which are defined at every time step. The...

Shie Mannor, John N. Tsitsiklis
