
» Regret Bounds for Prediction Problems


COLT
2006
Springer

63 views · Machine Learning

Online Learning with Constraints

15 years 9 months ago

Download isaim2008.unl.edu

In this paper, we study a sequential decision making problem. The objective is to maximize the total reward while satisfying constraints, which are defined at every time step. The...

Shie Mannor, John N. Tsitsiklis


NIPS
2004

103 views · Information Technology

Experts in a Markov Decision Process

15 years 7 months ago

Download books.nips.cc

We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Simi...

Eyal Even-Dar, Sham M. Kakade, Yishay Mansour


UAI
2004

108 views · Artificial Intelligence

Heuristic Search Value Iteration for POMDPs

15 years 7 months ago

Download www.cs.cmu.edu

We present a novel POMDP planning algorithm called heuristic search value iteration (HSVI). HSVI is an anytime algorithm that returns a policy and a provable bound on its regret w...

Trey Smith, Reid G. Simmons


CORR
2006
Springer

83 views · Education

How to Beat the Adaptive Multi-Armed Bandit

15 years 6 months ago

Download people.cs.uchicago.edu

The multi-armed bandit is a concise model for the problem of iterated decision-making under uncertainty. In each round, a gambler must pull one of K arms of a slot machine, withou...

Varsha Dani, Thomas P. Hayes


ICML
2001
IEEE

129 views · Machine Learning

General Loss Bounds for Universal Sequence Prediction

16 years 6 months ago

Download www.hutter1.net

The Bayesian framework is ideally suited for induction problems. The probability of observing x_t at...

Marcus Hutter