Sciweavers

» Learning all optimal policies with multiple criteria
EOR
2006
Optimal and near-optimal policies for lost sales inventory models with at most one replenishment order outstanding
In this paper we use policy-iteration to explore the behaviour of optimal control policies for lost sales inventory models with the constraint that not more than one replenishment...
Roger M. Hill, Søren Glud Johansen
CORR
2010
Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret
The problem of distributed learning and channel access is considered in a cognitive network with multiple secondary users. The availability statistics of the channels are initially...
Animashree Anandkumar, Nithin Michael, Ao Kevin Ta...
ECAI
2010
On Finding Compromise Solutions in Multiobjective Markov Decision Processes
A Markov Decision Process (MDP) is a general model for solving planning problems under uncertainty. It has been extended to multiobjective MDP to address multicriteria or multiagen...
Patrice Perny, Paul Weng
CIKM
2008
Proactive learning: cost-sensitive active learning with multiple imperfect oracles
Proactive learning is a generalization of active learning designed to relax unrealistic assumptions and thereby reach practical applications. Active learning seeks to select the m...
Pinar Donmez, Jaime G. Carbonell
COLT
2010
An Asymptotically Optimal Bandit Algorithm for Bounded Support Models
The multiarmed bandit problem is a typical example of the dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
Junya Honda, Akimichi Takemura