Search Sciweavers | Sciweavers

232 search results - page 7 / 47

» Learning all optimal policies with multiple criteria

EOR
2006

81 views

Optimal and near-optimal policies for lost sales inventory models with at most one replenishment order outstanding

13 years 8 months ago

Download home.imf.au.dk

In this paper we use policy-iteration to explore the behaviour of optimal control policies for lost sales inventory models with the constraint that not more than one replenishment...

Roger M. Hill, Søren Glud Johansen

CORR
2010
Springer

49 views, Education

Distributed Algorithms for Learning and Cognitive Medium Access with Logarithmic Regret

13 years 8 months ago

Download www.dtic.mil

The problem of distributed learning and channel access is considered in a cognitive network with multiple secondary users. The availability statistics of the channels are initially...

Animashree Anandkumar, Nithin Michael, Ao Kevin Ta...

ECAI
2010
Springer

227 views, Artificial Intelligence

On Finding Compromise Solutions in Multiobjective Markov Decision Processes

13 years 9 months ago

Download www-desir.lip6.fr

A Markov Decision Process (MDP) is a general model for solving planning problems under uncertainty. It has been extended to multiobjective MDP to address multicriteria or multiagen...

Patrice Perny, Paul Weng

CIKM
2008
Springer

97 views, Information Technology

Proactive learning: cost-sensitive active learning with multiple imperfect oracles

13 years 10 months ago

Download www.cs.cmu.edu

Proactive learning is a generalization of active learning designed to relax unrealistic assumptions and thereby reach practical applications. Active learning seeks to select the m...

Pinar Donmez, Jaime G. Carbonell

COLT
2010
Springer

207 views, Machine Learning

An Asymptotically Optimal Bandit Algorithm for Bounded Support Models

13 years 6 months ago

Download www.colt2010.org

The multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...

Junya Honda, Akimichi Takemura
