Search Sciweavers | Sciweavers

473 search results - page 85 / 95

» Optimal policy switching algorithms for reinforcement learni...

CPAIOR
2007
Springer

Solving a Stochastic Queueing Control Problem with Constraint Programming

In a facility with front room and back room operations, it is useful to switch workers between the rooms in order to cope with changing customer demand. Assuming stochastic custome...

Daria Terekhov, J. Christopher Beck

ICML
2005
IEEE

Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees

MDPs are an attractive formalization for planning, but realistic problems often have intractably large state spaces. When we only need a partial policy to get from a fixed start s...

H. Brendan McMahan, Maxim Likhachev, Geoffrey J. G...

ICML
1996
IEEE

Learning Evaluation Functions for Large Acyclic Domains

Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...

Justin A. Boyan, Andrew W. Moore

ATAL
2006
Springer

Learning to cooperate in multi-agent social dilemmas

In many Multi-Agent Systems (MAS), agents (even if self-interested) need to cooperate in order to maximize their own utilities. Most of the multi-agent learning algorithms focus on...

Jose Enrique Munoz de Cote, Alessandro Lazaric, Ma...

ECML
2005
Springer

Active Learning in Partially Observable Markov Decision Processes

This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. W...

Robin Jaulmes, Joelle Pineau, Doina Precup
