Sciweavers

473 search results - page 85 / 95
» Optimal policy switching algorithms for reinforcement learni...
CPAIOR
2007
Springer
14 years 1 month ago
Solving a Stochastic Queueing Control Problem with Constraint Programming
In a facility with front room and back room operations, it is useful to switch workers between the rooms in order to cope with changing customer demand. Assuming stochastic custome...
Daria Terekhov, J. Christopher Beck
ICML
2005
IEEE
14 years 8 months ago
Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees
MDPs are an attractive formalization for planning, but realistic problems often have intractably large state spaces. When we only need a partial policy to get from a fixed start s...
H. Brendan McMahan, Maxim Likhachev, Geoffrey J. G...
ICML
1996
IEEE
14 years 8 months ago
Learning Evaluation Functions for Large Acyclic Domains
Some of the most successful recent applications of reinforcement learning have used neural networks and the TD algorithm to learn evaluation functions. In this paper, we examine t...
Justin A. Boyan, Andrew W. Moore
ATAL
2006
Springer
13 years 11 months ago
Learning to cooperate in multi-agent social dilemmas
In many Multi-Agent Systems (MAS), agents (even if self-interested) need to cooperate in order to maximize their own utilities. Most of the multi-agent learning algorithms focus on...
Jose Enrique Munoz de Cote, Alessandro Lazaric, Ma...
ECML
2005
Springer
14 years 1 month ago
Active Learning in Partially Observable Markov Decision Processes
This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. W...
Robin Jaulmes, Joelle Pineau, Doina Precup