Sciweavers

102 search results - page 8 / 21
» MDPs with Non-Deterministic Policies
ATAL
2006
Springer
Resource allocation among agents with preferences induced by factored MDPs
Distributing scarce resources among agents in a way that maximizes the social welfare of the group is a computationally hard problem when the value of a resource bundle is not lin...
Dmitri A. Dolgov, Edmund H. Durfee
AIPS
2009
Efficient Solutions to Factored MDPs with Imprecise Transition Probabilities
When modeling real-world decision-theoretic planning problems in the Markov decision process (MDP) framework, it is often impossible to obtain a completely accurate estimate of tr...
Karina Valdivia Delgado, Scott Sanner, Leliane Nun...
IJCNN
2008
IEEE
Uncertainty propagation for quality assurance in Reinforcement Learning
In this paper we address the reliability of policies derived by Reinforcement Learning from a limited amount of observations. This can be done in a principled manner by taking in...
Daniel Schneegaß, Steffen Udluft, Thomas Mar...
ISAAC
2010
Springer
Lower Bounds for Howard's Algorithm for Finding Minimum Mean-Cost Cycles
Howard's policy iteration algorithm is one of the most widely used algorithms for finding optimal policies for controlling Markov Decision Processes (MDPs). When applied to we...
Thomas Dueholm Hansen, Uri Zwick
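Howard's policy iteration, the algorithm this entry analyzes, alternates exact policy evaluation with greedy policy improvement until the policy stops changing. A minimal sketch on a toy MDP (the states, transition probabilities, and rewards below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9):
    """Howard's policy iteration for a finite MDP.
    P[a][s, s'] : transition matrix for action a; R[s, a] : reward table."""
    n_states, n_actions = R.shape
    policy = np.zeros(n_states, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
        R_pi = R[np.arange(n_states), policy]
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily w.r.t. the one-step lookahead.
        Q = np.stack([R[:, a] + gamma * P[a] @ V for a in range(n_actions)], axis=1)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy

# Toy 2-state, 2-action MDP (illustrative numbers).
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),   # action 0
     np.array([[0.1, 0.9], [0.7, 0.3]])]   # action 1
R = np.array([[1.0, 0.0],    # rewards in state 0 under actions 0, 1
              [0.0, 2.0]])   # rewards in state 1
policy, V = policy_iteration(P, R)
```

The lower bounds in the paper concern how many such improvement rounds the algorithm may need in the worst case; on this toy instance it converges after a single improvement step.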
RAS
2010
Probabilistic Policy Reuse for inter-task transfer learning
Policy Reuse is a reinforcement learning technique that efficiently learns a new policy by reusing similar, previously learned policies. The Policy Reuse learner improves its exploration b...
Fernando Fernández, Javier García, M...
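The core of Policy Reuse is an exploration strategy that, at each step, either exploits a past policy or falls back to ordinary exploration on the new task. A minimal sketch of that action-selection idea (the function name, Q-table layout, and decay schedule here are illustrative assumptions, not the paper's exact formulation):

```python
import random

def pi_reuse_action(state, q_values, past_policy, psi, epsilon, actions):
    """Pick an action for `state`: with probability psi follow the past
    policy, otherwise do epsilon-greedy selection on the new task's Q-values."""
    if random.random() < psi:
        return past_policy[state]          # reuse the previously learned policy
    if random.random() < epsilon:
        return random.choice(actions)      # random exploration
    # Exploit: greedy action under the current Q estimates (default 0.0).
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))

# Illustrative usage: psi decays each step so reuse fades within an episode.
past_policy = {0: "right", 1: "up"}
q = {(0, "left"): 0.9, (0, "right"): 0.5}
actions = ["left", "right", "up"]
psi, decay = 1.0, 0.95
for step in range(3):
    a = pi_reuse_action(0, q, past_policy, psi, 0.1, actions)
    psi *= decay
```

With `psi = 1.0` the learner always follows the past policy; as `psi` decays, control shifts to the new task's own value estimates.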