Sciweavers

113 search results - page 9 / 23
» Model Approximation for HEXQ Hierarchical Reinforcement Lear...
Sort
View
ICML
2006
IEEE
14 years 8 months ago
Using inaccurate models in reinforcement learning
In the model-based policy search approach to reinforcement learning (RL), policies are found using a model (or "simulator") of the Markov decision process. However, for ...
Pieter Abbeel, Morgan Quigley, Andrew Y. Ng
PKDD
2010
Springer
179views Data Mining» more  PKDD 2010»
13 years 5 months ago
Gaussian Processes for Sample Efficient Reinforcement Learning with RMAX-Like Exploration
Abstract. We present an implementation of model-based online reinforcement learning (RL) for continuous domains with deterministic transitions that is specifically designed to achi...
Tobias Jung, Peter Stone
ECAI
2008
Springer
13 years 9 months ago
Exploiting locality of interactions using a policy-gradient approach in multiagent learning
In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...
Francisco S. Melo
ATAL
2008
Springer
13 years 9 months ago
Reinforcement learning for DEC-MDPs with changing action sets and partially ordered dependencies
Decentralized Markov decision processes are frequently used to model cooperative multi-agent systems. In this paper, we identify a subclass of general DEC-MDPs that features regul...
Thomas Gabel, Martin A. Riedmiller
ATAL
2008
Springer
13 years 9 months ago
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...