Sciweavers

56 search results - page 6 / 12
» Q-Learning in Continuous State and Action Spaces
Sort
View
ASE
2004
167views more  ASE 2004»
15 years 5 months ago
Cluster-Based Partial-Order Reduction
The verification of concurrent systems through an exhaustive traversal of the state space suffers from the infamous state-space-explosion problem, caused by the many interleavings ...
Twan Basten, Dragan Bosnacki, Marc Geilen
145
Voted
ICML
2003
IEEE
16 years 6 months ago
Hierarchical Policy Gradient Algorithms
Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...
Mohammad Ghavamzadeh, Sridhar Mahadevan
ICML
2006
IEEE
16 years 6 months ago
PAC model-free reinforcement learning
For a Markov Decision Process with finite state (size S) and action spaces (size A per state), we propose a new algorithm--Delayed Q-Learning. We prove it is PAC, achieving near o...
Alexander L. Strehl, Lihong Li, Eric Wiewiora, Joh...
IROS
2008
IEEE
144views Robotics» more  IROS 2008»
16 years 6 days ago
Learning nonparametric policies by imitation
— A long cherished goal in artificial intelligence has been the ability to endow a robot with the capacity to learn and generalize skills from watching a human teacher. Such an ...
David B. Grimes, Rajesh P. N. Rao
UAI
2000
15 years 7 months ago
PEGASUS: A policy search method for large MDPs and POMDPs
We propose a new approach to the problem of searching a space of policies for a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), given a mo...
Andrew Y. Ng, Michael I. Jordan