Sciweavers

340 search results - page 56 / 68
» Kernelized value function approximation for reinforcement le...
Sort
View
209
Voted
BDA
2007
15 years 3 months ago
Hyperplane Queries in a Feature-Space M-tree for Speeding up Active Learning
In content-based retrieval, relevance feedback (RF) is a noticeable method for reducing the “semantic gap” between the low-level features describing the content and the usually...
Michel Crucianu, Daniel Estevez, Vincent Oria, Jea...
116
Voted
ICML
1994
IEEE
15 years 6 months ago
A Modular Q-Learning Architecture for Manipulator Task Decomposition
Compositional Q-Learning (CQ-L) (Singh 1992) is a modular approach to learning to performcomposite tasks made up of several elemental tasks by reinforcement learning. Skills acqui...
Chen K. Tham, Richard W. Prager
AAAI
2011
14 years 2 months ago
Differential Eligibility Vectors for Advantage Updating and Gradient Methods
In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...
Francisco S. Melo
92
Voted
NIPS
2003
15 years 3 months ago
Approximate Policy Iteration with a Policy Language Bias
We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual...
Alan Fern, Sung Wook Yoon, Robert Givan
123
Voted
NIPS
1998
15 years 3 months ago
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...
Michael J. Kearns, Satinder P. Singh