Sciweavers

132 search results - page 17 / 27
» Generalization in Reinforcement Learning: Safely Approximati...
ICML
2002
IEEE
16 years 4 months ago
Hierarchically Optimal Average Reward Reinforcement Learning
Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...
Mohammad Ghavamzadeh, Sridhar Mahadevan
IWLCS
2005
Springer
15 years 9 months ago
Counter Example for Q-Bucket-Brigade Under Prediction Problem
Aiming to clarify the convergence or divergence conditions for Learning Classifier Systems (LCS), this paper explores: (1) an extreme condition where the reinforcement process of ...
Atsushi Wada, Keiki Takadama, Katsunori Shimohara
ICML
2010
IEEE
15 years 5 months ago
Finite-Sample Analysis of LSTD
In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...
Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...
NIPS
1997
15 years 5 months ago
Generalized Prioritized Sweeping
Prioritized sweeping is a model-based reinforcement learning method that attempts to focus an agent’s limited computational resources to achieve a good estimate of the value of ...
David Andre, Nir Friedman, Ronald Parr
ILP
2003
Springer
15 years 9 months ago
Graph Kernels and Gaussian Processes for Relational Reinforcement Learning
RRL is a relational reinforcement learning system based on Q-learning in relational state-action spaces. It aims to enable agents to learn how to act in an environment that has no ...
Thomas Gärtner, Kurt Driessens, Jan Ramon