Sciweavers

85 search results - page 10 / 17
» Approximate Policy Iteration with a Policy Language Bias
Sort
View
JMLR
2010
135views more  JMLR 2010»
13 years 2 months ago
Finite-sample Analysis of Bellman Residual Minimization
We consider the Bellman residual minimization approach for solving discounted Markov decision problems, where we assume that a generative model of the dynamics and rewards is avai...
Odalric-Ambrym Maillard, Rémi Munos, Alessa...
ICMLA
2008
13 years 8 months ago
Basis Function Construction in Reinforcement Learning Using Cascade-Correlation Learning Architecture
In reinforcement learning, it is a common practice to map the state(-action) space to a different one using basis functions. This transformation aims to represent the input data i...
Sertan Girgin, Philippe Preux
AIPS
2004
13 years 8 months ago
Learning Domain-Specific Control Knowledge from Random Walks
We describe and evaluate a system for learning domainspecific control knowledge. In particular, given a planning domain, the goal is to output a control policy that performs well ...
Alan Fern, Sung Wook Yoon, Robert Givan
CORR
2006
Springer
113views Education» more  CORR 2006»
13 years 7 months ago
A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD
This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(), LSTD()...
Manuel Loth, Philippe Preux
ICMLA
2010
13 years 5 months ago
Ensembles of Neural Networks for Robust Reinforcement Learning
Reinforcement learning algorithms that employ neural networks as function approximators have proven to be powerful tools for solving optimal control problems. However, their traini...
Alexander Hans, Steffen Udluft