Sciweavers

162 search results - page 18 / 33
» Off-Policy Temporal Difference Learning with Function Approx...
Sort
View
IJCAI
2007
15 years 3 months ago
Reinforcement Learning of Local Shape in the Game of Go
We explore an application to the game of Go of a reinforcement learning approach based on a linear evaluation function and large numbers of binary features. This strategy has prov...
David Silver, Richard S. Sutton, Martin Mülle...
ICMLA
2007
15 years 3 months ago
Control of a re-entrant line manufacturing model with a reinforcement learning approach
This paper presents the application of a reinforcement learning (RL) approach for the near-optimal control of a re-entrant line manufacturing (RLM) model. The RL approach utilizes...
José A. Ramírez-Hernández, Em...
ATAL
2009
Springer
15 years 9 months ago
An empirical analysis of value function-based and policy search reinforcement learning
In several agent-oriented scenarios in the real world, an autonomous agent that is situated in an unknown environment must learn through a process of trial and error to take actio...
Shivaram Kalyanakrishnan, Peter Stone
ICML
2007
IEEE
16 years 3 months ago
Bayesian actor-critic algorithms
We1 present a new actor-critic learning model in which a Bayesian class of non-parametric critics, using Gaussian process temporal difference learning is used. Such critics model ...
Mohammad Ghavamzadeh, Yaakov Engel
ML
2007
ACM
106views Machine Learning» more  ML 2007»
15 years 1 months ago
Surrogate maximization/minimization algorithms and extensions
Abstract Surrogate maximization (or minimization) (SM) algorithms are a family of algorithms that can be regarded as a generalization of expectation-maximization (EM) algorithms. A...
Zhihua Zhang, James T. Kwok, Dit-Yan Yeung