Sciweavers

162 search results - page 9 / 33
» Off-Policy Temporal Difference Learning with Function Approx...
NCA
2008
IEEE
15 years 2 months ago
Neurodynamic programming: a case study of the traveling salesman problem
The paper focuses on the study of solving the large-scale traveling salesman problem (TSP) based on neurodynamic programming. From this perspective, two methods, temporal differenc...
Jia Ma, Tao Yang, Zeng-Guang Hou, Min Tan, Derong ...
ECAI
2006
Springer
15 years 6 months ago
Least Squares SVM for Least Squares TD Learning
We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...
Tobias Jung, Daniel Polani
ATAL
2008
Springer
15 years 4 months ago
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
CVPR
2006
IEEE
16 years 4 months ago
The Function Space of an Activity
An activity consists of an actor performing a series of actions in a pre-defined temporal order. An action is an individual atomic unit of an activity. Different instances of the ...
Ashok Veeraraghavan, Amit K. Roy Chowdhury
Book
796 views
17 years 1 month ago
Introduction to Machine Learning
This is an introductory book about machine learning. Notice that this is a draft book. It may contain typos, mistakes, etc. The book covers the following topics: Boolean Functio...
Nils J. Nilsson