Sciweavers

567 search results - page 60 / 114
» Regularized Policy Iteration
MP
2011
12 years 10 months ago
An interior-point piecewise linear penalty method for nonlinear programming
We present an interior-point penalty method for nonlinear programming (NLP), where the merit function consists of a piecewise linear penalty function (PLPF) and an ℓ2-penalty functi...
Lifeng Chen, Donald Goldfarb
HPCN
1997
Springer
13 years 11 months ago
Parallel Solution of Irregular, Sparse Matrix Problems Using High Performance Fortran
For regular, sparse, linear systems, like those derived from regular grids, using High Performance Fortran (HPF) for iterative solvers is straightforward. However, for irregular ma...
Eric de Sturler, Damian Loher
ICML
1999
IEEE
14 years 8 months ago
Least-Squares Temporal Difference Learning
Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...
Justin A. Boyan
NIPS
1998
13 years 9 months ago
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...
Michael J. Kearns, Satinder P. Singh
SISAP
2008
IEEE
14 years 2 months ago
On Reinsertions in M-tree
In this paper we introduce a new M-tree building method, utilizing the classic idea of forced reinsertions. In case a leaf is about to split, some distant objects are removed from...
Jakub Lokoc, Tomás Skopal