Sciweavers

510 search results - page 13 / 102
» Gradient Estimation Revitalized
IJCAI
2001
13 years 11 months ago
Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning
Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...
Gregory Z. Grudic, Lyle H. Ungar
JMLR
2010
13 years 4 months ago
On the Convergence Properties of Contrastive Divergence
Contrastive Divergence (CD) is a popular method for estimating the parameters of Markov Random Fields (MRFs) by rapidly approximating an intractable term in the gradient of the lo...
Ilya Sutskever, Tijmen Tieleman
SIAMJO
2010
13 years 4 months ago
Solving Log-Determinant Optimization Problems by a Newton-CG Primal Proximal Point Algorithm
We propose a Newton-CG primal proximal point algorithm for solving large scale log-determinant optimization problems. Our algorithm employs the essential ideas of the proximal poi...
Chengjing Wang, Defeng Sun, Kim-Chuan Toh
UAI
2001
13 years 11 months ago
The Optimal Reward Baseline for Gradient-Based Reinforcement Learning
There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...
Lex Weaver, Nigel Tao
ISBI
2004
IEEE
14 years 10 months ago
A Fast Fully 4D Incremental Gradient Reconstruction Algorithm for List Mode PET Data
We present a fully four-dimensional, globally convergent, incremental gradient algorithm to estimate the continuous-time tracer density from list mode positron emission tomography...
Quanzheng Li, Evren Asma, Richard M. Leahy