Search Sciweavers | Sciweavers

» Gradient Estimation Revitalized

IJCAI
2001

Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning

Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...

Gregory Z. Grudic, Lyle H. Ungar

JMLR
2010

On the Convergence Properties of Contrastive Divergence

Contrastive Divergence (CD) is a popular method for estimating the parameters of Markov Random Fields (MRFs) by rapidly approximating an intractable term in the gradient of the lo...

Ilya Sutskever, Tijmen Tieleman

SIAMJO
2010

Solving Log-Determinant Optimization Problems by a Newton-CG Primal Proximal Point Algorithm

We propose a Newton-CG primal proximal point algorithm for solving large scale log-determinant optimization problems. Our algorithm employs the essential ideas of the proximal poi...

Chengjing Wang, Defeng Sun, Kim-Chuan Toh

UAI
2001

The Optimal Reward Baseline for Gradient-Based Reinforcement Learning

There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...

Lex Weaver, Nigel Tao

ISBI
2004
IEEE

A Fast Fully 4D Incremental Gradient Reconstruction Algorithm for List Mode PET Data

We present a fully four-dimensional, globally convergent, incremental gradient algorithm to estimate the continuous-time tracer density from list mode positron emission tomography...

Quanzheng Li, Evren Asma, Richard M. Leahy
