Sciweavers

311 search results - page 9 / 63
» Gradient Convergence in Gradient methods with Errors
IJCAI
2001
13 years 9 months ago
Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning
Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...
Gregory Z. Grudic, Lyle H. Ungar
MVA
2002
Computer Vision
13 years 7 months ago
Global Motion Estimation Based on the Constrained Spatio-temporal Gradient Method in Model-Based Coding
For global motion estimation in model-based coding, this paper proposes a constrained spatio-temporal gradient method using contour information. To overcome the local minimum prob...
Young Wook Sohn, Doo-Hyun Kim, Dong-O Kim, Rae-Hon...
ICMLA
2010
13 years 5 months ago
Multimodal Parameter-exploring Policy Gradients
Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...
Frank Sehnke, Alex Graves, Christian Osendorfer, J...
FOCM
2008
13 years 7 months ago
Online Gradient Descent Learning Algorithms
This paper considers the least-square online gradient descent algorithm in a reproducing kernel Hilbert space (RKHS) without explicit regularization. We present a novel capacity i...
Yiming Ying, Massimiliano Pontil
IROS
2006
IEEE
Robotics
14 years 1 months ago
Policy Gradient Methods for Robotics
The acquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-struc...
Jan Peters, Stefan Schaal