Search Sciweavers | Sciweavers


ESANN
2003

134 views, Neural Networks

Autonomous learning algorithm for fully connected recurrent networks

15 years 6 months ago

In this paper, fully connected RTRL neural networks are studied. To learn the dynamical behaviours of linear processes or to predict time series, an autonomous learning algori...

Edouard Leclercq, Fabrice Druaux, Dimitri Lefebvre


ECAI
2008
Springer

124 views, Artificial Intelligence

Exploiting locality of interactions using a policy-gradient approach in multiagent learning

15 years 7 months ago

Download gaips.inesc-id.pt

In this paper, we propose a policy gradient reinforcement learning algorithm to address transition-independent Dec-POMDPs. This approach aims at implicitly exploiting the locality...

Francisco S. Melo


NIPS
2001

144 views, Information Technology

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

15 years 6 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...


JMLR
2006

143 views

Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation

15 years 5 months ago

Download www.aaai.org

We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...

Rémi Munos


EWRL
2008

148 views, Machine Learning

Policy Learning - A Unified Perspective with Applications in Robotics

15 years 7 months ago

Download www.kyb.tuebingen.mpg.de

Policy learning approaches are among the best-suited methods for high-dimensional, continuous control systems such as anthropomorphic robot arms and humanoid robots. In this paper,...

Jan Peters, Jens Kober, Duy Nguyen-Tuong
