Search Sciweavers | Sciweavers

827 search results - page 86 / 166

» Variational methods for Reinforcement Learning

ICML
2008
IEEE

117 views, Machine Learning

Sample-based learning and search with permanent and transient memories

16 years 8 months ago

Download www.cs.ualberta.ca

We present a reinforcement learning architecture, Dyna-2, that encompasses both sample-based learning and sample-based search, and that generalises across states during both learni...

David Silver, Martin Müller, Richard S. ...

ECML
2007
Springer

192 views, Machine Learning

Policy Gradient Critics

16 years 1 month ago

Download www.idsia.ch

We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...

Daan Wierstra, Jürgen Schmidhuber

IROS
2007
IEEE

164 views, Robotics

Emulation and behavior understanding through shared values

16 years 1 month ago

Download www.er.ams.eng.osaka-u.ac.jp

Neurophysiology has revealed the existence of mirror neurons in the brain of macaque monkeys, and they show similar activity during execution and observation of goal-directed move...

Yasutake Takahashi, Teruyasu Kawamata, Minoru Asad...

NIPS
2001

206 views, Information Technology

Model-Free Least-Squares Policy Iteration

15 years 8 months ago

Download www.cs.duke.edu

We propose a new approach to reinforcement learning which combines least-squares function approximation with policy iteration. Our method is model-free and completely off-policy. ...

Michail G. Lagoudakis, Ronald Parr

CVPR
2007
IEEE

179 views, Computer Vision

Discriminant Additive Tangent Spaces for Object Recognition

16 years 9 months ago

Download www.au.tsinghua.edu.cn

Pattern variation is a major factor that affects the performance of recognition systems. In this paper, a novel manifold tangent modeling method called Discriminant Additive Tange...

Liang Xiong, Jianguo Li, Changshui Zhang
