Search Sciweavers | Sciweavers

431 search results - page 13 / 87

» Learning to use episodic memory

ICANN
2009
Springer

113 views, Neural Networks

Evolving Memory Cell Structures for Sequence Learning

16 years 18 days ago

Download julian.togelius.com

The best recent supervised sequence learning methods use gradient descent to train networks of miniature nets called memory cells. The most popular cell structure seems somewhat ar...

Justin Bayer, Daan Wierstra, Julian Togelius, Jü...

NIPS
2008

110 views, Information Technology

Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms

15 years 7 months ago

Download groups.csail.mit.edu

Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...

John W. Roberts, Russ Tedrake

GECCO
2005
Springer

150 views, Optimization

Population-based incremental learning with memory scheme for changing environments

15 years 11 months ago

Download www.cs.bham.ac.uk

In recent years there has been a growing interest in studying evolutionary algorithms for dynamic optimization problems due to its importance in real world applications. Several a...

Shengxiang Yang

TEC
2008

165 views

Population-Based Incremental Learning With Associative Memory for Dynamic Environments

15 years 6 months ago

Download www.cs.bham.ac.uk

In recent years, interest in studying evolutionary algorithms (EAs) for dynamic optimization problems (DOPs) has grown due to its importance in real-world applications. Several app...

Shengxiang Yang, Xin Yao

ICML
2000
IEEE

153 views, Machine Learning

Eligibility Traces for Off-Policy Policy Evaluation

16 years 6 months ago

Download www.cs.ualberta.ca

Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...

Doina Precup, Richard S. Sutton, Satinder P. Singh
