Search Sciweavers | Sciweavers

A new training algorithm is presented for delayed reinforcement learning problems that does not assume the existence of a critic model and employs the polytope optimization algorit...

Aristidis Likas, Isaac E. Lagaris


AAAI
2010


Learning Methods to Generate Good Plans: Integrating HTN Learning and Reinforcement Learning

15 years 8 months ago

Download www.cse.lehigh.edu

Chad Hogg, Ugur Kuter, Hector Muñoz-Avila


ICML
2001
IEEE


Off-Policy Temporal Difference Learning with Function Approximation

16 years 7 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta

