Search Sciweavers | Sciweavers

340 search results - page 25 / 68

» Kernelized value function approximation for reinforcement le...

ICML
2003
IEEE

Machine Learning (146 views)

TD(0) Converges Provably Faster than the Residual Gradient Algorithm

16 years 3 months ago

Download www.hpl.hp.com

In Reinforcement Learning (RL) there has been some experimental evidence that the residual gradient algorithm converges slower than the TD(0) algorithm. In this paper, we use the ...

Ralf Schoknecht, Artur Merke

NIPS
2004

Information Technology (137 views)

Brain Inspired Reinforcement Learning

15 years 3 months ago

Download books.nips.cc

Successful application of reinforcement learning algorithms often involves considerable hand-crafting of the necessary non-linear features to reduce the complexity of the value fu...

François Rivest, Yoshua Bengio, John Kalask...

IROS
2006
IEEE

Robotics (190 views)

Q-RAN: A Constructive Reinforcement Learning Approach for Robot Behavior Learning

15 years 8 months ago

Download www.aass.oru.se

This paper presents a learning system that uses Q-learning with a resource allocating network (RAN) for behavior learning in mobile robotics. The RAN is used as a functi...

Jun Li, Achim J. Lilienthal, Tomás Martí...

NIPS
1998

Information Technology (137 views)

Risk Sensitive Reinforcement Learning

15 years 3 months ago

Download www.cs.cmu.edu

In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are those states entering which is undesirable or dangerous. We define the risk with re...

Ralph Neuneier, Oliver Mihatsch

ICML
2010
IEEE

Machine Learning (231 views)

Toward Off-Policy Learning Control with Function Approximation

15 years 3 months ago

Download www.sztaki.hu

We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...

Hamid Reza Maei, Csaba Szepesvári, Shalabh ...
