Sciweavers

ESANN
2007
Replacing eligibility trace for action-value learning with function approximation
The eligibility trace is one of the most used mechanisms to speed up reinforcement learning. Earlier reported experiments seem to indicate that replacing eligibility traces would p...
Kary Främling
ATAL
2009
Springer
SarsaLandmark: an algorithm for learning in POMDPs with landmarks
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Michael R. James, Satinder P. Singh
ISDA
2009
IEEE
Postponed Updates for Temporal-Difference Reinforcement Learning
This paper presents postponed updates, a new strategy for TD methods that can improve sample efficiency without incurring the computational and space requirements of model-based ...
Harm van Seijen, Shimon Whiteson