GECCO
2006
Springer

Comparing evolutionary and temporal difference methods in a reinforcement learning domain

Both genetic algorithms (GAs) and temporal difference (TD) methods have proven effective at solving reinforcement learning (RL) problems. However, since few rigorous empirical comparisons have been conducted, there are no general guidelines describing the methods' relative strengths and weaknesses. This paper presents the results of a detailed empirical comparison between a GA and a TD method in Keepaway, a standard RL benchmark domain based on robot soccer. In particular, we compare the performance of NEAT [19], a GA that evolves neural networks, with Sarsa [16, 17], a popular TD method. The results demonstrate that NEAT can learn better policies in this task, though it requires more evaluations to do so. Additional experiments in two variations of Keepaway demonstrate that Sarsa learns better policies when the task is fully observable and NEAT learns faster when the task is deterministic. Together, these results help isolate the factors critical to the performance of each metho...
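For readers unfamiliar with Sarsa, the on-policy TD update it relies on can be sketched as below. This is a minimal tabular illustration only, not the setup used in the paper: the function names, the dictionary-based Q-table, and the default values of alpha and gamma are assumptions made for the example.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one.

    `q` maps (state, action) pairs to estimated returns (hypothetical layout).
    """
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, reward, s_next, a_next,
                 alpha=0.1, gamma=1.0, terminal=False):
    """Apply one Sarsa step: Q(s,a) += alpha * (target - Q(s,a)).

    The target uses the action actually chosen in the next state (a_next),
    which is what makes the method on-policy. alpha and gamma defaults are
    illustrative assumptions, not values from the paper.
    """
    target = reward if terminal else reward + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])
    return q[(s, a)]


# Usage: a Q-table that defaults to 0.0 for unseen (state, action) pairs.
q_table = defaultdict(float)
sarsa_update(q_table, "s0", "hold", 1.0, "s1", "pass", alpha=0.5)
```

The update bootstraps from the value of the next state-action pair rather than waiting for the end of an episode, which is the property that distinguishes TD methods from the episode-level fitness evaluations used by a GA.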
Matthew E. Taylor, Shimon Whiteson, Peter Stone
Added 23 Aug 2010
Updated 23 Aug 2010
Type Conference
Year 2006
Where GECCO