Comparing evolutionary and temporal difference methods in a reinforcement learning domain


Both genetic algorithms (GAs) and temporal difference (TD) methods have proven effective at solving reinforcement learning (RL) problems. However, since few rigorous empirical comparisons have been conducted, there are no general guidelines describing the methods' relative strengths and weaknesses. This paper presents the results of a detailed empirical comparison between a GA and a TD method in Keepaway, a standard RL benchmark domain based on robot soccer. In particular, we compare the performance of NEAT [19], a GA that evolves neural networks, with Sarsa [16, 17], a popular TD method. The results demonstrate that NEAT can learn better policies in this task, though it requires more evaluations to do so. Additional experiments in two variations of Keepaway demonstrate that Sarsa learns better policies when the task is fully observable and NEAT learns faster when the task is deterministic. Together, these results help isolate the factors critical to the performance of each metho...

Matthew E. Taylor, Shimon Whiteson, Peter Stone

Empirical Comparison | GECCO 2006 | Genetic Algorithms | Optimization | TD Method

Post Info

Added	23 Aug 2010
Updated	23 Aug 2010
Type	Conference
Year	2006
Where	GECCO
Authors	Matthew E. Taylor, Shimon Whiteson, Peter Stone
