ACL
2008

Evaluating Roget's Thesauri

Roget's Thesaurus has gone through many revisions since it was first published 150 years ago. But how do these revisions affect Roget's usefulness for NLP? We examine the differences in content between the 1911 and 1987 versions of Roget's, and we test both versions against each other and against WordNet on problems such as synonym identification and word relatedness. We also present a novel method for measuring sentence relatedness that can be implemented in either version of Roget's or in WordNet. Although the 1987 version of the Thesaurus is better, we show that the 1911 version performs surprisingly well and that the differences between the versions of Roget's and WordNet are often not statistically significant. We hope that this work will encourage others to use the 1911 Roget's Thesaurus in NLP tasks.
Alistair Kennedy, Stan Szpakowicz
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where ACL
Authors Alistair Kennedy, Stan Szpakowicz