Evaluating Roget's Thesauri

Roget's Thesaurus has gone through many revisions since it was first published 150 years ago. But how do these revisions affect Roget's usefulness for NLP? We examine the differences in content between the 1911 and 1987 versions of Roget's, and we test both versions against each other and against WordNet on problems such as synonym identification and word relatedness. We also present a novel method for measuring sentence relatedness that can be implemented in either version of Roget's or in WordNet. Although the 1987 version of the Thesaurus is better, we show that the 1911 version performs surprisingly well and that the differences between the versions of Roget's and WordNet are often not statistically significant. We hope that this work will encourage others to use the 1911 Roget's Thesaurus in NLP tasks.

Alistair Kennedy, Stan Szpakowicz

ACL 2008 | Computational Linguistics | Revisions Affect Roget | Roget's | Roget's Thesaurus |

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	ACL
Authors	Alistair Kennedy, Stan Szpakowicz
