Roget's thesaurus and semantic similarity



Roget’s Thesaurus has not been sufficiently appreciated in Natural Language Processing. We show that Roget’s and WordNet are birds of a feather. In a few typical tests, we compare how the two resources help measure semantic similarity. One of the benchmarks is Miller and Charles’ list of 30 noun pairs to which human judges had assigned similarity measures. We correlate these measures with those computed by several NLP systems. The 30 pairs can be traced back to Rubenstein and Goodenough’s 65 pairs, which we have also studied. Our Roget’s-based system gets correlations of .878 for the smaller and .818 for the larger list of noun pairs; this is quite close to the .885 that Resnik obtained when he employed humans to replicate the Miller and Charles experiment. We further evaluate our measure by using Roget’s and WordNet to answer 80 TOEFL, 50 ESL and 300 Reader’s Digest questions: the correct synonym must be selected from a group of four words. Our system gets 78.75%, ...
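The two evaluation styles described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the helper names (`pearson`, `choose_synonym`, `accuracy`, `toy_similarity`) and every score in `TOY_SCORES` are invented for the example, and the choice of the Pearson coefficient is an assumption, since the abstract only says the measures are correlated. A real experiment would plug in a Roget's- or WordNet-based similarity function and the actual benchmark data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def choose_synonym(target, candidates, similarity):
    """Pick the candidate that the similarity function scores highest."""
    return max(candidates, key=lambda c: similarity(target, c))


def accuracy(questions, similarity):
    """Fraction of (target, candidates, answer) questions answered correctly."""
    correct = sum(
        1
        for target, candidates, answer in questions
        if choose_synonym(target, candidates, similarity) == answer
    )
    return correct / len(questions)


# Invented scores, for illustration only (not taken from the paper).
TOY_SCORES = {
    frozenset(("car", "automobile")): 0.9,
    frozenset(("car", "journey")): 0.4,
    frozenset(("car", "glass")): 0.2,
    frozenset(("car", "forest")): 0.1,
}


def toy_similarity(a, b):
    """Hypothetical stand-in for a thesaurus-based similarity measure."""
    return TOY_SCORES.get(frozenset((a, b)), 0.0)


if __name__ == "__main__":
    # Word-pair benchmark: correlate human ratings with system scores.
    human = [3.9, 1.5, 0.6, 0.4]      # made-up human ratings
    system = [0.9, 0.4, 0.2, 0.1]     # made-up system scores
    print("correlation:", pearson(human, system))

    # Synonym questions: one target word, four candidates, one correct answer.
    questions = [("car", ["journey", "automobile", "forest", "glass"], "automobile")]
    print("accuracy:", accuracy(questions, toy_similarity))
```

Passing the similarity measure in as a function keeps the evaluation code independent of the lexical resource, so the same harness could compare a Roget's-based measure against a WordNet-based one.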

Mario Jarmasz, Stan Szpakowicz


Natural Language Processing | Noun Pairs | RANLP 2003 | Similarity Measures |


Post Info

Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	2003
Where	RANLP
Authors	Mario Jarmasz, Stan Szpakowicz
