HyperLex: lexical cartography for information retrieval

This article describes an algorithm called HyperLex that is capable of automatically determining word uses in a textbase without recourse to a dictionary. The algorithm makes use of the specific properties of word cooccurrence graphs, which are shown as having "small world" properties. Unlike earlier dictionary-free methods based on word vectors, it can isolate highly infrequent uses (as rare as 1% of all occurrences) by detecting "hubs" and high-density components in the cooccurrence graphs. The algorithm is applied here to information retrieval on the Web, using a set of highly ambiguous test words. An evaluation of the algorithm showed that it only omitted a very small number of relevant uses. In addition, HyperLex offers automatic tagging of word uses in context with excellent precision (97%, compared to 73% for baseline tagging, with an 82% recall rate). Remarkably good precision (96%) was also achieved on a selection of the 25 most relevant pages for each use...
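The idea of finding "hubs" in a word cooccurrence graph can be illustrated with a short sketch. The abstract does not spell out the procedure, so what follows is a simplified greedy heuristic written for illustration only, not the HyperLex algorithm itself. The function names (`build_cooccurrence_graph`, `find_hubs`), the `min_strength` threshold, and the toy contexts are all assumptions. Each context is taken to be the list of words surrounding one occurrence of an ambiguous target word (here, hypothetically, "bar"), with the target itself already removed.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(contexts):
    """Return {word: {neighbor: count}} from a list of tokenized contexts."""
    graph = defaultdict(lambda: defaultdict(int))
    for tokens in contexts:
        # Count each unordered pair of distinct words once per context.
        for a, b in combinations(sorted(set(tokens)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def find_hubs(graph, min_strength=2):
    """Greedy hub selection (illustrative assumption, not the paper's method).

    Words are visited by decreasing total cooccurrence count. The strongest
    remaining word becomes a hub, and its neighbors are discarded so that
    the next hub comes from a different dense region of the graph.
    """
    strength = {w: sum(graph[w].values()) for w in graph}
    remaining = set(graph)
    hubs = []
    # Ties are broken alphabetically to keep the result deterministic.
    for word in sorted(graph, key=lambda w: (-strength[w], w)):
        if word not in remaining:
            continue
        if strength[word] < min_strength:
            break
        hubs.append(word)
        remaining.discard(word)
        remaining -= set(graph[word])
    return hubs


# Toy contexts for a hypothetical ambiguous word such as "bar".
contexts = [
    ["drink", "beer", "pub"],
    ["drink", "beer", "wine"],
    ["drink", "pub", "wine"],
    ["steel", "metal", "iron"],
    ["steel", "metal", "rod"],
]
hubs = find_hubs(build_cooccurrence_graph(contexts))
print(hubs)  # ['drink', 'metal']
```

In this toy data each hub stands for one use of the target word: one tied to drinking, one tied to metal. A real textbase would need stopword filtering and frequency thresholds, which this sketch omits.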

Jean Véronis

Algorithm | Automated Reasoning | CSL 2004 | Earlier Dictionary-free Methods | Word Cooccurrence Graphs

Post Info

Added	17 Dec 2010
Updated	17 Dec 2010
Type	Journal
Year	2004
Where	CSL
Authors	Jean Véronis