SIGIR
2004
ACM

An effective approach to document retrieval via utilizing WordNet and recognizing phrases

Noun phrases in queries are identified and classified into four types: proper names, dictionary phrases, simple phrases, and complex phrases. A document is considered to contain a phrase if all content words of the phrase occur within a window of a certain size. The window size differs by phrase type and is determined using a decision tree. Because phrases are more important than individual terms, documents retrieved for a query are ranked with matching phrases given higher priority. We utilize WordNet to disambiguate the senses of query terms. Whenever the sense of a query term is determined, its synonyms, hyponyms, words from its definition, and its compound words are considered for possible addition to the query. Experimental results show that our approach yields improvements of between 23% and 31% over the best-known results on the TREC 9, 10 and 12 collections for short (title-only) queries, without using Web data.
Categories and Subject Descriptors H.3.3 [Information Stor...
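The window-based phrase test and the phrase-first ranking described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `contains_phrase` and `rank`, the example window sizes, and the use of a `(phrase matches, term score)` tuple as the sort key are all assumptions made for illustration. In the paper the window sizes come from a decision tree, which is not reproduced here.

```python
from collections import Counter


def contains_phrase(doc_tokens, phrase_words, window_size):
    """Return True if every word in phrase_words occurs within some span of
    at most window_size consecutive tokens of doc_tokens (hypothetical helper)."""
    needed = set(phrase_words)
    if not needed:
        return False
    counts = Counter()   # occurrences of each needed word inside the window
    covered = 0          # number of distinct needed words inside the window
    for right, token in enumerate(doc_tokens):
        if token in needed:
            counts[token] += 1
            if counts[token] == 1:
                covered += 1
        # Drop the token that just slid out of the window, if any.
        left = right - window_size
        if left >= 0:
            old = doc_tokens[left]
            if old in needed:
                counts[old] -= 1
                if counts[old] == 0:
                    covered -= 1
        if covered == len(needed):
            return True
    return False


def rank(docs, phrases, term_scores):
    """Order document ids so that phrase matches take priority over term scores.

    docs: dict of doc id -> list of tokens
    phrases: list of (content_words, window_size) pairs; sizes are assumed values
    term_scores: dict of doc id -> term-level similarity (assumed precomputed)
    """
    def key(doc_id):
        phrase_hits = sum(
            contains_phrase(docs[doc_id], words, size) for words, size in phrases
        )
        # Tuple comparison: term score only breaks ties in phrase matches.
        return (phrase_hits, term_scores[doc_id])

    return sorted(docs, key=key, reverse=True)


docs = {
    "d1": "stock exchange news".split(),
    "d2": "stock market and currency exchange".split(),
}
phrases = [(["stock", "exchange"], 2)]
term_scores = {"d1": 0.2, "d2": 0.9}
ranking = rank(docs, phrases, term_scores)
```

In this toy example `d1` is ranked ahead of `d2` even though its term score is lower, because only `d1` has both phrase words inside the two-token window.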
Shuang Liu, Fang Liu, Clement T. Yu, Weiyi Meng
Added 30 Jun 2010
Updated 30 Jun 2010
Type Conference
Year 2004
Where SIGIR
Authors Shuang Liu, Fang Liu, Clement T. Yu, Weiyi Meng