AAAI
2008

Concept-Based Feature Generation and Selection for Information Retrieval

Traditional information retrieval systems use query words to identify relevant documents. In difficult retrieval tasks, however, one needs access to a wealth of background knowledge. We present a method that uses Wikipedia-based feature generation to improve retrieval performance. Intuitively, we expect that using extensive world knowledge is likely to improve recall but may adversely affect precision. High-quality feature selection is necessary to maintain high precision, but here we do not have the labeled training data for evaluating features that is available in supervised learning. We present a new feature selection method that is inspired by pseudo-relevance feedback. We use the top-ranked and bottom-ranked documents retrieved by the bag-of-words method as representative sets of relevant and non-relevant documents. The generated features are then evaluated and filtered on the basis of these sets. Experiments on TREC data confirm the superior performance of our method compared to the p...
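
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `select_features`, the parameters `k` and `min_score`, and the scoring rule (the fraction of top-ranked documents containing a feature minus the fraction of bottom-ranked documents containing it) are all assumptions made for illustration; the abstract does not specify the actual evaluation measure.

```python
from typing import Dict, List, Set


def select_features(
    ranked_doc_ids: List[str],
    doc_features: Dict[str, Set[str]],
    k: int = 2,
    min_score: float = 0.5,
) -> Dict[str, float]:
    """Filter generated features using pseudo-relevance sets.

    ranked_doc_ids: bag-of-words ranking, best match first.
    doc_features:   document id -> set of generated concept features.
    k:              size of the pseudo-relevant and pseudo-non-relevant sets
                    (hypothetical parameter).
    min_score:      minimum score a feature needs to be kept
                    (hypothetical parameter).
    """
    if k < 1 or len(ranked_doc_ids) < 2 * k:
        raise ValueError("need at least 2*k ranked documents and k >= 1")

    top = ranked_doc_ids[:k]      # treated as relevant
    bottom = ranked_doc_ids[-k:]  # treated as non-relevant

    # Only features that occur in some top-ranked document are candidates.
    candidates: Set[str] = set()
    for doc_id in top:
        candidates |= doc_features[doc_id]

    selected: Dict[str, float] = {}
    for feature in candidates:
        pos = sum(feature in doc_features[d] for d in top) / len(top)
        neg = sum(feature in doc_features[d] for d in bottom) / len(bottom)
        score = pos - neg  # assumed scoring rule, not taken from the paper
        if score >= min_score:
            selected[feature] = score
    return selected


# Toy data: six documents already ranked by a bag-of-words retriever.
ranked = ["d1", "d2", "d3", "d4", "d5", "d6"]
features = {
    "d1": {"jaguar_car", "engine"},
    "d2": {"jaguar_car", "engine", "wheel"},
    "d3": {"wheel"},
    "d4": {"engine"},
    "d5": {"jaguar_cat", "engine"},
    "d6": {"jaguar_cat", "wheel"},
}
kept = select_features(ranked, features, k=2, min_score=0.6)
print(kept)  # {'jaguar_car': 1.0}
```

In this toy run, `engine` appears in both top-ranked documents but also in one bottom-ranked document, so it scores 0.5 and is filtered out, while `jaguar_car` appears only in the top-ranked set and is kept.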
Ofer Egozi, Evgeniy Gabrilovich, Shaul Markovitch
Added 02 Oct 2010
Updated 02 Oct 2010
Type Conference
Year 2008
Where AAAI
Authors Ofer Egozi, Evgeniy Gabrilovich, Shaul Markovitch