Sciweavers

WEBI
2009
Springer

Full-Subtopic Retrieval with Keyphrase-Based Search Results Clustering

We consider the problem of retrieving multiple documents relevant to each individual subtopic of a given web query, termed “full-subtopic retrieval”. To solve this problem, we present a novel search results clustering algorithm that generates clusters labeled by keyphrases. The keyphrases are extracted from the generalized suffix tree built from the search results and merged through an improved hierarchical agglomerative clustering procedure. We also introduce a novel measure for evaluating full-subtopic retrieval performance, namely “Subtopic Search Length under k document sufficiency”. Using a test collection specifically designed for evaluating subtopic retrieval, we found that our algorithm outperformed both other existing search results clustering algorithms and a search results re-ranking method that emphasized diversity of results (at least for k>1, i.e., when we are interested in retrieving more than one relevant document per subtopic). Our approach has been impl...
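
The abstract names the evaluation measure but does not define it. As a rough illustration only, the sketch below computes one plausible reading of "Subtopic Search Length under k document sufficiency" for a flat ranked list: for each subtopic, count how many results must be read before k relevant documents have been seen, then average over subtopics. The function names, the flat-list setting (the paper evaluates clustered results), and the penalty for subtopics that never reach k documents are all assumptions made for this example, not the authors' definition.

```python
from typing import Dict, List, Optional, Set


def search_length_k(ranking: List[str], relevant: Set[str], k: int) -> Optional[int]:
    """Return how many ranked results are read until k relevant ones are found.

    Returns None when the ranking contains fewer than k relevant documents.
    """
    found = 0
    for position, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            found += 1
            if found == k:
                return position
    return None


def mean_subtopic_search_length(
    ranking: List[str], subtopics: Dict[str, Set[str]], k: int
) -> float:
    """Average the per-subtopic search length over all subtopics of a query."""
    if not subtopics:
        raise ValueError("at least one subtopic is required")
    lengths = []
    for relevant in subtopics.values():
        length = search_length_k(ranking, relevant, k)
        # Assumption (hypothetical penalty): a subtopic that never reaches
        # k relevant documents costs the whole list plus one.
        lengths.append(length if length is not None else len(ranking) + 1)
    return sum(lengths) / len(lengths)


# Toy data: five results, two subtopics with two relevant documents each.
ranking = ["d1", "d2", "d3", "d4", "d5"]
subtopics = {"a": {"d1", "d4"}, "b": {"d2", "d3"}}

print(mean_subtopic_search_length(ranking, subtopics, k=1))
print(mean_subtopic_search_length(ranking, subtopics, k=2))
```

Lower values mean the user reads fewer results before every subtopic is covered k times. In this toy example the value grows as k increases, which is the setting (k>1) in which the abstract reports the clustering approach doing better than diversity-oriented re-ranking.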
Added 25 May 2010
Updated 25 May 2010
Type Conference
Year 2009
Where WEBI
Authors Andrea Bernardini, Claudio Carpineto, Massimiliano D'Amico