Sciweavers

SIGIR
2010
ACM

Where to start filtering redundancy?: a cluster-based approach

Novelty detection is a difficult task, particularly at the sentence level. Most approaches proposed in the past re-order all sentences according to their novelty scores. However, this re-ordering usually has little value. In fact, a naive baseline with no novelty detection capabilities often yields better performance than any state-of-the-art novelty detection mechanism. We argue here that this is because current methods initiate the novelty detection process too early. When few sentences have been seen, it is unlikely that the user is negatively affected by redundancy. Therefore, re-ordering the first sentences may harm performance. We propose here a query-dependent method based on cluster analysis to determine where to start filtering redundancy.
Categories and Subject Descriptors: H.3.3 [Information Search and Retrieval]: Information Filtering, Clustering, Retrieval Models
General Terms: Experimentation
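The idea of delaying redundancy filtering can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not describe how the cluster analysis chooses the starting point, so the cut-off `start` is simply passed in as a parameter here. The Jaccard-based `novelty` score and the function names are likewise assumptions made for illustration.

```python
import re


def tokens(sentence):
    """Lowercased word set of a sentence (hypothetical helper)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def novelty(sentence, seen):
    """Assumed novelty score: 1 minus the highest Jaccard overlap
    between `sentence` and any previously seen sentence."""
    words = tokens(sentence)
    best = 0.0
    for other in seen:
        other_words = tokens(other)
        union = words | other_words
        if union:
            best = max(best, len(words & other_words) / len(union))
    return 1.0 - best


def delayed_novelty_rerank(ranked, start):
    """Keep the first `start` sentences in their original (relevance)
    order and re-order only the remainder by descending novelty.

    `start` stands in for the query-dependent starting point that the
    paper derives from cluster analysis; here it is just an argument.
    """
    head, tail = list(ranked[:start]), list(ranked[start:])
    seen = list(head)
    scored = []
    for sentence in tail:
        # Novelty is measured against everything ranked before it.
        scored.append((novelty(sentence, seen), sentence))
        seen.append(sentence)
    # Python's sort is stable, so ties keep their relevance order.
    scored.sort(key=lambda pair: -pair[0])
    return head + [sentence for _, sentence in scored]
```

With `start=0` the whole ranking is re-ordered by novelty, which corresponds to the early filtering the abstract argues against; a larger `start` leaves the top of the relevance ranking untouched.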
Added 16 Aug 2010
Updated 16 Aug 2010
Type Conference
Year 2010
Where SIGIR
Authors Ronald T. Fernández, Javier Parapar, David E. Losada, Alvaro Barreiro