Sciweavers

2010
15 years 1 month ago
Reshaping automatic speech transcripts for robust high-level spoken document analysis
High-level spoken document analysis is required in many applications seeking access to the semantic content of audio data, such as information retrieval, machine translation or au...
Julien Fayolle, Fabienne Moreau, Christian Raymond...
SIGIR
2004
ACM
15 years 9 months ago
Length normalization in XML retrieval
XML retrieval is a departure from standard document retrieval in which each individual XML element, ranging from italicized words or phrases to full blown articles, is a potential...
Jaap Kamps, Maarten de Rijke, Börkur Sigurbj...
SIGIR
2008
ACM
15 years 3 months ago
SpotSigs: robust and efficient near duplicate detection in large web collections
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
CIKM
2010
Springer
15 years 2 months ago
Learning to rank relevant and novel documents through user feedback
We consider the problem of learning to rank relevant and novel documents so as to directly maximize a performance metric called Expected Global Utility (EGU), which has several de...
Abhimanyu Lad, Yiming Yang
ICIP
2002
IEEE
16 years 5 months ago
Fuzzy color signatures
With the large and increasing amount of visual information available in digital libraries and the Web, efficient and robust systems for image retrieval are urgently needed. In thi...
Andrés Dorado, Ebroul Izquierdo