Currently in document retrieval there are many algorithms each with different strengths and weakness. There is some difficulty, however, in evaluating the impact of the test quer...
Automatically generated HTML, as produced by WYSIWYG programs, typically contains much repetitive and unnecessary markup. This paper identifies aspects of such HTML that may be al...
The rapid growth of the web has been noted and tracked extensively. Recent studies have however documented the dual phenomenon: web pages have small half lives, and thus the web e...
Ziv Bar-Yossef, Andrei Z. Broder, Ravi Kumar, Andr...
In this paper, we describe UNED’s participation in the iCLEF 2005 track. We have compared two strategies for finding an answer using an interactive question answering system: i...
The TREC 2004 Terabyte Track evaluated information retrieval in largescale text collections, using a set of 25 million documents (426 GB). This paper gives an overview of our expe...