In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
We present a static index pruning method, to be used in ad-hoc document retrieval tasks, that follows a document-centric approach to decide whether a posting for a given term shoul...
This paper proposes new extensions of the digital book concept together with the required approaches to support their automatic generation. Most best-sellers have often inspired o...
Many have speculated that classifying web pages can improve a search engine's ranking of results. Intuitively, results should be more relevant when they match the class of a q...
Paul N. Bennett, Krysta Marie Svore, Susan T. Duma...
The paper first deals with the much-discussed issue of defining the term informatics. Then it explores how the term information science is understood by the students of the Facult...
Samuel Driessen, Willem-Olaf Huijsen, Marjan Groot...