Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
As the number and size of large timestamped collections (e.g. sequences of digitized newspapers, periodicals, blogs) increase, the problem of efficiently indexing and searching su...
Theodoros Lappas, Benjamin Arai, Manolis Platakis,...
Versioned document collections are collections that contain multiple versions of each document. Important examples are Web archives, Wikipedia and other wikis, or source code and ...
We consider the following autocompletion search scenario: imagine a user of a search engine typing a query; then with every keystroke display those completions of the last query wo...
Holger Bast, Christian Worm Mortensen, Ingmar Webe...
Test collections are the primary drivers of progress in information retrieval. They provide a yardstick for assessing the effectiveness of ranking functions in an automatic, rapi...
Nima Asadi, Donald Metzler, Tamer Elsayed, Jimmy L...