Large archives of Ottoman documents are challenging to many historians all over the world. However, these archives remain inaccessible since manual transcription of such a huge vo...
We study how to best use crowdsourced relevance judgments learning to rank [1, 7]. We integrate two lines of prior work: unreliable crowd-based binary annotation for binary classi...
This paper describes how use the Java Swing HTMLEditorKit to perform multi-threaded web data mining on the EDGAR system (Electronic DataGathering, Analysis, and Retrieval system)....
We propose a new probabilistic approach to information retrieval based upon the ideas and methods of statistical machine translation. The central ingredient in this approach is a ...
This paper presents an approach to automatically optimize the retrieval quality of ranking functions. Taking a Swarm Intelligence perspective, we present a novel method, SwarmRank...
Ernesto Diaz-Aviles, Wolfgang Nejdl, Lars Schmidt-...