The simple access to texts on digital libraries and the WWW has led to an increased number of plagiarism cases in recent years, which renders manual plagiarism detection infeasibl...
We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...
Background: The MEDLINE database contains over 12 million references to scientific literature, with about 3/4 of recent articles including an abstract of the publication. Retrieval of ent...
K-Means clustering is widely used in information retrieval and data mining. Distributed K-Means variants have already been proposed, but none of the past algorithms scales to large...
Odysseas Papapetrou, Wolf Siberski, Fabian Leitrit...
For this year's Image CLEF Photo Retrieval task, we have prepared 5 submission runs to help us assess the effectiveness of 1) image content-based retrieval, and 2) text-based ...