Many important application areas of text classifiers demand high precision, and it is common to compare prospective solutions against the performance of Naive Bayes. This baseline is us...
Near-duplicate detection is not only an important pre- and post-processing task in Information Retrieval but also an effective spam-detection technique. Among different approache...
This paper focuses on decentralized personalized search engines. It is composed of three parts. First, we formulate the problem and propose a graph-based measure of quality o...
An essential part of an expert-finding task, such as matching reviewers to submitted papers, is the ability to model the expertise of a person based on documents. We evaluate seve...
The project fragmentation problem in personal information management occurs when someone who is working on a single project stores and retrieves information items relating to that...