Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the databas...
This paper studies the effects of training data on binary text classification and postulates that negative training data is not needed and may even be harmful for the task. Tradit...
Abstract. In text classification (TC) and other tasks involving supervised learning, labelled data may be scarce or expensive to obtain; strategies are thus needed for maximizing t...
Bayesian text classifiers face a common issue which is referred to as data sparsity problem, especially when the size of training data is very small. The frequently used Laplacian...
In the blogosphere, the amount of digital content is expanding and for search engines, new challenges have been imposed. Due to the changing information need, automatic methods are...
Elisabeth Lex, Andreas Juffinger, Michael Granitze...