Spam filtering poses a special problem in text categorization: its defining characteristic is that filters face an active adversary that constantly attempts to evade fi...
Andrej Bratko, Gordon V. Cormack, Bogdan Filipic, ...
Multinomial distributions over words are frequently used to model topics in text collections. A common, major challenge in applying all such topic models to any text mining proble...
In many Web applications, such as blog classification and newsgroup classification, labeled data are in short supply. It often happens that obtaining labeled data in a new domain ...
Label ranking is the task of inferring a total order over a predefined set of labels for each given instance. We present a general framework for batch learning of label ranking f...
The classical (ad hoc) document retrieval problem has traditionally been approached by ranking according to heuristically developed functions (such as tf-idf or BM25) or gene...