The Rocchio relevance feedback algorithm is one of the most popular and widely applied learning methods from information retrieval. Here, a probabilistic analysis of this algorith...
In this paper, we address a relatively new and interesting text categorization problem: classify a political blog as either liberal or conservative, based on its political leaning...
Term-based representations of documents have found widespread use in information retrieval. However, one of the main shortcomings of such methods is that they largely disregard le...
In this paper we propose a multimedia categorization framework that is able to exploit information across different parts of a multimedia document (e.g., a Web page, a PDF, a Micr...
Abstract-- Text categorization is the task of assigning predefined categories to natural language text. With the widely used `bag of words' representation, previous researches...