The Rocchio relevance feedback algorithm is one of the most popular and widely applied learning methods from information retrieval. Here, a probabilistic analysis of this algorith...
Due to the globalization on the Web, many companies and institutions need to efficiently organize and search repositories containing multilingual documents. The management of the...
Abstract. A major characteristic of text document categorization problems is the extremely high dimensionality of text data. In this paper we explore the usability of the Oscillati...
The number of software systems is increasing at a rapid rate. For example, SourceForge currently has about sixty thousand software systems registered, twenty-two thousand of which...
Shinji Kawaguchi, Pankaj K. Garg, Makoto Matsushit...
We present a novel algorithm called Clicks, that finds clusters in categorical datasets based on a search for k-partite maximal cliques. Unlike previous methods, Clicks mines subs...
Mohammed Javeed Zaki, Markus Peters, Ira Assent, T...