The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...
: We propose a set of statistical metrics for making a comprehensive, fair, and insightful evaluation of features, clustering algorithms, and distance measures in representative sa...
— To enhance the generalization capacity of a distribution learning method, we propose to use a fuzzy Bayesian framework based on Bayes rules. The precision of the learning resul...
As information networks become ubiquitous, extracting knowledge from information networks has become an important task. Both ranking and clustering can provide overall views on in...
In this paper, we propose a fast, memory-efficient, and scalable clustering algorithm for analyzing transactional data. Our approach has three unique features. First, we use the c...