Efficient discovery of frequent patterns from large databases is an active research area in data mining with broad applications in industry and deep implications in many areas of d...
We develop the notion of normalized information distance (NID) [7] into a kernel distance suitable for use with a Support Vector Machine classifier, and demonstrate its use for an...
This paper presents a study on the combination of different classifiers for toxicity prediction. Two combination operators for the Multiple-Classifier System definition are also pr...
If Kolmogorov complexity [25] measures information in one object and Information Distance [4, 23, 24, 42] measures information shared by two objects, how do we measure information...
Dimensionality reduction plays an important role in many data mining applications involving high-dimensional data. Many existing dimensionality reduction techniques can be formula...
We consider the problem of learning incoherent sparse and low-rank patterns from multiple tasks. Our approach is based on a linear multi-task learning formulation, in which the spa...
This paper introduces mass estimation—a base modelling mechanism in data mining. It provides the theoretical basis of mass and an efficient method to estimate mass. We show that...
Kai Ming Ting, Guang-Tong Zhou, Fei Tony Liu, Jame...
Advances in data collection and storage capacity have made it increasingly possible to collect highly volatile graph data for analysis. Existing graph analysis techniques are not ...
Keith Henderson, Tina Eliassi-Rad, Christos Falout...
In this paper, we present a novel exploratory visual analytic system called TIARA (Text Insight via Automated Responsive Analytics), which combines text analytics and interactive ...