Traditional similarity and distance measures tend to lose their discriminative power as the dimensionality of a dataset increases, which degrades clustering performance. I...
We propose Discriminative Context Partitioning (DCP), a new method for partitioning an unlabeled dataset. It is motivated by the idea of splitting the dataset based only on how...
The primary constraint in the effective mining of data streams is the large volume of data that must be processed in real time. In many cases, it is desirable to store a summary...
In this work, a new algorithm is proposed for fast nonparametric estimation of multivariate kernel densities, based on principal direction divisive partitioning (PDDP) of the data s...
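
For readers unfamiliar with PDDP, the following minimal sketch shows a single split of the kind the name refers to: the data are centered, projected onto their leading principal direction, and divided by the sign of the projection. It is an illustration only and not the density estimation algorithm proposed in the abstract above. The function names `leading_direction` and `pddp_split`, the use of power iteration in place of a singular value decomposition, and the fixed iteration count are all assumptions made to keep the example dependency-free.

```python
import math


def leading_direction(centered, iterations=200):
    """Approximate the leading principal direction of centered data.

    Assumption: power iteration on the scatter matrix (centered^T centered)
    is used here instead of an SVD, purely to avoid external libraries.
    """
    n, dim = len(centered), len(centered[0])
    # Arbitrary deterministic start vector (hypothetical choice), normalised.
    v = [1.0 / (j + 1) for j in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        # proj = centered @ v
        proj = [sum(row[j] * v[j] for j in range(dim)) for row in centered]
        # w = centered^T @ proj, i.e. scatter matrix applied to v
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all points coincide; no direction to find
            break
        v = [x / norm for x in w]
    return v


def pddp_split(points):
    """Split point indices into two groups by the sign of their projection
    onto the leading principal direction of the centered data."""
    n, dim = len(points), len(points[0])
    centroid = [sum(p[j] for p in points) / n for j in range(dim)]
    centered = [[p[j] - centroid[j] for j in range(dim)] for p in points]
    v = leading_direction(centered)
    proj = [sum(row[j] * v[j] for j in range(dim)) for row in centered]
    left = [i for i, s in enumerate(proj) if s <= 0.0]
    right = [i for i, s in enumerate(proj) if s > 0.0]
    return left, right


# Two well-separated groups of four points each.
demo_points = [(0, 0), (0, 1), (1, 0), (1, 1),
               (10, 10), (10, 11), (11, 10), (11, 11)]
demo_left, demo_right = pddp_split(demo_points)
```

Applying `pddp_split` recursively to whichever resulting group is chosen for further division yields a binary tree of partitions. The rule for choosing that group, and the way such a partition would feed a kernel density estimate, are not specified in the text above and are left out of the sketch.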