Representative-based clustering algorithms are quite popular due to their relative high speed and because of their sound theoretical foundation. On the other hand, the clusters the...
Abstract. A method for measuring the density of data sets that contain an unknown number of clusters of unknown sizes is proposed. This method, called Pareto Density Estimation (PD...
Subspace clustering has attracted great attention due to its capability of finding salient patterns in high dimensional data. Order preserving subspace clusters have been proven to...
P2P systems represent a large portion of the Internet traffic which makes the data discovery of great importance to the user and the broad Internet community. Hence, the power of ...
High dimensional directional data is becoming increasingly important in contemporary applications such as analysis of text and gene-expression data. A natural model for multivaria...
Arindam Banerjee, Inderjit S. Dhillon, Joydeep Gho...