Sciweavers

SIGMOD
1998
ACM

Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications

Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records. We present CLIQUE, a clustering algorithm that satisfies each of these requirements. CLIQUE identifies dense clusters in subspaces of maximum dimensionality. It generates cluster descriptions in the form of DNF expressions that are minimized for ease of comprehension. It produces identical results irrespective of the order in which input records are presented and does not presume any specific mathematical form for data distribution. Through experiments, we show that CLIQUE efficiently finds accurate clusters in large high dimensional datasets.
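To make the idea of dense regions in subspaces concrete, here is a minimal Python sketch of a grid-based, bottom-up search for dense units. It is an illustration under stated assumptions, not the authors' implementation: the function name `dense_units`, the parameters `xi` (intervals per dimension) and `tau` (density threshold as a fraction of all points), and the fixed value range `[lo, hi)` are chosen here for the example. It covers only the dense-unit search; joining adjacent units into clusters and producing minimal DNF descriptions are left out.

```python
from collections import Counter
from itertools import combinations


def dense_units(points, xi=4, tau=0.2, lo=0.0, hi=1.0):
    """Return {subspace (tuple of dims): set of dense grid cells}.

    Hypothetical helper for illustration. Each dimension is cut into
    `xi` equal-width intervals over [lo, hi); a grid cell is "dense"
    when it holds more than a fraction `tau` of all points.
    """
    n = len(points)
    d = len(points[0])
    width = (hi - lo) / xi

    def cell(value):
        # Clamp so that value == hi falls into the last interval.
        return min(int((value - lo) / width), xi - 1)

    grid = [tuple(cell(v) for v in p) for p in points]

    def count_dense(dims):
        # Project every point onto `dims` and count points per cell.
        counts = Counter(tuple(g[i] for i in dims) for g in grid)
        return {unit for unit, c in counts.items() if c / n > tau}

    # Level 1: dense intervals in each single dimension.
    level = {}
    for i in range(d):
        units = count_dense((i,))
        if units:
            level[(i,)] = units

    result = {}
    k = 1
    while level:
        result.update(level)
        next_level = {}
        live_dims = sorted({i for dims in level for i in dims})
        for dims in combinations(live_dims, k + 1):
            # Monotonicity: a dense unit in k+1 dimensions projects to
            # dense units in every k-dimensional sub-subspace, so skip
            # candidates with a sub-subspace that has no dense units.
            if not all(sub in level for sub in combinations(dims, k)):
                continue
            units = count_dense(dims)
            if units:
                next_level[dims] = units
        level = next_level
        k += 1
    return result


if __name__ == "__main__":
    # Two groups that are tight in dimensions 0 and 1, spread out in 2.
    spread = (0.05, 0.3, 0.55, 0.8)
    pts = [(0.1, 0.1, z) for z in spread] + [(0.9, 0.9, z) for z in spread]
    for dims, units in sorted(dense_units(pts, xi=4, tau=0.3).items()):
        print(dims, sorted(units))
```

Because the result depends only on per-cell counts, it does not change when the input records are reordered, which matches the order-insensitivity requirement stated in the abstract.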
Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan
Added 05 Aug 2010
Updated 05 Aug 2010
Type Conference
Year 1998
Where SIGMOD
Authors Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan