Hierarchical model-based clustering of large datasets through fractionation and refractionation

Download www.stat.washington.edu

The goal of clustering is to identify distinct groups in a dataset. Compared to non-parametric clustering methods like complete linkage, hierarchical model-based clustering has the advantage of offering a way to estimate the number of groups present in the data. However, its computational cost is quadratic in the number of items to be clustered, and it is therefore not applicable to large problems. We review an idea called Fractionation, originally conceived by Cutting, Karger, Pedersen and Tukey for non-parametric hierarchical clustering of large datasets, and describe an adaptation of Fractionation to model-based clustering. A further extension, called Refractionation, leads to a procedure that can be successful even in the difficult situation where there are large numbers of small groups.

Supported by NSA grant 62-1942. Supported by NSF grant DMS-9803226 and NSA grant 62-1942.
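The abstract names Fractionation but does not spell out its steps. The sketch below illustrates the general scheme as it is commonly described: split the items into fractions of bounded size, cluster each fraction separately, treat the resulting clusters as the items of the next round, and repeat until the remainder is small enough to cluster directly. It is a minimal illustration under stated assumptions, not the authors' procedure. The names `fractionate`, `fraction_size`, and `alpha` are hypothetical, fractions are taken as consecutive chunks, and a plain closest-centroid merge stands in for the model-based merge criterion.

```python
def centroid(points):
    """Mean of a non-empty list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def agglomerate(clusters, k):
    """Merge clusters pairwise until at most k remain.

    Assumption: the pair with the closest centroids is merged at each step.
    This is a simple stand-in for a model-based merge criterion.
    """
    clusters = [list(c) for c in clusters]
    while len(clusters) > k:
        cents = [centroid(c) for c in clusters]
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum((a - b) ** 2 for a, b in zip(cents[i], cents[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


def fractionate(points, fraction_size, alpha, k):
    """Reduce points to k clusters by clustering bounded-size fractions.

    fraction_size: maximum number of items clustered together (hypothetical name).
    alpha: reduction factor per round, 0 < alpha < 1 (hypothetical name).
    """
    if fraction_size < 2 or not 0 < alpha < 1:
        raise ValueError("need fraction_size >= 2 and 0 < alpha < 1")
    clusters = [[p] for p in points]
    while len(clusters) > fraction_size:
        next_round = []
        # Assumption: fractions are consecutive chunks of the current items.
        for start in range(0, len(clusters), fraction_size):
            fraction = clusters[start:start + fraction_size]
            target = max(1, int(alpha * len(fraction)))
            next_round.extend(agglomerate(fraction, target))
        clusters = next_round
    # The remaining items fit in one fraction, so cluster them directly.
    return agglomerate(clusters, k)
```

Because no single call to `agglomerate` ever sees more than `fraction_size` items, the cost of each call is bounded independently of the dataset size, which is the motivation the abstract gives for avoiding a full hierarchical clustering of all items at once.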

Jeremy Tantrum, Alejandro Murua, Werner Stuetzle

Data Mining | Hierarchical Model-based Clustering | KDD 2002 | Non-parametric Clustering Methods | Non-parametric Hierarchical Clustering |

Post Info

Added	30 Nov 2009
Updated	30 Nov 2009
Type	Conference
Year	2002
Where	KDD
Authors	Jeremy Tantrum, Alejandro Murua, Werner Stuetzle
