Sciweavers

KDD
2003
ACM

Assessment and pruning of hierarchical model based clustering

15 years 1 months ago
Assessment and pruning of hierarchical model based clustering
The goal of clustering is to identify distinct groups in a dataset. The basic idea of model-based clustering is to approximate the data density by a mixture model, typically a mixture of Gaussians, and to estimate the parameters of the component densities, the mixing fractions, and the number of components from the data. The number of distinct groups in the data is then taken to be the number of mixture components, and the observations are partitioned into clusters (estimates of the groups) using Bayes' rule. If the groups are well separated and look Gaussian, then the resulting clusters will indeed tend to be "distinct" in the most common sense of the word - contiguous, densely populated areas of feature space, separated by contiguous, relatively empty regions. If the groups are not Gaussian, however, this correspondence may break down; an isolated group with a non-elliptical distribution, for example, may be modeled by not one, but several mixture components, and the ...
Jeremy Tantrum, Alejandro Murua, Werner Stuetzle
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2003
Where KDD
Authors Jeremy Tantrum, Alejandro Murua, Werner Stuetzle
Comments (0)