Speeding-Up Hierarchical Agglomerative Clustering in Presence of Expensive Metrics



In several contexts and domains, hierarchical agglomerative clustering (HAC) offers best-quality results, but at the price of a high complexity that limits the size of the datasets it can handle. In some contexts, in particular, computing distances between objects is the most expensive task. In this paper we propose a pruning heuristic aimed at improving performance in these cases, which is well integrated in all the phases of the HAC process and can be applied to two HAC variants: single-linkage and complete-linkage. After describing the method, we provide some theoretical evidence of its pruning power, followed by an empirical study of its effectiveness over different data domains, with a special focus on dimensionality issues.
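To make the cost argument concrete, here is a minimal baseline sketch of single-linkage and complete-linkage HAC that counts how many times the metric is called. It is not the pruning heuristic proposed in the paper, which the abstract does not detail; the names `naive_hac`, `metric`, and `calls` are hypothetical and chosen only for illustration. The sketch assumes a symmetric metric and evaluates every pair of objects exactly once, which is the n(n-1)/2 cost a pruning scheme would try to reduce.

```python
from itertools import combinations


def naive_hac(points, metric, linkage="single"):
    """Baseline HAC (illustrative only): every pairwise distance is computed once."""
    n = len(points)
    calls = 0
    dist = {}
    # Precompute the full distance table; with an expensive metric
    # this loop dominates the running time.
    for i, j in combinations(range(n), 2):
        dist[(i, j)] = metric(points[i], points[j])
        calls += 1

    # Single-linkage uses the closest pair between two clusters,
    # complete-linkage uses the farthest pair.
    agg = min if linkage == "single" else max

    clusters = {i: [i] for i in range(n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            d = agg(
                dist[(min(i, j), max(i, j))]
                for i in clusters[a]
                for j in clusters[b]
            )
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Merge the two closest clusters under a fresh identifier.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges, calls


if __name__ == "__main__":
    data = [0.0, 1.0, 3.0, 7.0]
    merges, calls = naive_hac(data, lambda x, y: abs(x - y), "single")
    print(calls, merges)
```

For the four one-dimensional points in the usage example, the metric is called six times, once per pair, regardless of the linkage chosen. Only the merge heights differ between the two variants.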

Mirco Nanni


Data Mining | HAC Process | HAC Variants | Hierarchical Agglomerative Clustering | PAKDD 2005


Post Info

Added	28 Jun 2010
Updated	28 Jun 2010
Type	Conference
Year	2005
Where	PAKDD
Authors	Mirco Nanni
