Sciweavers

ICDM
2003
IEEE

Tree-structured Partitioning Based on Splitting Histograms of Distances

We propose a novel clustering algorithm that is similar in spirit to classification trees. The data are recursively split using a criterion that applies a discrete curve evolution method to the histogram of distances. The algorithm can be depicted through tree diagrams with triple splits. Leaf nodes represent either clusters or sets of observations that cannot yet be clearly assigned to a cluster. After the tree is constructed, unclassified data points are mapped to their closest clusters. The algorithm has several advantages. First, it deals effectively with observations that cannot be unambiguously assigned to a cluster by allowing a "margin of error". Second, it automatically determines the number of clusters; apart from the margin of error, the user needs to specify only the minimal cluster size, not the number of clusters. Third, its running time is linear in the number of data points, which makes it suitable for very large data sets. Experiments involving both simulated and real ...
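To make the idea of a triple split concrete, here is a minimal Python sketch of a single split level. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the discrete curve evolution criterion, so the sketch swaps in a plain "emptiest histogram bin" rule to pick the threshold. The function names (`valley_threshold`, `triple_split`, `assign_leftovers`), the choice of reference point, and the bin count are all hypothetical. Recursion and the minimal cluster size check are omitted.

```python
import math

def valley_threshold(distances, bins=10):
    """Pick a split distance at a low point of the distance histogram.

    Assumption: this stands in for the paper's discrete curve evolution
    criterion, which the abstract does not describe in detail.
    """
    lo, hi = min(distances), max(distances)
    width = (hi - lo) / bins or 1.0          # avoid zero width if all equal
    counts = [0] * bins
    for d in distances:
        counts[min(int((d - lo) / width), bins - 1)] += 1
    interior = range(1, bins - 1)            # ignore the two end bins
    lowest = min(counts[i] for i in interior)
    candidates = [i for i in interior if counts[i] == lowest]
    k = candidates[len(candidates) // 2]     # middle of the tied bins
    return lo + (k + 0.5) * width            # center of the chosen bin

def triple_split(points, ref, margin, bins=10):
    """Split points into (near, unsure, far) by distance to ref."""
    dists = [math.dist(p, ref) for p in points]
    t = valley_threshold(dists, bins)
    near, unsure, far = [], [], []
    for p, d in zip(points, dists):
        if d < t - margin:
            near.append(p)
        elif d > t + margin:
            far.append(p)
        else:                                # within the margin of error
            unsure.append(p)
    return near, unsure, far

def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def assign_leftovers(clusters, unsure):
    """Map each unassigned point to the cluster with the closest centroid."""
    clusters = [list(c) for c in clusters]   # do not mutate the inputs
    centers = [centroid(c) for c in clusters]
    for p in unsure:
        i = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
        clusters[i].append(p)
    return clusters

if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (11, 0), (18, 0), (19, 0), (20, 0)]
    near, unsure, far = triple_split(pts, ref=(0, 0), margin=4)
    print(assign_leftovers([near, far], unsure))
```

Each call to `triple_split` makes one pass over the data, which is consistent with the linear scaling claimed in the abstract, although a full tree would apply the split recursively to the `near` and `far` branches.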
Longin Jan Latecki, Rajagopal Venugopal, Marc Sobel, Steve Horvat
Added 04 Jul 2010
Updated 04 Jul 2010
Type Conference
Year 2003
Where ICDM
Authors Longin Jan Latecki, Rajagopal Venugopal, Marc Sobel, Steve Horvat