Clustering Pairwise Distances with Missing Data: Maximum Cuts Versus Normalized Cuts

Abstract. Clustering algorithms based on a matrix of pairwise similarities (a kernel matrix) are widely known and used, with spectral clustering algorithms being a particularly popular class. In contrast, algorithms that work directly with the pairwise distance matrix have rarely been studied for clustering. This is surprising: in many applications the distances are given directly, and computing similarities adds another step that, although computationally cheap, is error-prone because the kernel has to be chosen appropriately. This paper proposes a clustering algorithm based on the SDP relaxation of the max-k-cut of the graph of pairwise distances, building on the work of Frieze and Jerrum. We compare the algorithm with Yu and Shi's algorithm, which is based on a spectral relaxation of a norm-k-cut. Moreover, we propose a simple heuristic for dealing with missing data, i.e., the case where some of the pairwise distances or similarities are not known. We evaluate the algorithms on the task of cluster...
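To make the max-k-cut objective concrete, here is a minimal Python sketch. It is not the paper's method: the SDP relaxation of Frieze and Jerrum is replaced by a plain greedy local search, and the treatment of missing entries (filling each unknown distance with the mean of the known ones) is an assumption chosen for illustration, since the abstract does not spell out the heuristic. The function names `fill_missing`, `cut_value`, and `local_search_max_k_cut` are hypothetical, and the input is assumed to be a symmetric matrix with `None` marking unknown distances.

```python
def fill_missing(dist):
    """Replace None entries with the mean of the known off-diagonal distances.

    Assumed stand-in heuristic, not the one proposed in the paper.
    Expects a symmetric matrix where a missing pair is None in both positions.
    """
    n = len(dist)
    known = [dist[i][j] for i in range(n) for j in range(i + 1, n)
             if dist[i][j] is not None]
    mean = sum(known) / len(known) if known else 0.0
    return [[0.0 if i == j else (mean if dist[i][j] is None else dist[i][j])
             for j in range(n)] for i in range(n)]


def cut_value(dist, labels):
    """Max-k-cut objective: total distance between points in different clusters."""
    n = len(dist)
    return sum(dist[i][j] for i in range(n) for j in range(i + 1, n)
               if labels[i] != labels[j])


def local_search_max_k_cut(dist, k):
    """Greedy local search for max-k-cut (a simple substitute for the SDP relaxation).

    Starts with every point in cluster 0 and repeatedly moves a point to the
    cluster where its summed distance to the other members is smallest. Each
    move strictly increases the cut value, so the loop terminates, but it may
    stop in a local optimum.
    """
    n = len(dist)
    labels = [0] * n
    improved = True
    while improved:
        improved = False
        for i in range(n):
            within = [0.0] * k  # summed distance from point i to each cluster
            for j in range(n):
                if j != i:
                    within[labels[j]] += dist[i][j]
            best = min(range(k), key=lambda c: within[c])
            if within[best] < within[labels[i]]:
                labels[i] = best
                improved = True
    return labels


# Toy data: points 0,1 are close, points 2,3 are close, the groups are far apart.
# The distance between points 0 and 3 is unknown.
D = [
    [0, 1, 10, None],
    [1, 0, 10, 10],
    [10, 10, 0, 1],
    [None, 10, 1, 0],
]
filled = fill_missing(D)
labels = local_search_max_k_cut(filled, k=2)
print(labels, cut_value(filled, labels))
```

On this toy matrix the search separates the two tight groups. Maximizing the cut pushes large distances across cluster boundaries, which is the counterpart of minimizing a normalized cut on similarities.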

Jan Poland, Thomas Zeugmann

Clustering Algorithm | DIS 2006 | Pairwise Distance Matrix | Pairwise Distances | Theoretical Computer Science

Added	22 Aug 2010
Updated	22 Aug 2010
Type	Conference
Year	2006
Where	DIS
Authors	Jan Poland, Thomas Zeugmann
