Convex Clustering with Exemplar-Based Models

15 years 8 months ago

Download people.csail.mit.edu

Clustering is often formulated as the maximum likelihood estimation of a mixture model that explains the data. The EM algorithm widely used to solve the resulting optimization problem is inherently a gradient-descent method and is sensitive to initialization. The resulting solution is a local optimum in the neighborhood of the initial guess. This sensitivity to initialization presents a signiﬁcant challenge in clustering large data sets into many clusters. In this paper, we present a different approach to approximate mixture ﬁtting for clustering. We introduce an exemplar-based likelihood function that approximates the exact likelihood. This formulation leads to a convex minimization problem and an efﬁcient algorithm with guaranteed convergence to the globally optimal solution. The resulting clustering can be thought of as a probabilistic mapping of the data points to the set of exemplars that minimizes the average distance and the information-theoretic cost of mapping. We prese...

Danial Lashkari, Polina Golland

Real-time Traffic