This paper deals with the quantization problem of a random variable X taking values in a separable and reflexive Banach space, and with the related question of clustering independ...
The scope of the well-known k-means algorithm has been broadly extended with some recent results: first, the k-means++ initialization method gives some approximation guarantees...
A wide variety of distortion functions, such as squared Euclidean distance, Mahalanobis distance, Itakura-Saito distance and relative entropy, have been used for clustering. In th...
Arindam Banerjee, Srujana Merugu, Inderjit S. Dhil...
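As a rough illustration of the distortion functions named in the abstract above, the following sketch writes each one out in plain Python. The function names, the list-based vector representation, and the use of the squared form of the Mahalanobis distance are assumptions made for this example; they are not taken from the paper.

```python
import math

def squared_euclidean(x, y):
    """Squared Euclidean distance: sum_i (x_i - y_i)^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def mahalanobis_sq(x, y, A):
    """Squared Mahalanobis distance (x - y)^T A (x - y).

    A is assumed to be a positive-definite matrix (typically an inverse
    covariance), given as a list of rows.
    """
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))

def itakura_saito(x, y):
    """Itakura-Saito distance: sum_i (x_i/y_i - log(x_i/y_i) - 1).

    Assumes all entries of x and y are strictly positive.
    """
    return sum(a / b - math.log(a / b) - 1 for a, b in zip(x, y))

def relative_entropy(p, q):
    """Relative entropy (KL divergence): sum_i p_i * log(p_i / q_i).

    Assumes p and q are probability vectors with q_i > 0 wherever p_i > 0;
    terms with p_i == 0 contribute zero by convention.
    """
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)
```

With the identity matrix as `A`, `mahalanobis_sq` reduces to `squared_euclidean`. Each function returns zero when its two arguments are equal.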
Presentation of exponential families, of mixtures of such distributions, and of how to learn them. We then present algorithms to simplify mixture models, using Kullback-Leibler di...
The k-means algorithm is the method of choice for clustering large-scale data sets and it performs exceedingly well in practice. Most of the theoretical work is restricted to the c...