This paper presents a simple new algorithm that performs k-means clustering in one scan of a dataset, while using a bu er for points from the dataset of xed size. Experiments show...
Given a dataset P, a k-means query returns k points in space (called centers), such that the average squared distance between each point in P and its nearest center is minimized. S...
Zhenjie Zhang, Yin Yang, Anthony K. H. Tung, Dimit...
We present polynomial upper and lower bounds on the number of iterations performed by the k-means method (a.k.a. Lloyd's method) for k-means clustering. Our upper bounds are ...
: Document clustering has been used for better document retrieval and text mining. In this paper, we investigate if a biomedical ontology improves biomedical literature clustering ...
The k-means algorithm is widely used for clustering because of its computational efficiency. Given n points in d-dimensional space and the number of desired clusters k, k-means see...
Clustering in gene expression data sets is a challenging problem. Different algorithms for clustering of genes have been proposed. However due to the large number of genes only a ...
The performance of K-means and Gaussian mixture model (GMM) clustering depends on the initial guess of partitions. Typically, clus∗ corresponding author 1
Clustering has been one of the most widely studied topics in data mining and k-means clustering has been one of the popular clustering algorithms. K-means requires several passes ...
The k-means method is a widely used clustering algorithm. One of its distinguished features is its speed in practice. Its worst-case running-time, however, is exponential, leaving...
Background: DNA microarrays, which determine the expression levels of tens of thousands of genes from a sample, are an important research tool. However, the volume of data they pr...
Alexander L. Richards, Peter Holmans, Michael C. O...