Although each iteration of the popular k-Means clustering heuristic scales well to larger problem sizes, it often requires an unacceptably high number of iterations to converge to ...
We consider the problem of finding communities in large linked networks such as web structures or citation networks. We review similarity measures for linked objects and...
We consider the problem of learning mixtures of distributions via spectral methods and derive a tight characterization of when such methods are useful. Specifically, given a mixt...
This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, Page...
Xindong Wu, Vipin Kumar, J. Ross Quinlan, Joydeep ...
We consider a statistical database in which a trusted administrator introduces noise to the query responses with the goal of maintaining privacy of individual database en...