To date, most studies on spam have focused only on the spamming phase of the spam cycle and have ignored the harvesting phase, which consists of the mass acquisition of email ad...
Kevin S. Xu, Mark Kliger, Yilun Chen, Peter J. Woo...
Many problems in machine learning and statistics can be formulated as (generalized) eigenproblems. In terms of the associated optimization problem, computing linear eigenvectors a...
We describe an algorithm for clustering using a similarity graph. The algorithm (a) runs in O(n log^3 n + m log n) time on graphs with n vertices and m edges, and (b) with high pro...
Analyzing unknown data sets such as multispectral images often requires unsupervised techniques. Data clustering is a well-known and widely used approach in such cases....
The popular K-means clustering partitions a data set by minimizing a sum-of-squares cost function. A coordinate descent method is then used to find local minima. In this paper we sh...
Hongyuan Zha, Xiaofeng He, Chris H. Q. Ding, Ming ...