PAKDD
2007
ACM

A Fast Algorithm for Finding Correlation Clusters in Noise Data

Abstract. Noise significantly degrades cluster quality. Conventional clustering methods can hardly detect clusters in a data set that contains a large amount of noise. Projected clustering sheds light on identifying correlation clusters in such a data set. To exclude noise points, which are usually scattered in a subspace, data points are projected to form dense areas in the subspace, and these dense areas are regarded as correlation clusters. However, we found that existing projected clustering methods do not work well with noise data, since they employ randomly generated seeds (micro clusters) to trade off clustering quality. In this paper, we propose a divisive method for projected clustering that does not rely on random seeds. The proposed algorithm is capable of producing higher quality correlation clusters from noise data more efficiently than an agglomerative projected clustering algorithm. We experimentally show that our algorithm captures correlation clusters in noise da...
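The projection idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's divisive algorithm. It only shows why points of a correlation cluster become dense after projection while noise stays scattered. The data, the known direction (the line y = x through the origin), the band width, and the helper names `project_residual` and `dense_points` are all assumptions made for illustration.

```python
import math
import random


def project_residual(point, direction):
    """Project a 2-D point onto the axis orthogonal to a unit direction."""
    x, y = point
    dx, dy = direction
    # (-dy, dx) is the unit vector orthogonal to (dx, dy).
    return -dy * x + dx * y


def dense_points(points, direction, band=0.05):
    """Keep points whose projection lies in a narrow band around 0."""
    return [p for p in points if abs(project_residual(p, direction)) <= band]


rng = random.Random(0)

# Hypothetical correlated cluster: points close to the line y = x.
cluster = []
for _ in range(100):
    t = rng.random()
    cluster.append((t, t + rng.gauss(0.0, 0.01)))

# Hypothetical noise: points scattered uniformly over the unit square.
noise = [(rng.random(), rng.random()) for _ in range(100)]

# Assumed known: the cluster's correlation direction (unit vector along y = x).
direction = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))

kept_cluster = dense_points(cluster, direction)
kept_noise = dense_points(noise, direction)
print(len(kept_cluster), "cluster points kept;", len(kept_noise), "noise points kept")
```

In this sketch the correlated points collapse into a narrow band after projection, so nearly all of them survive the filter, while most of the uniformly scattered noise falls outside it. A real method must also discover the direction and the dense region, which this sketch takes as given.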
Added 09 Jun 2010
Updated 09 Jun 2010
Type Conference
Year 2007
Where PAKDD
Authors Jiuyong Li, Xiaodi Huang, Clinton Selke, Jianming Yong