
ICDM
2006
IEEE

TOP-COP: Mining TOP-K Strongly Correlated Pairs in Large Databases

Recently, there has been considerable interest in computing strongly correlated pairs in large databases. Most previous studies require the specification of a minimum correlation threshold to perform the computation. However, it may be difficult for users to provide an appropriate threshold in practice, since different data sets typically have different characteristics. To this end, we propose an alternative task: mining the top-k strongly correlated pairs. In this paper, we identify a 2-D monotone property of an upper bound of Pearson’s correlation coefficient and develop an efficient algorithm, called TOP-COP, that exploits this property to prune many pairs without even computing their correlation coefficients. Our experimental results show that the TOP-COP algorithm can be orders of magnitude faster than brute-force alternatives for mining the top-k strongly correlated pairs.
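The pruning idea in the abstract can be illustrated with a minimal sketch. It assumes binary (market-basket) data, where Pearson's correlation reduces to the phi coefficient, and it uses the support-based bound that follows from supp(A,B) <= min(supp(A), supp(B)); the abstract itself does not state the formula. The function names (`phi_upper_bound`, `phi`, `top_k_pairs`) are hypothetical, and the plain pair-by-pair scan below is a simplification, not the traversal order used by TOP-COP.

```python
import heapq
from itertools import combinations
from math import sqrt


def phi_upper_bound(supp_a, supp_b):
    """Upper bound of phi for two binary items, computed from supports only."""
    hi, lo = max(supp_a, supp_b), min(supp_a, supp_b)
    return sqrt(lo / hi) * sqrt((1 - hi) / (1 - lo))


def phi(supp_a, supp_b, supp_ab):
    """Pearson correlation of two binary items (the phi coefficient)."""
    return (supp_ab - supp_a * supp_b) / sqrt(
        supp_a * supp_b * (1 - supp_a) * (1 - supp_b)
    )


def top_k_pairs(transactions, k):
    """Return (top-k list of (phi, pair), number of pairs pruned by the bound)."""
    n = len(transactions)
    tids = {}  # item -> set of transaction ids containing it
    for tid, items in enumerate(transactions):
        for item in items:
            tids.setdefault(item, set()).add(tid)
    # Items present in every transaction have zero variance; skip them.
    supp = {i: len(t) / n for i, t in tids.items() if len(t) < n}

    heap = []  # min-heap holding the current k best (phi, pair) entries
    pruned = 0
    for a, b in combinations(sorted(supp), 2):
        # If even the upper bound cannot beat the current k-th best,
        # skip the pair without counting its joint support.
        if len(heap) == k and phi_upper_bound(supp[a], supp[b]) <= heap[0][0]:
            pruned += 1
            continue
        value = phi(supp[a], supp[b], len(tids[a] & tids[b]) / n)
        if len(heap) < k:
            heapq.heappush(heap, (value, (a, b)))
        elif value > heap[0][0]:
            heapq.heapreplace(heap, (value, (a, b)))
    return sorted(heap, reverse=True), pruned


baskets = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c", "d"}, {"d"}, {"c"}]
best, skipped = top_k_pairs(baskets, k=1)
```

Because the bound depends only on the two item supports, it can be evaluated before the more expensive joint-support count, which is where the savings over a brute-force scan would come from in this sketch.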
Hui Xiong, Mark Brodie, Sheng Ma
Added 11 Jun 2010
Updated 11 Jun 2010
Type Conference
Year 2006
Where ICDM