KDD
2008
ACM

Succinct summarization of transactional databases: an overlapped hyperrectangle scheme

Transactional data are ubiquitous. Several methods, including frequent itemset mining and co-clustering, have been proposed to analyze transactional databases. In this work, we propose a new research problem: succinctly summarizing transactional databases. Solving this problem requires linking the high-level structure of the database to a potentially huge number of frequent itemsets. We formulate this problem as a set covering problem using overlapped hyperrectangles; we then prove that this problem and several of its variations are NP-hard. We develop an approximation algorithm, HYPER, which can achieve a ln(k) + 1 approximation ratio in polynomial time. We propose a pruning strategy that can significantly speed up the processing of our algorithm. Additionally, we propose an efficient algorithm to further summarize the set of hyperrectangles by allowing false positive conditions. A detailed study using both real and synthetic datasets shows the effectiveness and efficiency of our approa...
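To make the set-covering formulation concrete, here is a minimal Python sketch of the classic greedy heuristic for weighted set cover, applied to the (transaction, item) cells of a small database. It is not the HYPER algorithm from the paper. Several details are assumptions made only for illustration: the function name `greedy_hyperrectangle_cover`, the cost of a hyperrectangle taken as its number of transactions plus its number of items, and the choice to pair each candidate itemset with every transaction that contains it. The greedy heuristic is known to achieve a harmonic-number bound H(k) ≤ ln(k) + 1, where k is the size of the largest candidate set, which has the same form as the ratio quoted in the abstract.

```python
from itertools import product


def greedy_hyperrectangle_cover(transactions, candidate_itemsets):
    """Greedy weighted set cover over the (transaction id, item) cells.

    transactions: list of sets of items.
    candidate_itemsets: iterable of item sets used to build candidates
        (hypothetical input; in practice these might be frequent itemsets).
    Returns a list of (transaction_ids, itemset) hyperrectangles.
    """
    # Every (transaction id, item) cell that must be covered.
    uncovered = {(t, i) for t, items in enumerate(transactions) for i in items}

    # Single items are always added as candidates so a full cover exists.
    all_items = set().union(*transactions)
    itemsets = {frozenset(s) for s in candidate_itemsets}
    itemsets |= {frozenset([i]) for i in all_items}

    # Pair each itemset with all transactions containing it, so that no
    # rectangle covers a cell absent from the database (no false positives).
    candidates = []
    for itemset in itemsets:
        tids = frozenset(
            t for t, items in enumerate(transactions) if itemset <= items
        )
        if tids and itemset:
            candidates.append((tids, itemset))

    def price(rect):
        # Assumed cost model: |transactions| + |items| per newly covered cell.
        tids, itemset = rect
        gain = len(uncovered & set(product(tids, itemset)))
        cost = len(tids) + len(itemset)
        return cost / gain if gain else float("inf")

    chosen = []
    while uncovered:
        best = min(candidates, key=price)  # cheapest per newly covered cell
        chosen.append(best)
        uncovered -= set(product(*best))
    return chosen


if __name__ == "__main__":
    db = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c"}]
    for tids, itemset in greedy_hyperrectangle_cover(db, [{"a", "b"}]):
        print(sorted(tids), sorted(itemset))
```

On this toy database the sketch selects two hyperrectangles: one for the itemset {a, b} over the three transactions that contain it, and one for {c} over the remaining two. The pruning strategy and the false-positive summarization mentioned in the abstract are not modeled here.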
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2008
Where KDD
Authors Yang Xiang, Ruoming Jin, David Fuhry, Feodor F. Dragan