Succinct summarization of transactional databases: an overlapped hyperrectangle scheme


Transactional data are ubiquitous. Several methods, including frequent itemset mining and co-clustering, have been proposed to analyze transactional databases. In this work, we propose a new research problem to succinctly summarize transactional databases. Solving this problem requires linking the high-level structure of the database to a potentially huge number of frequent itemsets. We formulate this problem as a set covering problem using overlapped hyperrectangles; we then prove that this problem and several of its variations are NP-hard. We develop an approximation algorithm, HYPER, which achieves a ln(k) + 1 approximation ratio in polynomial time. We propose a pruning strategy that can significantly speed up the processing of our algorithm. Additionally, we propose an efficient algorithm to further summarize the set of hyperrectangles by allowing false positive conditions. A detailed study using both real and synthetic datasets shows the effectiveness and efficiency of our approa...
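The set covering formulation in the abstract can be illustrated with the classic greedy heuristic for weighted set cover, which is the standard way to obtain a logarithmic approximation ratio of this kind. The sketch below is not the HYPER algorithm itself. It assumes that a hyperrectangle is a pair (set of transactions, set of items) covering every cell in their Cartesian product, that its cost is the number of transactions plus the number of items, and that the list of candidate hyperrectangles is supplied by the caller. The names `greedy_cover`, `db`, and `candidates` are hypothetical.

```python
from itertools import product


def greedy_cover(cells, candidates):
    """Greedy weighted set cover over (transaction, item) cells.

    cells: set of (transaction, item) pairs present in the database.
    candidates: list of (transactions, items) pairs; each is assumed to
        cover only cells that really occur in the database.
    Returns the chosen candidates in the order they were picked.
    """
    uncovered = set(cells)
    chosen = []
    while uncovered:
        best, best_ratio, best_new = None, None, None
        for tids, items in candidates:
            newly = set(product(tids, items)) & uncovered
            if not newly:
                continue
            # Assumed cost model: one unit per transaction and per item.
            cost = len(tids) + len(items)
            ratio = cost / len(newly)
            if best_ratio is None or ratio < best_ratio:
                best, best_ratio, best_new = (tids, items), ratio, newly
        if best is None:
            raise ValueError("candidates cannot cover every cell")
        chosen.append(best)
        uncovered -= best_new
    return chosen


# Toy database: transaction id -> set of items (illustrative data only).
db = {1: {"a", "b"}, 2: {"a", "b"}, 3: {"b", "c"}}
cells = {(t, i) for t, its in db.items() for i in its}

candidates = [
    (frozenset({1, 2}), frozenset({"a", "b"})),
    (frozenset({3}), frozenset({"b", "c"})),
    (frozenset({1, 2, 3}), frozenset({"b"})),
]

summary = greedy_cover(cells, candidates)
```

Each round picks the candidate with the lowest cost per newly covered cell, so overlapping hyperrectangles are allowed but only pay off for the cells they add. In practice the candidate set would be derived from frequent itemsets rather than listed by hand.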

Yang Xiang, Ruoming Jin, David Fuhry, Feodor F. Dragan


Approximation Algorithm Hy | Data Mining | KDD 2008 | Set Covering Problem | Transactional Databases


Post Info

Added	30 Nov 2009
Updated	30 Nov 2009
Type	Conference
Year	2008
Where	KDD
Authors	Yang Xiang, Ruoming Jin, David Fuhry, Feodor F. Dragan
