Distribution-Based Synthetic Database Generation Techniques for Itemset Mining

The resource requirements of frequent pattern mining algorithms depend mainly on the length distribution of the mined patterns in the database. Synthetic databases, which are used to benchmark the performance of these algorithms, tend to have distributions that differ markedly from those observed in real datasets. In this paper we focus on the problem of synthetic database generation and propose algorithms to effectively embed any given set of maximal pattern collections within the database, and make the following contributions:

Ganesh Ramesh, Mohammed Javeed Zaki, William Maniatty
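
To make the embedding problem in the abstract concrete, here is a minimal Python sketch of the most naive construction one could use: write each target maximal pattern into the database as a transaction, repeated `min_support` times. This is an illustration under stated assumptions, not the algorithms proposed in the paper. The function names (`embed_maximal_patterns`, `maximal_frequent_itemsets`, `length_distribution`) and the absolute-count `min_support` parameter are hypothetical, and the brute-force check is only practical for tiny examples.

```python
from collections import Counter
from itertools import combinations


def embed_maximal_patterns(maximal_patterns, min_support):
    """Naive embedding (an assumption, not the paper's method):
    repeat every maximal pattern min_support times as a transaction."""
    patterns = [frozenset(p) for p in maximal_patterns]
    # The targets must form an antichain: no pattern may contain another,
    # otherwise the contained one could not be maximal.
    for a, b in combinations(patterns, 2):
        if a <= b or b <= a:
            raise ValueError("maximal patterns must not contain one another")
    return [set(p) for p in patterns for _ in range(min_support)]


def support(database, itemset):
    """Number of transactions that contain every item of itemset."""
    items = set(itemset)
    return sum(1 for transaction in database if items <= transaction)


def maximal_frequent_itemsets(database, min_support):
    """Brute-force enumeration of maximal frequent itemsets (tiny inputs only)."""
    items = sorted(set().union(*database))
    frequent = [
        frozenset(candidate)
        for size in range(1, len(items) + 1)
        for candidate in combinations(items, size)
        if support(database, candidate) >= min_support
    ]
    # Keep only itemsets with no frequent proper superset.
    return {f for f in frequent if not any(f < g for g in frequent)}


def length_distribution(itemsets):
    """Map pattern length to the number of patterns of that length."""
    return dict(Counter(len(s) for s in itemsets))


targets = [{"a", "b", "c"}, {"c", "d"}]
db = embed_maximal_patterns(targets, min_support=3)
mined = maximal_frequent_itemsets(db, min_support=3)
print(mined == {frozenset(t) for t in targets})
print(length_distribution(mined))
```

Because every transaction is a copy of some target pattern, an itemset has nonzero support only if it is a subset of a target, so the mined maximal patterns coincide with the targets. The construction ignores everything else a realistic generator would care about, such as database size, transaction-length realism, or noise.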

Database | IDEAS 2005 | Pattern Mining Algorithms | Synthetic Database | Synthetic Database Generation |

» Approximate Inverse Frequent Itemset Mining: Privacy, Complexity, and Approximation

» UFIMT: an uncertain frequent itemset mining toolbox

» Feasible itemset distributions in data mining: theory and application

» Generating nonredundant association rules

» Mining Frequent Itemsets in Distributed and Dynamic Databases

» An Efficient Algorithm to Update Large Itemsets with Early Pruning

» Frequent Itemsets Mining for Database AutoAdministration

» Cartesian contour: a concise representation for a collection of frequent sets

Post Info

Added	25 Jun 2010
Updated	25 Jun 2010
Type	Conference
Year	2005
Where	IDEAS
Authors	Ganesh Ramesh, Mohammed Javeed Zaki, William Maniatty

Sciweavers
