Parallel Leap: Large-Scale Maximal Pattern Mining in a Distributed Environment



When computationally feasible, mining extremely large databases produces tremendously large numbers of frequent patterns. In many cases, it is impractical to mine those datasets due to their sheer size: not only because of the number of existing patterns, but mainly because of the magnitude of the search space. Many approaches have been suggested, such as sequential mining for maximal patterns or searching for all frequent patterns in parallel. So far, those approaches are still not genuinely effective for mining extremely large datasets. In this work we propose a method that combines both strategies efficiently, i.e., mining in parallel for the set of maximal patterns, which, to the best of our knowledge, has never been done efficiently before. Using this approach we could mine significantly large datasets, with sizes never reported in the literature before. We are able to effectively discover frequent patterns in a database of a billion transactions using a 32-processor cluster in less than 2 ho...
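
To make the distinction between all frequent patterns and maximal patterns concrete, here is a minimal Python sketch on a toy dataset. It is a brute-force illustration only, not the Parallel Leap algorithm described in the paper; the function names, the four sample transactions, and the support threshold of 2 are assumptions chosen for the example. A pattern counts as maximal here when it is frequent and no proper superset of it is frequent.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force: list every itemset found in at least min_support transactions."""
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        level = [
            frozenset(candidate)
            for candidate in combinations(items, size)
            if sum(1 for t in transactions if set(candidate) <= t) >= min_support
        ]
        if not level:
            # No frequent itemset of this size, so no larger one can be frequent.
            break
        frequent.extend(level)
    return frequent


def maximal_itemsets(frequent):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    return [p for p in frequent if not any(p < q for q in frequent)]


# Hypothetical toy data, chosen only for illustration.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"b", "d"},
]

frequent = frequent_itemsets(transactions, min_support=2)
maximal = maximal_itemsets(frequent)

print(len(frequent), "frequent itemsets")
print(sorted(sorted(p) for p in maximal), "are maximal")
```

Even on this tiny input the maximal set is much smaller than the full set of frequent itemsets, and every frequent itemset is a subset of some maximal one. That compactness is the usual motivation for mining maximal patterns when the complete set is too large to enumerate.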

Mohammad El-Hajj, Osmar R. Zaïane


Frequent Patterns | ICPADS 2006 | Large Datasets | Maximal Patterns


Post Info

Added	11 Jun 2010
Updated	11 Jun 2010
Type	Conference
Year	2006
Where	ICPADS
Authors	Mohammad El-Hajj, Osmar R. Zaïane
