Sciweavers

SDM
2009
SIAM

A New Constraint for Mining Sets in Sequences.

14 years 8 months ago
A New Constraint for Mining Sets in Sequences.
Discovering interesting patterns in event sequences is a popular task in the field of data mining. Most existing methods try to do this based on some measure of cohesion to determine an occurrence of a pattern, and a frequency threshold to determine if the pattern occurs often enough. We introduce a new constraint based on a new interestingness measure combining the cohesion and the frequency of a pattern. For a dataset consisting of a single sequence, the cohesion is measured as the average length of the smallest intervals containing the pattern for each occurrence of its events, and the frequency is measured as the probability of observing an event of that pattern. We present a similar constraint for datasets consisting of multiple sequences. We present algorithms to efficiently identify the thus defined interesting patterns, given a dataset and a user-defined threshold. After applying our method to both synthetic and real-life data, we conclude that it indeed gives intuitive res...
Bart Goethals, Boris Cule, Céline Robardet
Added 07 Mar 2010
Updated 07 Mar 2010
Type Conference
Year 2009
Where SDM
Authors Bart Goethals, Boris Cule, Céline Robardet
Comments (0)