
SDM
2003
SIAM

STAMP: On Discovery of Statistically Important Pattern Repeats in Long Sequential Data

We focus on mining periodic patterns that tolerate some degree of imperfection, in the form of random replacement of events in an otherwise perfect periodic pattern. In InfoMiner+, we proposed a new metric, generalized information gain, to identify patterns whose events have vastly different occurrence frequencies and to adjust for deviation from a pattern. In particular, a penalty can be associated with gaps between pattern occurrences, which is especially useful in locating repeats in DNA sequences. In this paper, we present an effective mining algorithm, STAMP, that simultaneously mines significant patterns and their associated subsequences under the generalized information gain model.
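To make the idea of an information-based score with a gap penalty concrete, here is a minimal Python sketch. It is not the STAMP algorithm and not necessarily the paper's exact definition of generalized information gain. It assumes that an event's information is the negative logarithm (base = alphabet size) of its empirical frequency, that a pattern's information is the sum over its non-wildcard positions, and that each missed period between the first and last match costs the pattern's full information. All function names and the wildcard symbol are hypothetical.

```python
import math
from collections import Counter

WILDCARD = "*"  # hypothetical "don't care" symbol in a pattern


def event_information(sequence):
    """Map each event to -log_{|E|}(empirical frequency); rare events score higher."""
    counts = Counter(sequence)
    if len(counts) < 2:
        raise ValueError("need at least two distinct events")
    total = len(sequence)
    base = len(counts)
    return {e: -math.log(c / total, base) for e, c in counts.items()}


def pattern_information(pattern, info):
    """Sum the information of the non-wildcard positions of the pattern."""
    return sum(info[e] for e in pattern if e != WILDCARD)


def matches(pattern, segment):
    """True if the segment agrees with the pattern at every non-wildcard position."""
    return all(p == WILDCARD or p == s for p, s in zip(pattern, segment))


def penalized_gain(sequence, pattern):
    """Assumed score: info * (repeats - 1) minus info for each gap period.

    The sequence is cut into consecutive segments of the pattern's length,
    starting at offset 0. Only the span from the first to the last matching
    segment is scored; non-matching segments inside that span count as gaps.
    """
    info = event_information(sequence)
    period = len(pattern)
    segments = [sequence[i:i + period]
                for i in range(0, len(sequence) - period + 1, period)]
    hits = [matches(pattern, seg) for seg in segments]
    if True not in hits:
        return 0.0
    first = hits.index(True)
    last = len(hits) - 1 - hits[::-1].index(True)
    window = hits[first:last + 1]
    repeats = sum(window)
    gaps = len(window) - repeats
    p_info = pattern_information(pattern, info)
    return p_info * (repeats - 1) - p_info * gaps


if __name__ == "__main__":
    # One corrupted period ("ac") sits between four clean repeats of "ab".
    print(penalized_gain("ababacabab", "ab"))
```

Under these assumptions a pattern built from rare events can outscore a more frequent pattern built from common events, and a subsequence is only worth extending across a gap if the repeats beyond the gap recover more information than the gap costs.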
Jiong Yang, Wei Wang 0010, Philip S. Yu
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2003
Where SDM