The problem of locating motifs in real-valued, multivariate time series data involves the discovery of sets of recurring patterns embedded in the time series. Each set is composed of several non-overlapping subsequences and constitutes a motif because all of the included subsequences are similar. The ability to automatically discover such motifs allows intelligent systems to form endogenously meaningful representations of their environment through unsupervised sensor analysis. In this paper, we formulate a unifying view of motif discovery as a problem of locating regions of high density in the space of all time series subsequences. Our approach is efficient (sub-quadratic in the length of the data), requires fewer user-specified parameters than previous methods, and naturally allows variable length motif occurrences and nonlinear temporal warping. We evaluate the performance of our approach using four data sets from different domains including on-body inertial sensors and speech.
David Minnen, Charles Lee Isbell Jr., Irfan A. Ess