The problem of efficiently and accurately locating patterns of interest in massive time series data sets is an important and non-trivial problem in a wide variety of applications, including diagnosis and monitoring of complex systems, biomedical data analysis, and exploratory data analysis in scientific and business time series. In this paper a probabilistic approach is taken to this problem. Using piecewise linear segmentations as the underlying representation, local features (such as peaks, troughs, and plateaus) are defined using a prior distribution on expected deformations from a basic template. Global shape information is represented using another prior on the relative locations of the individual features. An appropriately defined probabilistic model integrates the local and global information and directly leads to an overall distance measure between sequence patterns based on prior knowledge. A search algorithm using this distance measure is shown to efficiently and accurately find matches for a variety o...
Eamonn J. Keogh, Padhraic Smyth