Estimating the number of segments in time series data using permutation tests

15 years 11 months ago

Segmentation is a popular technique for discovering structure in time series data. We address the largely open problem of estimating the number of segments that can be reliably discovered. We introduce a novel method for the problem, called Pete, which is based on permutation testing. The problem is an instance of model (dimension) selection. The proposed method analyzes the possible overfit of a model to the available data rather than using a term that penalizes model complexity. In this respect the approach is more similar to cross-validation than to regularization-based techniques (e.g., AIC, BIC, MDL, MML). Further, the method produces a p value for each increase in the number of segments. This gives the user an overview of the statistical significance of the segmentations. We evaluate the performance of the proposed method using both synthetic and real time series data. The experiments show that permutation testing gives realistic results about the number of reliably identifiable ...
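The general idea of attaching a p value to each increase in the number of segments can be illustrated with a minimal sketch. It is not the authors' Pete procedure, whose details the abstract does not give. The sketch assumes a piecewise-constant model with squared error, fitted optimally by dynamic programming, and uses the simplest possible null model, a shuffle of the whole series. The names `segmentation_errors`, `permutation_p_values`, `max_k`, and `n_perm` are hypothetical.

```python
import random


def segmentation_errors(x, max_k):
    """Optimal piecewise-constant squared error for k = 1..max_k segments."""
    n = len(x)
    # Prefix sums let us compute the squared error of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(x):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i, j):
        # Squared error of x[i:j] around its own mean (requires j > i).
        s = s1[j] - s1[i]
        return max(0.0, (s2[j] - s2[i]) - s * s / (j - i))

    # prev[j] = best error for the prefix x[:j] using the current segment count.
    prev = [0.0] + [cost(0, j) for j in range(1, n + 1)]
    errors = [prev[n]]
    for k in range(2, max_k + 1):
        cur = [float("inf")] * (n + 1)
        for j in range(k, n + 1):
            cur[j] = min(prev[i] + cost(i, j) for i in range(k - 1, j))
        errors.append(cur[n])
        prev = cur
    return errors


def relative_gains(errors):
    # Relative error reduction when going from k to k + 1 segments.
    return [
        (errors[k - 1] - errors[k]) / errors[k - 1] if errors[k - 1] > 0 else 0.0
        for k in range(1, len(errors))
    ]


def permutation_p_values(x, max_k, n_perm=200, seed=0):
    """Empirical p value for each increase k -> k + 1, k = 1..max_k - 1.

    Assumption: the null model is a shuffle of the entire series, which
    destroys all temporal structure. This is a simplification chosen for
    illustration only.
    """
    rng = random.Random(seed)
    observed = relative_gains(segmentation_errors(x, max_k))
    counts = [0] * len(observed)
    for _ in range(n_perm):
        shuffled = list(x)
        rng.shuffle(shuffled)
        permuted = relative_gains(segmentation_errors(shuffled, max_k))
        for i, (g_obs, g_perm) in enumerate(zip(observed, permuted)):
            if g_perm >= g_obs:
                counts[i] += 1
    # Add-one correction keeps the estimate strictly positive.
    return [(c + 1) / (n_perm + 1) for c in counts]
```

Calling `permutation_p_values(series, max_k=4)` returns one value per increase (1 to 2, 2 to 3, 3 to 4 segments). A small value suggests that the extra segment reduces the error more than it typically would on structureless data, which is the kind of overfit check the abstract contrasts with complexity penalties.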

Kari Vasko, Hannu Toivonen
