This paper proposes a new clustering algorithm in the fuzzy c-means family for clustering time series. It is particularly suited to short time series and to series with unevenly spaced sampling points. Short time series, which are too short to support a conventional statistical model, and unevenly sampled time series arise in many practical situations. The algorithm developed here is motivated by experiments in biology, where uneven sampling is common. Conventional clustering algorithms based on the Euclidean distance or the Pearson correlation coefficient, such as hard k-means or hierarchical clustering, cannot include temporal information in the distance measurement. Yet the temporal order of the data is important, and the varying length of the sampling intervals should be taken into account when clustering time series. The proposed short time series (STS) distance is able to measure similarity of shapes which are formed by the relative change of amplitude and th...
Carla S. Möller-Levet, Frank Klawonn, Kwang-H
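
The abstract describes a distance that compares the shapes of two series through their relative change of amplitude while respecting uneven sampling intervals. The exact definition is not given in this excerpt, so the following is only a minimal sketch under one assumption: that each series is treated as piecewise linear and the distance compares the slopes of corresponding segments. The function name `slope_distance` and its parameters are hypothetical and chosen for illustration.

```python
from math import sqrt
from typing import Sequence


def slope_distance(x: Sequence[float], v: Sequence[float], t: Sequence[float]) -> float:
    """Illustrative slope-based distance between two series sampled at times t.

    Assumption (not taken from the excerpt): each series is piecewise linear
    between sampling points, and the distance is the root of the summed squared
    differences between the slopes of corresponding segments.
    """
    if not (len(x) == len(v) == len(t)):
        raise ValueError("x, v and t must have the same length")
    if len(t) < 2:
        raise ValueError("at least two sampling points are required")

    total = 0.0
    for k in range(len(t) - 1):
        dt = t[k + 1] - t[k]  # interval length; may differ from segment to segment
        if dt <= 0:
            raise ValueError("sampling times must be strictly increasing")
        slope_x = (x[k + 1] - x[k]) / dt
        slope_v = (v[k + 1] - v[k]) / dt
        total += (slope_v - slope_x) ** 2
    return sqrt(total)


# Two series with the same shape but a constant offset, on an uneven time grid.
times = [0.0, 1.0, 3.0]
series_a = [0.0, 1.0, 3.0]
series_b = [5.0, 6.0, 8.0]
shape_gap = slope_distance(series_a, series_b, times)
```

Because only slopes enter the sum, a constant vertical offset between two series does not change the result, and dividing by each interval length lets unevenly spaced samples contribute according to their rate of change rather than their raw difference.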