Interval data is attracting attention from the data analysis community because of its ability to describe complex concepts. Since clustering is an important data analysis tool, it is worthwhile to extend clustering techniques to interval data. Applying traditional clustering methods to interval data loses information inherent in this data type. This paper proposes a novel dissimilarity measure that explores the internal structure of intervals in a probabilistic manner based on domain knowledge. Our experiments show that interval clustering based on the proposed dissimilarity measure produces meaningful results.
Jie Ouyang, Ishwar K. Sethi