In this work, we introduce the new problem of finding time series discords. Time series discords are subsequences of a longer time series that are maximally different to all the r...
In the realm of data mining, several key issues exists in the traditional classification algorithms, such as low readability, large rule number, and low accuracy with information ...
In this paper, we investigate how to scale hierarchical clustering methods (such as OPTICS) to extremely large databases by utilizing data compression methods (such as BIRCH or ra...
Markus M. Breunig, Hans-Peter Kriegel, Peer Kr&oum...
Background: Data mining in large DNA sequences is a major challenge in microbial genomics and bioinformatics. Oligonucleotide usage (OU) patterns provide a wealth of information f...
Hamilton Ganesan, Anna S. Rakitianskaia, Colin F. ...
The goal of clustering is to identify distinct groups in a dataset. The basic idea of model-based clustering is to approximate the data density by a mixture model, typically a mix...