One fundamental task in near-neighbor search as well as other similarity matching efforts is to find a distance function that can efficiently quantify the similarity between two o...
The computation of covariance and correlation matrices is critical to many data mining applications and processes. Unfortunately, the classical covariance and correlation matrices...
James Chilson, Raymond T. Ng, Alan Wagner, Ruben H...
With the increased abilities for automated data collection made possible by modern technology, the typical sizes of data collections have continued to grow in recent years. In suc...
Clustering is the data mining problem of finding dense regions in a sparse multi-dimensional data set. The attribute values and ranges of these regions characterize the clusters. ...
Background: Clustering is a popular data exploration technique widely used in microarray data analysis. Most conventional clustering algorithms, however, generate only one set of ...