We consider the problem of joining massive datasets. We propose two techniques for minimizing the disk I/O cost of join operations for both spatial and sequence data. Our techniques o...
We present an algorithmic scheme for unsupervised cluster ensembles, based on randomized projections between metric spaces, by which a substantial dimensionality reduction is obtai...
A new approach for topographic mapping, called Swarm-Organized Projection (SOP), is presented. SOP has been inspired by swarm intelligence methods for clustering and is similar to...
Summary: We present a new R package for the assessment of the reliability of clusters discovered in high-dimensional DNA microarray data. The package implements methods based on r...
Clustering is one of the most widely used statistical tools for data analysis. Among all existing clustering techniques, k-means is a very popular method because of its ease of pr...