We present a simple and scalable graph clustering method called power iteration clustering (PIC). PIC finds a very low-dimensional embedding of a dataset using truncated power ite...
Clustering is an important data mining problem. Most of the earlier work on clustering focussed on numeric attributes which have a natural ordering on their attribute values. Rece...
Venkatesh Ganti, Johannes Gehrke, Raghu Ramakrishn...
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...
Recently there has been an increasing interest in developing regression models for large datasets that are both accurate and easy to interpret. Regressors that have these properti...
Instant intercommunion techniques such as Instant Messaging (IM) are widely popularized. Aiming at such kind of large scale masscommunication media, clustering on its text conte...