Most datasets in real applications come from multiple sources. As a result, we often have attribute information about data objects and various pairwise relations (similarity) ...
Clustering is one of the most widely used statistical tools for data analysis. Among all existing clustering techniques, k-means is a very popular method because of its ease of pr...
In this paper, we design recommender systems for weblogs based on the link structure among them. We propose algorithms based on refined random walks and spectral methods. First, w...
We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Different from...
Deng Cai, Xiaofei He, Wei Vivian Zhang, Jiawei Han
Data clustering is an important task in many disciplines. A large number of studies have attempted to improve clustering by using the side information that is often encoded as pai...