In this paper we propose an extension of the PLSA model in which an extra latent variable allows the model to cocluster documents and terms simultaneously. We show on three datase...
We introduce the posterior probabilistic clustering (PPC), which provides a rigorous posterior probability interpretation for Nonnegative Matrix Factorization (NMF) and removes th...
A novel method for simultaneous keyphrase extraction and generic text summarization is proposed by modeling text documents as weighted undirected and weighted bipartite graphs. Sp...
Web spam can significantly deteriorate the quality of search engines. Early web spamming techniques mainly manipulate page content. Since linkage information is widely used in we...
Data streams are often locally correlated, with a subset of streams exhibiting coherent patterns over a subset of time points. Subspace clustering can discover clusters of objects...