The purpose of extractive summarization is to automatically select a number of indicative sentences, passages, or paragraphs from the original document according to a target summa...
In this paper we propose an extension of the PLSA model in which an extra latent variable allows the model to cocluster documents and terms simultaneously. We show on three datase...
We introduce a generative probabilistic document model based on latent Dirichlet allocation (LDA), to deal with textual errors in the document collection. Our model is inspired by...
We propose a visualization method based on a topic model for discrete data such as documents. Unlike conventional visualization methods based on pairwise distances such as multi-d...
A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (&q...