Semantic analysis of a document collection can be viewed as an unsupervised clustering of the constituent words and documents around hidden or latent concepts. This has been shown to i...
Analyzing author and topic relations in an email corpus is an important issue in both social network analysis and text mining. The Author-Topic model is a statistical model that id...
Nonnegative matrix tri-factorization (NMTF) is a 3-factor decomposition of a nonnegative data matrix, $X \approx U S V^{\top}$, where the factor matrices U, S, and V are restricted to be nonnegativ...
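To make the three-factor decomposition concrete, the following is a minimal sketch that fits nonnegative U, S, and V to a nonnegative matrix X by minimizing the Frobenius reconstruction error with plain multiplicative updates. It is an illustration under stated assumptions, not necessarily the algorithm this abstract goes on to propose: the function name `nmtf`, the inner dimensions `k` and `l`, the iteration count, and the random initialization are all hypothetical choices, and no orthogonality or other additional constraints are imposed.

```python
import numpy as np


def nmtf(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative X (m x n) as U @ S @ V.T.

    U is m x k, S is k x l, V is n x l; all stay nonnegative because
    each update multiplies by a ratio of nonnegative quantities.
    Assumption: unconstrained NMTF with a Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k))
    S = rng.random((k, l))
    V = rng.random((n, l))
    errors = []
    for _ in range(n_iter):
        # eps guards against division by zero in the denominators.
        U *= (X @ V @ S.T) / (U @ S @ V.T @ V @ S.T + eps)
        V *= (X.T @ U @ S) / (V @ S.T @ U.T @ U @ S + eps)
        S *= (U.T @ X @ V) / (U.T @ U @ S @ V.T @ V + eps)
        # Track the Frobenius reconstruction error after each sweep.
        errors.append(np.linalg.norm(X - U @ S @ V.T))
    return U, S, V, errors


if __name__ == "__main__":
    X = np.random.default_rng(1).random((20, 15))
    U, S, V, errors = nmtf(X, k=3, l=4)
    print(U.shape, S.shape, V.shape)
    print("first error:", errors[0], "last error:", errors[-1])
```

In a document-clustering reading, the rows of X could be documents and the columns terms, so that U groups documents, V groups terms, and S relates the two groupings; that interpretation is a common one and is offered here only as an assumption.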
This paper suggests an alternative solution for the task of spoken document retrieval (SDR). The proposed system runs retrieval on multi-level transcriptions (word and phone) prod...
Shan Jin, Hemant Misra, Thomas Sikora, Joemon M. J...
Document collections evolve over time: new topics emerge and old ones decline. At the same time, the terminology evolves as well. Much literature is devoted to topic evolution in ...