We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian m...
We consider the problem of modeling annotated data—data with multiple types where the instance of one type (such as a caption) serves as a description of the other type (such as...
In a wide range of business areas dealing with text data streams, including CRM, knowledge management, and Web monitoring services, it is an important issue to discover topic tren...
The dynamic hierarchical Dirichlet process (dHDP) is developed to model the timeevolving statistical properties of sequential data sets. The data collected at any time point are r...
Enterprise corpora contain evidence of what employees work on and therefore can be used to automatically find experts on a given topic. We present a general approach for represen...