Hierarchical probabilistic modeling of discrete data has emerged as a powerful tool for text analysis. Posterior inference in such models is intractable, and practitioners rely on...
Bayesian text classifiers face a common issue which is referred to as data sparsity problem, especially when the size of training data is very small. The frequently used Laplacian...
We investigate the problem of learning a widely-used latent-variable model – the Latent Dirichlet Allocation (LDA) or “topic” model – using distributed computation, where ...
David Newman, Arthur Asuncion, Padhraic Smyth, Max...
Many tasks in speech processing involve classification of long term characteristics of a speech segment such as language, speaker, dialect, or topic. A natural technique for dete...
This paper introduced the four tracks that WIM-Lab Fudan University had taken part in at TREC 2007. For spam track, a multi-centre model was proposed considering the characteristi...
Jun Xu, Jing Yao, Jiaqian Zheng, Qi Sun, Junyu Niu