Sciweavers

2098 search results - page 71 / 420
» Syntactic Topic Models
NIPS
2001
13 years 10 months ago
Latent Dirichlet Allocation
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian m...
David M. Blei, Andrew Y. Ng, Michael I. Jordan
KDD
2002
ACM
14 years 9 months ago
Topics in 0-1 data
Large 0-1 datasets arise in various applications, such as market basket analysis and information retrieval. We concentrate on the study of topic models, aiming at results which in...
Ella Bingham, Heikki Mannila, Jouni K. Seppän...
EMNLP
2010
13 years 6 months ago
Exploiting Conversation Structure in Unsupervised Topic Segmentation for Emails
This work concerns automatic topic segmentation of email conversations. We present a corpus of email threads manually annotated with topics, and evaluate annotator reliability. To...
Shafiq R. Joty, Giuseppe Carenini, Gabriel Murray,...
SIGIR
2002
ACM
13 years 8 months ago
A critical examination of TDT's cost function
Topic Detection and Tracking (TDT) tasks are evaluated using a cost function. The standard TDT cost function assumes a constant probability of relevance P(rel) across all topics. ...
R. Manmatha, Ao Feng, James Allan
CIKM
2010
Springer
13 years 7 months ago
Decomposing background topics from keywords by principal component pursuit
Low-dimensional topic models have been proven very useful for modeling a large corpus of documents that share a relatively small number of topics. Dimensionality reduction tools s...
Kerui Min, Zhengdong Zhang, John Wright, Yi Ma