Sciweavers

56 search results - page 4 / 12
» An Improved Hierarchical Bayesian Model of Language for Docu...
NIPS
2001
13 years 9 months ago
Latent Dirichlet Allocation
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian m...
David M. Blei, Andrew Y. Ng, Michael I. Jordan
DATASCIENCE
2007
13 years 7 months ago
Detecting Family Resemblance: Automated Genre Classification
This paper presents results in automated genre classification of digital documents in PDF format. It describes genre classification as an important ingredient in contextualising s...
Yunhyong Kim, Seamus Ross
CIKM
2008
Springer
13 years 10 months ago
Active relevance feedback for difficult queries
Relevance feedback has been demonstrated to be an effective strategy for improving retrieval accuracy. The existing relevance feedback algorithms based on language models and vect...
Zuobing Xu, Ram Akella
ICML
2005
IEEE
14 years 8 months ago
A model for handling approximate, noisy or incomplete labeling in text classification
We introduce a Bayesian model, BayesANIL, that is capable of estimating uncertainties associated with the labeling process. Given a labeled or partially labeled training corpus of...
Ganesh Ramakrishnan, Krishna Prasad Chitrapura, Ra...
KDD
2002
ACM
14 years 8 months ago
Enhanced word clustering for hierarchical text classification
In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...