Sciweavers

KDD
1998
ACM

Probabilistic Modeling for Information Retrieval with Unsupervised Training Data

We apply a well-known Bayesian probabilistic model to textual information retrieval, that is, the classification of documents by their relevance to a query. This model was previously used with supervised training data for a fixed query. When only noisy, unsupervised training data generated by a heuristic relevance-scoring formula are available, two crucial adaptations are needed: (1) severe smoothing of the models built on the training data; and (2) adding a prior probability to the models. We show that, with these adaptations, the probabilistic model improves on the retrieval precision of the heuristic model. The experiment used the TREC-5 corpus and queries, and the evaluation of the model was submitted as an official entry (ibms96b) to TREC-5.
Ernest P. Chan, Santiago Garcia, Salim Roukos
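As an illustration only, the two adaptations named in the abstract can be sketched on toy data. This is not the authors' implementation: the abstract does not specify the model's form, so the sketch assumes a multinomial naive Bayes classifier, additive smoothing with a large constant `alpha`, a fixed prior probability of relevance `prior_relevant`, and a hypothetical term-overlap heuristic standing in for the relevance-scoring formula. All function names and the toy documents are invented for the example.

```python
import math
from collections import Counter


def heuristic_score(query, doc):
    """Hypothetical stand-in heuristic: number of doc tokens that are query terms."""
    terms = set(query)
    return sum(1 for tok in doc if tok in terms)


def pseudo_label(query, docs, top_k):
    """Noisy, unsupervised labels: the top_k docs by heuristic score count as relevant."""
    ranked = sorted(range(len(docs)),
                    key=lambda i: heuristic_score(query, docs[i]),
                    reverse=True)
    relevant = set(ranked[:top_k])
    return [i in relevant for i in range(len(docs))]


def train(docs, labels, alpha):
    """Fit one unigram model per class; a large alpha means heavy smoothing (assumed form)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    counts = {True: Counter(), False: Counter()}
    for doc, label in zip(docs, labels):
        counts[label].update(doc)
    model = {}
    for label in (True, False):
        total = sum(counts[label].values()) + alpha * len(vocab)
        model[label] = {w: (counts[label][w] + alpha) / total for w in vocab}
    return model


def log_odds(doc, model, prior_relevant):
    """Log posterior odds of relevance: prior log-odds plus per-token likelihood ratios."""
    score = math.log(prior_relevant / (1.0 - prior_relevant))
    for tok in doc:
        if tok in model[True]:  # tokens outside the training vocabulary are skipped
            score += math.log(model[True][tok] / model[False][tok])
    return score


# Toy corpus and query, invented for the example.
docs = [
    ["bayesian", "retrieval", "model"],
    ["retrieval", "precision", "query"],
    ["cooking", "pasta", "recipe"],
    ["pasta", "sauce", "recipe"],
]
query = ["retrieval", "model"]

labels = pseudo_label(query, docs, top_k=2)
model = train(docs, labels, alpha=5.0)
scores = [log_odds(doc, model, prior_relevant=0.1) for doc in docs]
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this sketch a larger `alpha` pulls every likelihood ratio toward 1, so the score stays close to the prior log-odds and the noisy pseudo-labels have less influence; that is one plausible reading of why severe smoothing helps when the training labels come from a heuristic.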
Added 06 Aug 2010
Updated 06 Aug 2010
Type Conference
Year 1998
Where KDD