HICSS
2002
IEEE

Obtaining Language Models of Web Collections Using Query-Based Sampling Techniques

In the context of information retrieval, traditional collection selection algorithms have been widely studied. These algorithms utilize language models, a representation of the contents of each text collection over which selection is to be performed, but these language models cannot always be easily acquired. Query-based sampling is a technique by which these language models are discovered by interacting with a collection and observing the results. Previous work has shown query-based sampling to be a viable solution to the problem of discovering the contents of text collections when the information cannot be otherwise obtained. However, the characteristics of language models of WWW collections created using query-based sampling have not yet been studied. This work evaluates two query-based sampling techniques for building language models of three World Wide Web collections. Experimental results support the effectiveness of query-based sampling as a solution for building language model...
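The abstract describes query-based sampling only in outline: a language model is discovered by sending queries to a collection and observing what comes back. The sketch below illustrates that general loop and is not a reconstruction of either of the two techniques the paper evaluates. The `search` helper, the in-memory list standing in for a remote collection, and all parameter names (`seed_term`, `max_docs`, `docs_per_query`, `max_queries`) are assumptions made for illustration, as is the choice to draw each new query term uniformly at random from the terms learned so far.

```python
import random
from collections import Counter


def search(collection, term, top_n):
    """Hypothetical search interface: return up to top_n (doc_id, text)
    pairs whose text contains `term` as a whitespace-separated token."""
    hits = [
        (doc_id, text)
        for doc_id, text in enumerate(collection)
        if term in text.lower().split()
    ]
    return hits[:top_n]


def query_based_sample(collection, seed_term, max_docs=4,
                       docs_per_query=2, max_queries=50, rng=None):
    """Build a term-frequency language model from sampled documents.

    Assumed strategy: query with one term, add unseen result documents
    to the model, then pick the next query term at random from the
    learned vocabulary. Stops at max_docs, max_queries, or when no
    untried term remains.
    """
    rng = rng or random.Random(0)   # seeded so runs are repeatable
    model = Counter()               # learned term frequencies
    seen = set()                    # ids of documents already sampled
    tried = set()                   # query terms already used
    term = seed_term

    for _ in range(max_queries):
        if len(seen) >= max_docs:
            break
        tried.add(term)
        for doc_id, text in search(collection, term, docs_per_query):
            if doc_id in seen:
                continue
            seen.add(doc_id)
            model.update(text.lower().split())
            if len(seen) >= max_docs:
                break
        # Choose the next query term from what has been learned so far.
        candidates = sorted(set(model) - tried)
        if not candidates:
            break
        term = rng.choice(candidates)

    return model, seen


if __name__ == "__main__":
    docs = ["apple banana", "banana cherry", "cherry date", "date apple"]
    learned, sampled = query_based_sample(docs, "apple", max_docs=3)
    print(sorted(sampled), learned.most_common())
```

In this sketch the learned model is only a bag of term counts over the sampled documents; comparing it against the collection's actual language model, as the paper's evaluation does, would require access to the full collection.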
Gary A. Monroe, James C. French, Allison L. Powell
Added 14 Jul 2010
Updated 14 Jul 2010
Type Conference
Year 2002
Where HICSS
Authors Gary A. Monroe, James C. French, Allison L. Powell