Modern retrieval test collections are built through a process called pooling, in which only a sample of the entire document set is judged for each topic. The idea behind pooling is...
Chris Buckley, Darrin Dimmick, Ian Soboroff, Ellen...
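The pooling process mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common depth-k variant, in which the top k documents from each participating system's ranking are merged into the set sent to assessors. The abstract does not say which variant the authors use, so the function name `depth_k_pool`, the run names, and the document IDs are all hypothetical.

```python
from typing import Dict, List, Set


def depth_k_pool(runs: Dict[str, List[str]], k: int) -> Set[str]:
    """Return the union of the top-k documents from each run for one topic.

    `runs` maps a (hypothetical) system name to its ranked list of
    document IDs for a single topic. Only pooled documents are judged.
    """
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:k])  # take each run's k highest-ranked documents
    return pool


# Toy rankings for one topic (illustrative data only).
runs = {
    "run_a": ["d1", "d2", "d3", "d4"],
    "run_b": ["d3", "d5", "d1", "d6"],
    "run_c": ["d7", "d1", "d2", "d8"],
}

pool = depth_k_pool(runs, k=2)
print(sorted(pool))  # ['d1', 'd2', 'd3', 'd5', 'd7']
```

With k = 2, five of the eight distinct documents enter the pool. The rest go unjudged, which is why only a sample of the collection is assessed per topic.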
Web spamming techniques aim to achieve undeserved rankings in search results. Research has been widely conducted on identifying such spam and neutralizing its influence. However,...
This paper describes ongoing research into the application of machine learning techniques for improving access to governmental information in complex digital libraries. Under the ...
Miles Efron, Jonathan L. Elsas, Gary Marchionini, ...
Topic models provide a powerful tool for analyzing large text collections by representing high-dimensional data in a low-dimensional subspace. Fitting a topic model given a set of...
Noun phrases in queries are identified and classified into four types: proper names, dictionary phrases, simple phrases and complex phrases. A document has a phrase if all content...