Sciweavers

142 search results - page 12 / 29
» Entropy-Based Authorship Search in Large Document Collection...
SIGIR
2005
ACM
14 years 1 month ago
Controlling overlap in content-oriented XML retrieval
The direct application of standard ranking techniques to retrieve individual elements from a collection of XML documents often produces a result set in which the top ranks are dom...
Charles L. A. Clarke
CIKM
2009
Springer
13 years 11 months ago
Classification-based resource selection
In some retrieval situations, a system must search across multiple collections. This task, referred to as federated search, occurs for example when searching a distributed index o...
Jaime Arguello, Jamie Callan, Fernando Diaz
IPM
2007
13 years 7 months ago
Using structural contexts to compress semistructured text collections
We describe a compression model for semistructured documents, called Structural Contexts Model (SCM), which takes advantage of the context information usually implicit in the stru...
Joaquín Adiego, Gonzalo Navarro, Pablo de l...
SIGIR
2009
ACM
14 years 2 months ago
Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce
This paper explores the problem of computing pairwise similarity on document collections, focusing on the application of “more like this” queries in the life sciences domain. ...
Jimmy J. Lin
CIKM
2000
Springer
13 years 12 months ago
The Webspace Method: On the Integration of Database Technology with Multimedia Retrieval
Large collections of documents containing various types of multimedia are made available to the WWW. Unfortunately, due to the un-structuredness of Internet environments it is ha...
Roelof van Zwol, Peter M. G. Apers