Sciweavers

AND
2009
13 years 5 months ago
Digital weight watching: reconstruction of scanned documents
A web portal providing access to over 250,000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...
Tim Gielissen, Maarten Marx
SIGIR
2008
ACM
13 years 6 months ago
ResIn: a combination of results caching and index pruning for high-performance web search engines
Results caching is an efficient technique for reducing the query processing load, hence it is commonly used in real search engines. This technique, however, bounds the maximum hit...
Gleb Skobeltsyn, Flavio Junqueira, Vassilis Placho...
ICDE
2008
IEEE
14 years 9 months ago
PictureBook: A Text-and-Image Summary System for Web Search Result
Search engine technology plays an important role in Web information retrieval. However, with Internet information explosion, traditional searching techniques cannot provide satisfa...
Baile Shi, Guoyu Hao, Hongtao Xu, Mei Wang, Qi Zha...
TJS
2008
13 years 7 months ago
Using a relational database for scalable XML search
XML is a flexible and powerful tool that enables information and security sharing in heterogeneous environments. Scalable technologies are needed to effectively manage the growing...
Rebecca Cathey, Steven M. Beitzel, Eric C. Jensen,...
CPM
2000
Springer
13 years 12 months ago
Identifying and Filtering Near-Duplicate Documents
The mathematical concept of document resemblance captures well the informal notion of syntactic similarity. The resemblance can be estimated using a fixed size “sketch...
Andrei Z. Broder