Sciweavers

SIGIR
1995
ACM
14 years 3 days ago
Integrating IR and RDBMS Using Cooperative Indexing
The full integration of information retrieval (IR) features into a database management system (DBMS) has long been recognized as both a significant goal and a challenging undertak...
Samuel DeFazio, Amjad M. Daoud, Lisa Ann Smith, Ja...
SIGIR
2011
ACM
12 years 11 months ago
Faster top-k document retrieval using block-max indexes
Large search engines process thousands of queries per second over billions of documents, making query processing a major performance bottleneck. An important class of optimization...
Shuai Ding, Torsten Suel
CIKM
2008
Springer
13 years 10 months ago
Achieving both high precision and high recall in near-duplicate detection
To find near-duplicate documents, fingerprint-based paradigms such as Broder's shingling and Charikar's simhash algorithms have been recognized as effective approaches a...
Lian'en Huang, Lei Wang, Xiaoming Li
DAS
2004
Springer
14 years 1 month ago
An Integrated Approach for Automatic Semantic Structure Extraction in Document Images
In this paper we present an integrated approach for semantic structure extraction in document images. Document images are initially processed to extract both their layout and logic...
Margherita Berardi, Michele Lapi, Donato Malerba
DEXAW
1999
IEEE
14 years 27 days ago
An XML-Based, 3-Tier Scheme for Integrating Heterogeneous Information Sources to the WWW
The phenomenal growth that the WWW currently experiences necessitates the integration of various types of information sources to its platform. We present an open, extensible multi...
Costas Petrou, Stathes Hadjiefthymiades, Drakoulis...