Sciweavers

1319 search results - page 28 / 264
» Using the Structure of HTML Documents to Improve Retrieval
CIKM
2007
Springer
14 years 1 month ago
Effective top-k computation in retrieving structured documents with term-proximity support
Modern web search engines are expected to return top-k results efficiently given a query. Although many dynamic index pruning strategies have been proposed for efficient top-k com...
Mingjie Zhu, Shuming Shi, Mingjing Li, Ji-Rong Wen
PVLDB
2008
13 years 6 months ago
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
DAS
2010
Springer
13 years 11 months ago
A kernel-based approach to document retrieval
In this paper we tackle the problem of document image retrieval by combining a similarity measure between documents and the probability that a given document belongs to a certain ...
Albert Gordo, Jaume Gibert, Ernest Valveny, Mar...
ICDAR
2009
IEEE
14 years 2 months ago
User-Guided Wrapping of PDF Documents Using Graph Matching Techniques
There are a number of established products on the market for wrapping—semi-automatic navigation and extraction of data—from web pages. These solutions make use of the inherent...
Tamir Hassan
SIGIR
2012
ACM
11 years 9 months ago
Improving searcher models using mouse cursor activity
Web search components such as ranking and query suggestions analyze the user data provided in query and click logs. While this data is easy to collect and provides information abo...
Jeff Huang, Ryen W. White, Georg Buscher, Kuansan ...