Sciweavers

1319 search results - page 14 / 264
» Using the Structure of HTML Documents to Improve Retrieval
JCDL
2006
ACM
14 years 1 month ago
Combining DOM tree and geometric layout analysis for online medical journal article segmentation
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
Jie Zou, Daniel X. Le, George R. Thoma
SIGIR
2004
ACM
14 years 24 days ago
Usefulness of hyperlink structure for query-biased topic distillation
In this paper, we introduce an information theoretic method for estimating the usefulness of the hyperlink structure induced from the set of retrieved documents. We evaluate the e...
Vassilis Plachouras, Iadh Ounis
SIGMOD
2011
ACM
12 years 10 months ago
Context-sensitive ranking for document retrieval
We study the problem of context-sensitive ranking for document retrieval, where a context is defined as a sub-collection of documents, and is specified by queries provided by do...
Liang Jeff Chen, Yannis Papakonstantinou
INEX
2004
Springer
14 years 22 days ago
The Utrecht Blend: Basic Ingredients for an XML Retrieval System
Exploiting the structure of a document allows for more powerful information retrieval techniques. In this article a basic approach is discussed for the retrieval of XML document f...
Roelof van Zwol, Frans Wiering, Virginia Dignum
IJCAI
1997
13 years 8 months ago
Toward Structured Retrieval in Semi-structured Information Spaces
A semi-structured information space consists of multiple collections of textual documents containing fielded or tagged sections. The space can be highly heterogeneous, because eac...
Scott B. Huffman, Catherine Baudin