Sciweavers

» Using the Structure of HTML Documents to Improve Retrieval
SIGMOD
2009
ACM
14 years 2 months ago
Robust web extraction: an approach based on a probabilistic tree-edit model
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Nilesh N. Dalvi, Philip Bohannon, Fei Sha
APWEB
2003
Springer
14 years 18 days ago
Extracting Content Structure for Web Pages Based on Visual Representation
Abstract. A new web content structure based on visual representation is proposed in this paper. Many web applications such as information retrieval, information extraction and auto...
Deng Cai, Shipeng Yu, Ji-Rong Wen, Wei-Ying Ma
WWW
2002
ACM
14 years 8 months ago
Using web structure for classifying and describing web pages
The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...
IPM
2000
13 years 7 months ago
Structured storage and retrieval of SGML documents using Grove
SGML, standardized in ISO 8879 [International Organization for Standardization (1986)], has proliferated because it can provide various styles and transform documents on differen...
Hak-Gyoon Kim, Sung-Bae Cho
EWMF
2005
Springer
14 years 27 days ago
Information Retrieval in Trust-Enhanced Document Networks
Abstract. To fight the problem of information overload in huge information sources like large document repositories, e.g. citeseer, or internet websites you need a selection crit...
Klaus Stein, Claudia Hess