Sciweavers

309 search results - page 7 / 62
» Discovering informative content blocks from Web documents
DOCENG
2009
ACM
14 years 2 months ago
Web document text and images extraction using DOM analysis and natural language processing
Web Document Text and Images Extraction using DOM Analysis and Natural Language Processing. Parag Mulendra Joshi, Sam Liu. HP Laboratories, HPL-2009-187. Web page text extraction,...
Parag Mulendra Joshi, Sam Liu
WWW
2004
ACM
14 years 8 months ago
Web page summarization using dynamic content
Summarizing web pages has recently gained much attention from researchers. Until now, two main types of approaches have been proposed for this task: content- and context-based met...
Adam Jatowt
WWW
2010
ACM
13 years 7 months ago
Exploiting content redundancy for web information extraction
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
ICDE
2000
IEEE
14 years 8 months ago
XWRAP: An XML-Enabled Wrapper Construction System for Web Information Sources
This paper describes the methodology and the software development of XWRAP, an XML-enabled wrapper construction system for semi-automatic generation of wrapper programs. By XML-ena...
Ling Liu, Calton Pu, Wei Han
ICTAI
2000
IEEE
13 years 12 months ago
Reverse mapping of referral links from storage hierarchy for Web documents
On the World Wide Web, a document is usually made up of multiple pages, each of which has a unique URL and is linked to the others by hyperlinks. Related documents are...
Chen Ding, Chi-Hung Chi, Vincent Tam