Sciweavers

» Extracting Content Structure for Web Pages Based on Visual R...
WWW
2007
ACM
14 years 8 months ago
Towards domain-independent information extraction from web tables
Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of <table> tags. A mul...
Bernhard Krüpl, Bernhard Pollak, Marcus Herzo...
EHCI
2004
13 years 9 months ago
Finding Iteration Patterns in Dynamic Web Page Authoring
Most of the current WWW is made up of dynamic pages. The development of dynamic pages is a difficult and costly endeavour, out-of-reach for most users, experts, and content produce...
José A. Macías, Pablo Castells
DOCENG
2009
ACM
14 years 2 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan
DEXA
2010
Springer
13 years 6 months ago
Vi-DIFF: Understanding Web Pages Changes
Nowadays, many applications are interested in detecting and discovering changes on the web to help users to understand page updates and more generally, the web dynamics. Web archiv...
Zeynep Pehlivan, Myriam Ben Saad, Stéphane ...
AI
2004
Springer
14 years 27 days ago
Term-Based Clustering and Summarization of Web Page Collections
Effectively summarizing Web page collections becomes more and more critical as the amount of information continues to grow on the World Wide Web. A concise and meaningful summary ...
Yongzheng Zhang, A. Nur Zincir-Heywood, Evangelos ...