Sciweavers

Search results for "Extracting Structured Data from Web Pages": 2677 results, page 29 of 536
WWW
2007
ACM
14 years 9 months ago
U-REST: an unsupervised record extraction system
In this paper, we describe a system that can extract record structures from web pages with no direct human supervision. Records are commonly occurring HTML-embedded data tuples th...
Yuan Kui Shen, David R. Karger
ICTAI
2008
IEEE
14 years 3 months ago
Adaptive Mobile Interfaces through Grammar Induction
This paper presents a grammar-induction based approach to partitioning a Web page into several small pages while each small page fits not only spatially but also logically for mob...
Jun Kong, Kevin L. Ates, Kang Zhang, Yan Gu
WWW
2005
ACM
14 years 9 months ago
Extracting semantic structure of web documents using content and visual information
This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...
Rupesh R. Mehta, Pabitra Mitra, Harish Karnick
KDD
2012
ACM
11 years 11 months ago
Harnessing the wisdom of the crowds for accurate web page clipping
Clipping Web pages, namely extracting the informative clips (areas) from Web pages, has many applications, such as Web printing and e-reading on small handheld devices. Although m...
Lei Zhang, Linpeng Tang, Ping Luo, Enhong Chen, Li...
SOFSEM
2007
Springer
14 years 2 months ago
Creating Permanent Test Collections of Web Pages for Information Extraction Research
In the research area of automatic web information extraction, there is a need for permanent and annotated web page collections enabling objective performance evaluation of differen...
Bernhard Pollak, Wolfgang Gatterbauer