Sciweavers

Search results for: Conceptual-Model-Based Data Extraction from Multiple-Record ...
VLDB
2011
ACM
13 years 2 months ago
Harvesting relational tables from lists on the web
A large number of web pages contain data structured in the form of “lists”. Many such lists can be further split into multi-column tables, which can then be used in more seman...
Hazem Elmeleegy, Jayant Madhavan, Alon Y. Halevy
SDM
2007
SIAM
13 years 9 months ago
Sketching Landscapes of Page Farms
The Web is a very large social network. It is important and interesting to understand the “ecology” of the Web: the general relations of Web pages to their environment. The un...
Bin Zhou 0002, Jian Pei
BNCOD
2006
13 years 9 months ago
The Lixto Project: Exploring New Frontiers of Web Data Extraction
The Lixto project is an ongoing research effort in the area of Web data extraction. Whereas the project originally started out with the idea to develop a logic-based extraction lan...
Julien Carme, Michal Ceresna, Oliver Frölich,...
WWW
2005
ACM
14 years 8 months ago
Using visual cues for extraction of tabular data from arbitrary HTML documents
We describe a method to extract tabular data from web pages. Rather than just analyzing the DOM tree, we also exploit visual cues in the rendered version of the document to extrac...
Bernhard Krüpl, Marcus Herzog, Wolfgang Gatte...
COMPSAC
2003
IEEE
14 years 27 days ago
A Supervised Visual Wrapper Generator for Web-Data Extraction
Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interest. In this paper, we propose a novel sch...
Xiaofeng Meng, Haiyan Wang, Dongdong Hu, Chen Li