Sciweavers

» Semi-Automatic Wrapper Generation for Internet Information S...
WEBI
2005
Springer
14 years 1 months ago
ITPilot: A Toolkit for Industrial-Strength Web Data Extraction
In recent years, many research systems have been proposed to perform data extraction and automation tasks on Web sources. Since most of today’s Web sources are “human-readable...
Alberto Pan, Juan Raposo, Manuel Álvarez, P...
ER
1999
Springer
14 years 1 hour ago
XML-based Components for Federating Multiple Heterogeneous Data Sources
Several federated database systems have been built in the past using the relational or the object model as federating model. This paper gives an overview of the XMLMedia system, a ...
Georges Gardarin, Fei Sha, Tuyet-Tram Dang-Ngoc
AAAI
2007
13 years 10 months ago
Template-Independent News Extraction Based on Visual Consistency
Wrapper is a traditional method to extract useful information from Web pages. Most previous works rely on the similarity between HTML tag trees and induced template-dependent wrap...
Shuyi Zheng, Ruihua Song, Ji-Rong Wen
SIGMOD
1997
ACM
13 years 12 months ago
Infomaster: An Information Integration System
Infomaster is an information integration system that provides integrated access to multiple distributed heterogeneous information sources on the Internet, thus giving the illusion ...
Michael R. Genesereth, Arthur M. Keller, Oliver M....
ICWE
2009
Springer
14 years 2 months ago
A Layout-Independent Web News Article Contents Extraction Method Based on Relevance Analysis
Abstract. The traditional Web news article contents extraction methods are time-costly and need much maintenance because they analyze the layout of news pages to generate the wrapp...
Hao Han, Takehiro Tokuda