SAC
2005
ACM

Automatic wrapper maintenance for semi-structured web sources using results from previous queries

In recent years, significant attention has been paid to the problem of building wrappers for extracting data from semi-structured web sources. Nevertheless, since web sources are autonomous, they may undergo changes that invalidate the wrappers. In this paper, we present new heuristics and algorithms to address the problem of automatic wrapper maintenance. Our approach is based on collecting query results during wrapper operation and using them later to generate new sets of examples that can be used to induce a new wrapper when the source changes.

Categories and Subject Descriptors: H.2.5 [Database Management]: Heterogeneous Databases; H.2.8 [Database Management]: Database Applications – Data mining.
General Terms: Algorithms, Design, Experimentation.
Keywords: Web, extraction, wrapper, maintenance, examples.
Added 26 Jun 2010
Updated 26 Jun 2010
Type Conference
Year 2005
Where SAC
Authors Juan Raposo, Alberto Pan, Manuel Álvarez, Ángel Viña