Maintaining Web Navigation Flows for Wrappers

A substantial subset of web data follows some kind of underlying structure. To let software programs take full advantage of these “semistructured” web sources, wrapper programs are built to provide a “machine-readable” view over them. A significant problem with wrappers is that, since web sources are autonomous, they may undergo changes that invalidate the current wrapper, so automatic maintenance is an important research issue. Web wrappers must perform two kinds of tasks: automatically navigating through websites and automatically extracting structured data from HTML pages. While several previous works have addressed the automatic maintenance of the components that perform data extraction, to the best of our knowledge the problem of automatically maintaining the required web navigation sequences remains unaddressed. In this paper we propose and experimentally validate a set of novel heuristics and algorithms to fill this gap.

Automatic Maintenance | DEEC 2006 | Web Data | Web Sources |

Added	10 Jun 2010
Updated	10 Jun 2010
Type	Conference
Year	2006
Where	DEEC
Authors	Juan Raposo, Manuel Álvarez, José Losada, Alberto Pan
