Web information extraction is a fundamental issue for web information management and integrations. A common approach is to use wrappers to extract data from web pages or documents...
Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interests. There are two main issues relevant t...
The Distributed Information Search COmponent (Disco) is a prototype heterogeneous distributed database that accesses underlying data sources. The Disco prototype currently focuses...
Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interest. In this paper, we propose a novel sch...
This paper provides an explanation of the basic data structures used in a new page analysis technique to create wrappers (data extractors) for the result pages produced by web sit...