In this paper, we consider the problem of extracting structured data from web pages taking into account both the content of individual attributes as well as the structure of pages...
It is often desirable to extract structured information from raw web pages for better information browsing, query answering, and pattern mining. Many such Information Extraction (...
In this paper, we propose an innovative approach to extracting semi-structured data from Web sources. The idea is to collect a couple of example objects from the user and to use t...
Berthier A. Ribeiro-Neto, Alberto H. F. Laender, A...
We propose a weakly-supervised approach for extracting class attributes from structured text available within Web documents. The overall precision of the extracted attributes is a...
A critical problem in developing information agents for the Web is accessing data that is formatted for human use. We have developed a set of tools for extracting data from web si...
Craig A. Knoblock, Kristina Lerman, Steven Minton,...