Automated extraction of structured data from Web sources often leads to large heterogeneous knowledge bases (KB), with data and schema items numbering in the hundreds of thousands...
abstraction for modeling these problems is to view the Web as a collection of (usually small and heterogeneous) databases, and to view programs that extract and process Web data au...
In this paper we present the design, implementation and evaluation of SOBA, a system for ontology-based information extraction from heterogeneous data resources, including plain t...
Paul Buitelaar, Philipp Cimiano, Anette Frank, Mat...
XML is a widespread W3C standard used by several kinds of applications for data representation and exchange over the web. In the context of a system that provides semantic integrat...
Sandro Daniel Camillo, Carlos A. Heuser, Ronaldo d...
The textual content of the Web enriched with the hyperlink structure surrounding it can be a useful source of information for querying and searching. This paper presents a search ...