Today, the Internet can be seen as a global marketplace populated by a huge number of providers and consumers that exchange data from a wide range of domains. A combination of dat...
A rich family of generic Information Extraction (IE) techniques has been developed by researchers in recent years. This paper proposes WebKER, a system for automatically extract...
Conventional discussion environments provide the technical platform for distributed discussion and collaboration, but apart from some statistical data collected, rarely provide i...
Databases, particularly when storing heterogeneous, sparse semistructured data, tend to provide incomplete information and information that is difficult to categorize. T...
Simone Diniz Junqueira Barbosa, Karin Koogan Breit...
The nature of semistructured data in web collections is evolving. Increasingly, XML web documents (or documents exchanged via web services) are valid with regard to a schema, yet ...
Mariano P. Consens, Flavio Rizzolo, Alejandro A. V...