Clio, the IBM Research system for expressing declarative schema mappings, has progressed in the past few years from a research prototype into a technology that is behind some of I...
An ad hoc data format is any non-standard, semi-structured data format for which robust data processing tools are not available. In this paper, we present ANNE, a new kind of mark...
This paper presents a new approach to text processing, based on textemes. These are atomic text units generalising the concepts of character and glyph by merging them in a common ...
Despite the considerable success of Inductive Logic Programming, deployed ILP systems still have efficiency problems when applied to complex problems. Several techniques ...
Rui Camacho, Nuno A. Fonseca, Ricardo Rocha, V&iac...
Due to the growing importance of the World Wide Web, archiving it has become crucial for preserving a useful source of information. To keep a web archive up to date, crawlers ha...