This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...
We study methods to initialize or bias different clustering methods using prior information about the "importance" of a keyword w.r.t. the whole document collection or a...
We present a novel framework for automated extraction and approximation of numerical object attributes such as height and weight from the Web. Given an object-attribute pair, we d...
Information Extraction methods can be used to automatically "fill-in" database forms from unstructured data such as Web documents or email. State-of-the-art methods have...
Trausti T. Kristjansson, Aron Culotta, Paul A. Vio...
Background: For ecological studies, it is crucial to count on adequate descriptions of the environments and samples being studied. Such a description must be done in terms of thei...