We describe Occam, a query planning algorithm that determines the best way to integrate data from different sources. As input, Occam takes a library of site descriptions and a user ...
The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...
Criminal identity matching is crucial to crime investigation in law enforcement agencies. Existing techniques match identities that refer to the same individuals based on simple i...
A base problem in Web information extraction is to find appropriate queries for informative nodes in trees. We propose to learn queries for nodes in trees automatically ...
In this article we define a new type of integrity constraint in XML, called an XML inclusion constraint (XIND), and show that it extends the semantics of a relational inclusion de...
Michael Karlinger, Millist W. Vincent, Michael Sch...