Automated extraction of structured data from Web sources often leads to large heterogeneous knowledge bases (KBs), with data and schema items numbering in the hundreds of thousands...
Systems designed for efficient retrieval of conventional data can be very inefficient at retrieving documents. Documents have more complex structure than conventional data, and th...
Given a set 𝒟 = {d_1, d_2, ..., d_D} of D strings of total length n, our task is to report the "most relevant" strings for a given query pattern P. This involves somewhat mo...
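
To make the problem statement concrete, the following is a minimal brute-force sketch, not the method of the work summarized above. It assumes that "most relevant" means "contains the most (possibly overlapping) occurrences of P", which is only one possible relevance measure; the helper names `count_occurrences` and `top_k_strings` and the parameter `k` are hypothetical and introduced purely for illustration.

```python
import heapq


def count_occurrences(text: str, pattern: str) -> int:
    """Count possibly overlapping occurrences of pattern in text."""
    if not pattern:
        return 0
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Advance by one character so overlapping matches are counted.
        start = text.find(pattern, start + 1)
    return count


def top_k_strings(strings, pattern, k):
    """Return up to k (index, score) pairs, highest score first.

    Assumption: relevance is the occurrence count of the pattern.
    Strings that do not contain the pattern are never reported.
    Ties are broken in favour of the lower index.
    """
    matches = []
    for i, d in enumerate(strings):
        score = count_occurrences(d, pattern)
        if score > 0:
            matches.append((score, i))
    best = heapq.nsmallest(k, matches, key=lambda t: (-t[0], t[1]))
    return [(i, score) for score, i in best]


if __name__ == "__main__":
    docs = ["banana", "bandana", "cabana", "apple"]
    print(top_k_strings(docs, "ana", 2))
```

This baseline scans every string for each query, so its cost grows with the total length n; it serves only as a reference point for what an indexed solution has to compute.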
We describe the integration of a structured-text retrieval system (TextMachine) into an object-oriented database system (OpenODB). We use the external function capability of the da...
Since the WWW encourages hypertext and hypermedia document authoring (e.g., HTML or XML), Web authors tend to create documents that are composed of multiple pages connected with hyperl...