Anticipating the availability of large question-answer datasets, we propose a principled, data-driven Instance-Based approach to Question Answering. Most question answering systems ...
Enabling intelligent access to multimedia data requires a powerful description language. In this paper, we demonstrate why the MPEG-7 standard fails to fulfill this task. We i...
Two hundred web tables from ten sites were imported into Excel. The tables were edited as needed, then converted into layout-independent Wang notation using the Table Abstraction Tool (TAT)...
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text typically is not related to the main content, ma...
A web portal providing access to over 250,000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...