Sciweavers

587 search results - page 88 / 118
» Categorisation of web documents using extraction ontologies
EMNLP
2004
13 years 9 months ago
Instance-Based Question Answering: A Data-Driven Approach
Anticipating the availability of large question-answer datasets, we propose a principled, data-driven Instance-Based approach to Question Answering. Most question answering systems ...
Lucian Vlad Lita, Jaime G. Carbonell
DOCENG
2004
ACM
14 years 1 month ago
A reduced yet extensible audio-visual description language
Enabling an intelligent access to multimedia data requires a powerful description language. In this paper, we demonstrate why the MPEG-7 standard fails to fulfill this task. We i...
Raphaël Troncy, Jean Carrive
GREC
2009
Springer
13 years 11 months ago
Interactive Conversion of Web Tables
Two hundred web tables from ten sites were imported into Excel. The tables were edited as needed, then converted into layout independent Wang using the Table Abstraction Tool (TAT)...
Raghav Krishna Padmanabhan, Ramana Chakradhar Jand...
WSDM
2010
ACM
14 years 5 months ago
Boilerplate Detection using Shallow Text Features
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text typically is not related to the main content, ma...
Christian Kohlschütter, Peter Fankhauser, Wol...
AND
2009
13 years 5 months ago
Digital weight watching: reconstruction of scanned documents
A web portal providing access to over 250,000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...
Tim Gielissen, Maarten Marx