Sciweavers

587 search results - page 88 / 118
» Categorisation of web documents using extraction ontologies
EMNLP
2004
15 years 4 months ago
Instance-Based Question Answering: A Data-Driven Approach
Anticipating the availability of large question-answer datasets, we propose a principled, data-driven Instance-Based approach to Question Answering. Most question answering systems ...
Lucian Vlad Lita, Jaime G. Carbonell
DOCENG
2004
ACM
15 years 9 months ago
A reduced yet extensible audio-visual description language
Enabling intelligent access to multimedia data requires a powerful description language. In this paper, we demonstrate why the MPEG-7 standard fails to fulfill this task. We i...
Raphaël Troncy, Jean Carrive
GREC
2009
Springer
15 years 6 months ago
Interactive Conversion of Web Tables
Two hundred web tables from ten sites were imported into Excel. The tables were edited as needed, then converted into layout-independent Wang notation using the Table Abstraction Tool (TAT)...
Raghav Krishna Padmanabhan, Ramana Chakradhar Jand...
WSDM
2010
ACM
16 years 27 days ago
Boilerplate Detection using Shallow Text Features
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text typically is not related to the main content, ma...
Christian Kohlschütter, Peter Fankhauser, Wol...
AND
2009
15 years 1 month ago
Digital weight watching: reconstruction of scanned documents
A web portal providing access to over 250,000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...
Tim Gielissen, Maarten Marx