Sciweavers

203 search results - page 35 / 41
AND
2009
13 years 5 months ago
Digital weight watching: reconstruction of scanned documents
A web portal providing access to over 250,000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...
Tim Gielissen, Maarten Marx
SIGIR
2009
ACM
14 years 2 months ago
Web derived pronunciations for spoken term detection
Indexing and retrieval of speech content in various forms such as broadcast news, customer care data and on-line media has gained a lot of interest for a wide range of application...
Dogan Can, Erica Cooper, Arnab Ghoshal, Martin Jan...
DOCENG
2009
ACM
14 years 2 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan
CHI
2006
ACM
14 years 8 months ago
Marmite: end-user programming for the web
A tremendous amount of semi-structured data is available today on the web but is not necessarily in a form which is suitable for a user's tasks. For example, a website may sh...
Jason I. Hong, Jeffrey Wong
CIKM
2008
Springer
13 years 9 months ago
Dr. Searcher and Mr. Browser: a unified hyperlink-click graph
We introduce a unified graph representation of the Web, which includes both structural and usage information. We model this graph using a simple union of the Web's hyperlink ...
Barbara Poblete, Carlos Castillo, Aristides Gionis