Sciweavers

» Segmentation of Compressed Documents
DAS
2010
Springer
14 years 17 days ago
Associating figures with descriptions for patent documents
Patent document images maintained by the U.S. patent database have a specific format, in which figures and text descriptions are separated into different sections. This makes it...
Linlin Li, Chew Lim Tan
ICDAR
2009
IEEE
14 years 2 months ago
Text Lines and Snippets Extraction for 19th Century Handwriting Documents Layout Analysis
In this paper we propose a new approach to improve electronic editions of human science corpora, providing an efficient estimation of manuscript page structure. In any handwriti...
Vincent Malleron, Véronique Eglin, Hubert E...
WWW
2006
ACM
14 years 8 months ago
Compressing and searching XML data via two zips
XML is fast becoming the standard format to store, exchange and publish over the web, and is getting embedded in applications. Two challenges in handling XML are its size (the XML...
Paolo Ferragina, Fabrizio Luccio, Giovanni Manzini...
SIGIR
2009
ACM
14 years 2 months ago
Compression-based document length prior for language models
The inclusion of document length factors has been a major topic in the development of retrieval models. We believe that current models can be further improved by more refined est...
Javier Parapar, David E. Losada, Alvaro Barreiro
ICDAR
1997
IEEE
13 years 12 months ago
Document image similarity and equivalence detection
A hierarchical algorithm is presented for determining the similarity and equivalence of document images. Features extracted from the CCITT fax-compressed representations of two im...
Jonathan J. Hull, John F. Cullen