Sciweavers

Document Analysis
ICDAR
2005
IEEE
14 years 3 months ago
Towards a Canonical and Structured Representation of PDF Documents through Reverse Engineering
This article presents Xed, a reverse engineering tool for PDF documents, which extracts the original document layout structure. Xed mixes electronic extraction methods with state-...
Maurizio Rigamonti, Jean-Luc Bloechle, Karim Hadja...
HT
2003
ACM
14 years 3 months ago
Untangling compound documents on the web
Most text analysis is designed to deal with the concept of a “document”, namely a cohesive presentation of thought on a unifying subject. By contrast, individual nodes on the ...
Nadav Eiron, Kevin S. McCurley
DAS
2004
Springer
14 years 3 months ago
An Integrated Approach for Automatic Semantic Structure Extraction in Document Images
In this paper we present an integrated approach for semantic structure extraction in document images. Document images are initially processed to extract both their layout and logic...
Margherita Berardi, Michele Lapi, Donato Malerba
ICDAR
1997
IEEE
14 years 2 months ago
Enhancing Degraded Document Images via Bitmap Clustering and Averaging
Proper display and accurate recognition of document images are often hampered by degradations caused by poor scanning or transmission conditions. We propose a method to enhance su...
John D. Hobby, Tin Kam Ho
AACC
2004
Springer
14 years 3 months ago
Using Document Dimensions for Enhanced Information Retrieval
Conventional document search techniques are constrained by attempting to match individual keywords or phrases to source documents. Thus, these techniques miss out documents that co...
Thimal Jayasooriya, Suresh Manandhar