Sciweavers

DIAL
2004
IEEE
Line Separation for Complex Document Images Using Fuzzy Runlength
A new text line location and separation algorithm for complex handwritten documents is proposed. The algorithm is based on the application of a fuzzy directional runlength. The pr...
Zhixin Shi, Venu Govindaraju
Document Style Census for OCR
Four methods of converting paper documents to computer-readable form are compared with regard to hypothetical labor cost: keyboarding, omnifont OCR, style-specific OCR, and style-c...
George Nagy, Prateek Sarkar
A General System for the Retrieval of Document Images from Digital Libraries
Large collections of scanned documents (books and journals) are now available in Digital Libraries. The most common method for retrieving relevant information from these collectio...
Simone Marinai, Emanuele Marino, Francesca Cesarin...
A Dynamic Feature Generation System for Automated Metadata Extraction in Preservation of Digital Materials
Obsolescence in storage media and the hardware and software for access and use can render old electronic files inaccessible and unusable. Therefore, the long-term preservation of ...
Song Mao, Jongwoo Kim, George R. Thoma
Retrieving Imaged Documents in Digital Libraries Based on Word Image Coding
A great number of documents are scanned and archived as digital images in digital libraries to make them available and accessible on the Internet. Information retriev...
Yue Lu, Li Zhang, Chew Lim Tan
Holistic Word Recognition for Handwritten Historical Documents
Most offline handwriting recognition approaches proceed by segmenting words into smaller pieces (usually characters) which are recognized separately. The recognition result of a w...
Victor Lavrenko, Toni M. Rath, R. Manmatha
Text Alignment with Handwritten Documents
Today's digital libraries increasingly include not only printed text but also scanned handwritten pages and other multimedia material. There are, however, few tools available...
E. Micah Kornfield, R. Manmatha, James Allan
Xed: A New Tool for eXtracting Hidden Structures from Electronic Documents
PDF has become a very common format for exchanging printable documents. Further, it can be easily generated from the major document formats, which makes a huge number of PDF documents...
Karim Hadjar, Maurizio Rigamonti, Denis Lalanne, R...
Citation Recognition for Scientific Publications in Digital Libraries
In this paper, a method based on part-of-speech tagging (PoS) is used for bibliographic reference structure. This method operates on a roughly structured ASCII file, produced by O...
Dominique Besagni, Abdel Belaïd