A new text line location and separation algorithm for complex handwritten documents is proposed. The algorithm is based on the application of a fuzzy directional runlength. The pr...
Four methods of converting paper documents to computer-readable form are compared with regard to hypothetical labor cost: keyboarding, omnifont OCR, stylespecific OCR, and style-c...
Large collections of scanned documents (books and journals) are now available in Digital Libraries. The most common method for retrieving relevant information from these collectio...
Obsolescence in storage media and the hardware and software for access and use can render old electronic files inaccessible and unusable. Therefore, the long-term preservation of ...
A great number of documents are scanned and archived in the form of digital images in digital libraries, to make them available and accessible in the Internet. Information retriev...
Most offline handwriting recognition approaches proceed by segmenting words into smaller pieces (usually characters) which are recognized separately. The recognition result of a w...
Today's digital libraries increasingly include not only printed text but also scanned handwritten pages and other multimedia material. There are, however, few tools available...
PDF became a very common format for exchanging printable documents. Further, it can be easily generated from the major documents formats, which make a huge number of PDF documents...
In this paper, a method based on part-of-speech tagging (PoS) is used for bibliographic reference structure. This method operates on a roughly structured ASCII file, produced by O...