In this paper, we describe how meta-data of indexation can be extracted from historical document images using an interactive process with a software called AGORA. The algorithms i...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Abstract. Training and evaluation of techniques for handwriting recognition and retrieval is a challenge given that it is difficult to create large ground-truthed datasets. This is...
In order to preserve our cultural heritage and for automated document processing libraries and national archives have started digitizing historical documents. In the case of degra...
Florian Kleber, Robert Sablatnig, Melanie Gau, Hei...
Portable Document Format (PDF) is a page-oriented, graphically rich format based on PostScript semantics and it is also the format interpreted by the Adobe Acrobat viewers. Althou...
Steven R. Bagley, David F. Brailsford, Matthew R. ...