Abstract. Effective indexing is crucial for providing convenient access to scanned versions of large collections of handwritten historical manuscripts. Because traditional handwriting recognizers based on Optical Character Recognition (OCR) perform poorly on historical documents, holistic word recognition has recently gained popularity as an attractive and more straightforward alternative [1]. Such techniques attempt to recognize words based on scalar and profile-based features extracted from whole word images. In this paper, we propose a new approach to holistic word recognition for historical handwritten manuscripts that matches word contours rather than whole images or word profiles. The method consists of robust extraction of closed word contours followed by the application of an elastic contour matching technique originally proposed for general shapes [2]. We demonstrate that contour-based descriptors can effectively capture intrinsic word features. Our experiments ...
Tomasz Adamek, Noel E. O'Connor, Alan F. Smeaton