A large number of handwritten documents exist in image form, as scanned documents. Electronic media allow better preservation, but accessing the content requires recognition technology that converts the image to searchable text. In the case of cursive handwriting, even the best available technology introduces recognition errors that may degrade the performance of a document retrieval system. We propose a recognition-free approach with two main components: a shape matching algorithm that works on the ink, and a string matching algorithm that works on the ink interpretation of a reference set. Experiments on a data set of 16,500 cursive words produced by hundreds of writers show promising results and suggest that the proposed method can be a viable tool for building inexpensive retrieval systems for cursive documents.
Antonio Clavelli, Luigi P. Cordella, Claudio De St