The problem of jointly modeling the text and image components of multimedia documents is studied. The text component is represented as a sample from a hidden topic model, learned wi...
Nikhil Rasiwasia, Jose Costa Pereira, Emanuele Cov...
In document image understanding, public datasets with ground-truth are an important part of scientific work. They are not only helpful for developing new methods, but also provid...
Thomas Strecker, Joost van Beusekom, Sahin Albayra...
A large number of handwritten documents exist in image form, as scanned documents. The supporting electronic media allow for better preservation, but to access their content they...
Antonio Clavelli, Luigi P. Cordella, Claudio De St...
We describe our work on text-image alignment in the context of building a historical document retrieval system. We aim at aligning images of words in handwritten lines with their text...
This paper describes a complete system for reading typewritten lexicon words in noisy images - in this case museum index cards. The system is conceptually simple, and straightforw...