This paper addresses a content management problem in situations where we have a collection of spoken documents in audio stream format in one language and a collection of related t...
The paper presents a clutter detection and removal algorithm for complex document images. The distance-transform-based approach is independent of the clutter's position, size, sh...
Parallel text is one of the most valuable resources for development of statistical machine translation systems and other NLP applications. The Linguistic Data Consortium (LDC) has...
Recognition and retrieval of historical handwritten material is an unsolved problem. We propose a novel approach to recognizing and retrieving handwritten manuscripts, based upon ...
Handwriting is an alternative method for entering the text of Short Message Service (SMS) messages. However, the texts produced are characterized by a whole new language. They include for ins...
Emmanuel Prochasson, Christian Viard-Gaudin, Emman...