Conventional optical character recognition (OCR) systems operate on individual characters and words, and do not normally exploit document or collection context. We describe a Coll...
K. Pramod Sankar, C. V. Jawahar, Raghavan Manmatha
Word segmentation is a crucial step for segmentation-free document analysis systems and is used for creating an index based on word matching. In this paper, we propose a novel met...
This work addresses the problem of acquiring, indexing and retrieving slides in the context of automatic oral presentation processing. Since the most suitable acquisition techniqu...
N. Daddaoua, Jean-Marc Odobez, Alessandro Vinciare...
While systems supporting communities of practice in work organizations have been shown to be desirable many, if not all, are decoupled from daily work practices and tools. This hi...
We present a system that classifies pixels in a document image according to marking type such as machine print, handwriting, and noise. A segmenter module first splits an input ...