Preprocessing in handwritten text OCR involves line, word and character segmentation. This paper deals with text line identification of handwritten Indian scripts, especially of B...
We propose a fully automatic method for summarizing and indexing unstructured presentation videos based on text extracted from the projected slides. We use changes of text in the ...
Written documents created through dictation differ significantly from a true verbatim transcript of the recorded speech. This poses an obstacle in automatic dictation systems as s...
Maximilian Bisani, Paul Vozila, Olivier Divay, Jef...
Automatic extraction of semantic relationships between entity instances in an ontology is useful for attaching richer semantic metadata to documents. In this paper we pro...
In the China-US Million Book Digital Library, the output of the digitization process is more than one terabyte of text in OEB and PDF formats. To access these data quickly and accurately,...