A new handwritten text database, GERMANA, is presented to facilitate empirical comparison of different approaches to text line extraction and off-line handwriting recognition. G...
Keyphrases are short phrases that reflect the main topic of a document. Because manually annotating documents with keyphrases is a time-consuming process, several automatic appro...
Katja Hofmann, Manos Tsagkias, Edgar Meij, Maarten...
Abstract. We present a dialogue system that enables the access in natural language to a web information retrieval system. We use a Web Semantic Language to model the knowledge conv...
We present a semi-Markov model for recognizing scene text that integrates character and word segmentation with recognition. Using wavelet features, it requires only approximate lo...
Allen R. Hanson, Erik G. Learned-Miller, Jerod J. ...
Traditionally, research in identifying structured entities in documents has proceeded independently of document categorization research. In this paper, we observe that these two t...