This paper presents an efficient indexing and retrieval scheme for searching in document image databases. In many non-European languages, optical character recognizers are not very...
With the WEBSOM method a textual document collection may be organized onto a graphical map display that provides an overview of the collection and facilitates interactive browsing...
Samuel Kaski, Timo Honkela, Krista Lagus, Teuvo Ko...
This paper presents a segmentation-based handwriting recognizer and the performance that it achieves on the numerical fields extracted from a large single-writer historical collec...
Marius Bulacu, Axel Brink, Tijn van der Zant, Lamb...
In recent years, language resources acquired from the Web have been released, and these data improve the performance of applications in several NLP tasks. Although the language resource...
Large document collections, such as those delivered by Internet search engines, are difficult and time-consuming for users to read and analyse. The detection of common an...