This paper presents a language identification technique that detects Latin-based languages in imaged documents without OCR. The proposed technique detects languages through the wo...
An approach is presented to guide the benchmarking of invoice analysis systems, a specific, applied subclass of document analysis systems. The state of the art of benchma...
This paper introduces a new method for the rapid development of complex rule bases involving cue phrases for the purpose of classifying text segments. The method is based on Ripple...
SpeechSkimmer is an interactive system for quickly browsing and finding information in speech recordings. Skimming speech recordings is much more difficult than visually scanning ...
For compounding languages, a great part of the topical semantics is conveyed via nominal compounds. Various applications of natural language processing can profit from explicit ac...