An ad hoc data format is any non-standard, semi-structured data format for which robust data processing tools are not available. In this paper, we present ANNE, a new kind of mark...
This paper describes a morphological tagging approach to the analysis of invoice document images. Tokens close by their morphology and confirmed in their location within different ...
One major goal of text mining is to provide automatic methods to help humans grasp the key ideas in ever-increasing text corpora. To this effect, we propose a statistica...
We describe the design and evaluation of CamWorks, a system that employs a video camera as a means of supporting capture from paper sources during reading and writing. The user ca...
William M. Newman, Christopher R. Dance, Alex S. T...
This paper describes the development of a structured document collection containing user-generated text and numerical metadata for exploring the exploitation of metadata in inform...
Walid Magdy, Jinming Min, Johannes Leveling, Garet...