Image anchor templates are used in document image analysis for document classification, data localization, and other tasks. Current tools allow human operators to mark out small s...
Most classification algorithms are best at categorizing the Web documents into a few categories, such as the top two levels in the Open Directory Project. Such a classification me...
In order to reduce human efforts, there has been increasing interest in applying active learning for training text classifiers. This paper describes a straightforward active learni...
Zhao Xu, Kai Yu, Volker Tresp, Xiaowei Xu, Jizhi W...
Text documents can be watermarked by patterning the inter-word spaces. This paper proposes a text watermarking algorithm that exploits the novel concepts of word classification an...
A segmentation algorithm, which can detect different regions of a handwritten document such as text lines, tables and sketches will be extremely useful in a variety of application...