Structured documents, especially XML documents, are made up of a few logical components, such as a title, sections, subsections and paragraphs. The components in each structured...
In this paper, we propose a novel method for extracting handwritten characters from multi-language document images, which may contain various types of characters, e.g. Chinese, ...
Yonghong Song, Guilin Xiao, Yuanlin Zhang, Lei Yan...
Document representations can rapidly become unwieldy if they try to encapsulate all possible document properties, ranging from abstract structure to detailed rendering and layout. We pres...
This paper presents a new technique to improve the combination of classification decisions obtained from local analysis of patterns. Specifically, a genetic algorithm is used to d...
Giovanni Dimauro, Sebastiano Impedovo, Raffaele Mo...
In document image understanding, public datasets with ground-truth are an important part of scientific work. They are not only helpful for developing new methods, but also provid...
Thomas Strecker, Joost van Beusekom, Sahin Albayra...