A new technique to locate content-representing words for a given document image using a representation of character shapes is described. A character shape code representation define...
This paper presents a framework to restore the 2D content printed on documents in the presence of geometric distortion and nonuniform illumination. Compared with text-based docu...
Michael S. Brown, Mingxuan Sun, Ruigang Yang, Lin ...
We report an improved methodology for training classifiers for document image content extraction, that is, the location and segmentation of regions containing handwriting, machine...
This paper proposes a method by which 5W1H (who, when, where, what, why, how, and predicate) information is used to classify and navigate Japanese-language texts. 5W1H information,...
Automatic extraction of semantic relationships between entity instances in an ontology is useful for attaching richer semantic metadata to documents. In this paper we pro...