In this paper, we propose a novel method for extracting handwritten characters from multi-language document images, which may contain various types of characters, e.g. Chinese, ...
Yonghong Song, Guilin Xiao, Yuanlin Zhang, Lei Yan...
Repetition of layout structure is prevalent in document images. In document design, such repetition conveys the underlying logical and functional structure of the data. For exampl...
Document-centric XML is a mixture of text and structure. With the increased availability of document-centric XML content comes a need for query facilities in which both structural...
Jaap Kamps, Maarten Marx, Maarten de Rijke, Bö...
Images are increasingly being embedded in HTML documents on the WWW. Such documents essentially provide a rich source of images that users can query....
Given the retrieval problem posed by the overwhelming number of documents online, the adaptation of text categorization to web units has recently been pushed. The aim is to utiliz...