This paper presents a dynamic approach to document page segmentation. Current page segmentation algorithms lack the ability to dynamically adapt to local variations in the size, orie...
As XML has emerged as a data representation format and as great quantities of data have been stored in the XML format, XML document design has become an important and evident issu...
The manipulation of large-scale document data sets often involves the processing of a wealth of features that correspond to the available terms in the document space. The employm...
We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...
Tables are used to present, list, summarize, and structure important data in documents. In scholarly articles, they are often used to present the relationships among data and high...