We present a document understanding system in which the arrangement of lines of text and block separators within a document is modeled by stochastic context-free grammars. A gram...
John C. Handley, Anoop M. Namboodiri, Richard Zani...
In this paper, we propose an accurate and suitably designed system for complex document segmentation. This system is based on the steerable pyramid transform. The features extracted ...
In many contexts today, documents are available in a number of versions. In addition to explicit knowledge that can be queried/searched in documents, these documents also...
Automatic document classification is an important step in organizing and mining documents. Information in documents is often conveyed using both text and images that complement ea...
Most text categorization methods require the text content of documents, which is often difficult to obtain. We consider "Collaborative Text Categorization", where each document...