We define a family of embedded domain specific languages for generating HTML and XML documents. Each language is implemented as a combinator library in Haskell. The generated HTML...
: We propose a method for text retrieval from document images without the use of OCR. Documents are segmented into character objects. Image features, namely the Vertical Traverse D...
In this paper, we propose an accurate and suitable designed system for complex documents segmentation. This system is based on steerable pyramid transform. The features extracted ...
The digital world enables the creation of personalized documents. In this paper we are interested in describing a computer mediated activity by a person throughout a semi-automati...
We propose a novel and efficient solution to the problem of clustering XML documents based on their structure. We use operations on multisets of paths of document trees to define...