Microformats and semantic XHTML add semantics to web pages while taking advantage of the existing (X)HTML infrastructure. This approach enables new applications that can be deploy...
This paper presents a language identification technique that detects Latin-based languages of imaged documents without OCR. The proposed technique detects languages through the wo...
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
There is an approach of annotation extraction from printed documents in which annotations are extracted by comparing the image of an annotated document and its original document i...
Automatic generation of formal specifications from requirement reduces cost and complexity of formal models creation. Thus, the generated formal model brings the possibility to ca...