In today's world, form processing systems must be able to recognize mutant forms that appear to be based on differing templates but are actually only a variation of the origi...
This paper presents a transaction-time HTTP server, called ? Apache that supports document versioning. A document often consists of a main file formatted in HTML or XML and severa...
Document clustering has been used for better document retrieval, document browsing, and text mining. In this paper, we investigate whether the biomedical ontology MeSH improves the cluster...
Historical sound documents are of high importance for our cultural heritage. The sound of phonographic records is usually extracted by a stylus following the groove, but many old r...
The problem of jointly modeling the text and image components of multimedia documents is studied. The text component is represented as a sample from a hidden topic model, learned wi...
Nikhil Rasiwasia, Jose Costa Pereira, Emanuele Cov...