Abstract. The increasing flow of digital information requires the extraction, filtering and classification of pertinent information from large volumes of texts. An important pre...
An algorithm is presented that automatically matches images of presentation slides to the symbolic source file (e.g., PowerPointTM or AcrobatTM ) from which they were generated. T...
This paper presents an algorithm to generate possible variants for biomedical terms. The algorithm gives each variant its generation probability representing its plausibility, whi...
An overwhelming number of legal documents is available in digital form. However, most of the texts are usually only provided in a semi-structured form, i.e. the documents are stru...
We consider a system in which many users run queries to examine subsets of a large object set. The object set is partitioned into files on tape. A single subset of objects will b...