This paper presents a system that uses the domain name of a German business website to locate its information pages (e.g. company profile, contact page, imprint) and then identifi...
This paper presents MULTITALE, a system for the semantic tagging of medical neurosurgical texts and for the semi-automatic expansion of the medical lexicon. Given the...
Passage retrieval consists in identifying short but informative runs of a long text, given a specific user query. We discuss the sources of evidence that help choose likely high-...
Most Information Retrieval (IR) systems are still based on the bag-of-words paradigm. This is a strong limitation if one needs high-precision answers. For example, in restricted doma...
This paper describes a new approach towards detecting plagiarism and scientific documents that have been read but not cited. In contrast to existing approaches, which analyze docu...