We present a probabilistic model for a document corpus that combines many of the desirable features of previous models. The model is called “GaP” for Gamma-Poisson, the distri...
The design of efficient textual similarities is an important issue in the domain of textual data exploration. Textual similarities are, for example, central in document collection s...
Intelligent access to information requires semantic integration of structured databases with unstructured textual resources. While the semantic integration problem has been widely...
The World Wide Web is growing at such a pace that even the biggest centralized search engines are able to index only a small part of the available documents on the Internet. The d...
Using Dirac notation as a powerful tool, we investigate the three classical Information Retrieval (IR) models and some of their extensions. We show that almost all such models can be...