The field of compressed data structures seeks to achieve fast search times while using a compressed representation, ideally one requiring less space than that occupied by the original i...
Structured documents, especially XML documents, are made up of a few logical components, such as titles, sections, subsections, and paragraphs. The components in each structured...
Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fit...
In this paper, we describe the design, architecture, and lessons learned from the implementation of FREE, a fast regular expression indexing engine. FREE uses a prebuilt index ...
Electronic mail poses a number of unusual challenges for the design of information retrieval systems and test collections, including informal expression, conversational structure,...