We present the algorithmic core of a full text data base that allows fast Boolean queries, phrase queries, and document reporting using less space than the input text. The system ...
There is a close relationship between formal language theory and data compression. Since 1990's various types of grammar-based text compression algorithms have been introduced...
Given two strings (a text t of length n and a pattern p) and a natural number w, window subsequence problems consist in deciding whether p occurs as a subsequence of t and/or findi...
Sentence compression is the task of producing a summary at the sentence level. This paper focuses on three aspects of this task which have not received detailed treatment in the l...
Variants of Huffman codes where words are taken as the source symbols are currently the most attractive choices to compress natural language text databases. In particular, Tagged...