Abstract. Syllable-based text compression is a new approach to compression by symbols. Here, syllables are used as the compression symbols instead of the more common char...
We present a new statistical compression method, which we call Phrase Based Dense Code (PBDC), aimed at compressing large digital libraries. PBDC compresses the text collection to ...
We present a fast compression and decompression technique for natural language texts. The novelty is that the exact search can be done on the compressed text directly, using any k...
Edleno Silva de Moura, Gonzalo Navarro, Nivio Zivi...
Text compression algorithms are normally defined in terms of a source alphabet of 8-bit ASCII codes. We consider choosing the source alphabet to be one whose symbols are the words of Englis...
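The idea of swapping a character alphabet for a word alphabet can be illustrated with a small sketch. It is not the method of the paper above: the abstract is cut off before it defines its words, so the tokenization rule here (alternating runs of word and non-word characters) is an assumption, as are the helper names `char_symbols`, `word_symbols`, and `zero_order_bits`. The sketch only compares the zero-order entropy of the same text under the two alphabets.

```python
import math
import re
from collections import Counter


def char_symbols(text):
    """Character alphabet: every character is one symbol."""
    return list(text)


def word_symbols(text):
    """Word alphabet (assumed rule): alternate maximal runs of
    word characters and non-word characters, so that joining
    the symbols reproduces the text exactly."""
    return re.findall(r"\w+|\W+", text)


def zero_order_bits(symbols):
    """Total bits an ideal zero-order (context-free) coder would
    need for this symbol sequence, excluding the cost of sending
    the symbol table itself."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c * math.log2(c / total) for c in counts.values())


sample = "the cat sat on the mat and the cat sat on the hat"
char_bits = zero_order_bits(char_symbols(sample))
word_bits = zero_order_bits(word_symbols(sample))
print(f"character alphabet: {char_bits:.1f} bits")
print(f"word alphabet:      {word_bits:.1f} bits")
```

On this repetitive sample the word alphabet needs fewer bits, but the estimate leaves out the cost of transmitting the word list, which matters for short inputs.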
We describe a segmentation method and associated file format for storing images of color documents. We separate each page of the document into three layers, containing the backgro...
Daniel P. Huttenlocher, Pedro F. Felzenszwalb, Wil...