Sciweavers

CORR
2011
Springer

Pattern matching in Lempel-Ziv compressed strings: fast, simple, and deterministic

Countless variants of the Lempel-Ziv compression are widely used in many real-life applications. This paper is concerned with a natural modification of the classical pattern matching problem inspired by the popularity of such compression methods: given an uncompressed pattern s[1..m] and a Lempel-Ziv representation of a string t[1..N], does s occur in t? Farach and Thorup [6] gave a randomized O(n log^2(N/n) + m) time solution for this problem, where n is the size of the compressed representation of t. Building on the methods of [4] and [7], we improve their result by developing a faster and fully deterministic O(n log(N/n) + m) time algorithm. Note that for highly compressible texts, log(N/n) might be of order n, so for such inputs the improvement is very significant. A (tiny) fragment of our method can be used to give an asymptotically optimal solution for the substring hashing problem considered by Farach and Muthukrishnan [5]. Keywords: pattern matching, compression, Lempel-Ziv...
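To make the problem statement concrete, here is a minimal sketch of the naive baseline that compressed pattern matching aims to beat: decompress t in full, then search for s. It is not the paper's algorithm. The phrase encoding (single literal characters and (start, length) copies of earlier text) and the names decompress and occurs_naive are assumptions made for illustration; the abstract does not fix a concrete format.

```python
from typing import List, Tuple, Union

# Hypothetical phrase encoding (an assumption, not taken from the paper):
# a phrase is either a single literal character, or a pair (start, length)
# that copies `length` characters beginning at 0-based position `start`
# of the text decoded so far. A copy may overlap the text it produces.
Phrase = Union[str, Tuple[int, int]]


def decompress(phrases: List[Phrase]) -> str:
    """Expand the phrase list into the full text t (length N)."""
    out: List[str] = []
    for p in phrases:
        if isinstance(p, str):
            out.append(p)  # literal character
        else:
            start, length = p
            # Copy one character at a time so overlapping copies work.
            for i in range(length):
                out.append(out[start + i])
    return "".join(out)


def occurs_naive(s: str, phrases: List[Phrase]) -> bool:
    """Answer "does s occur in t?" by decompressing first.

    This touches all N characters of t, which is exactly the cost that
    algorithms working on the n phrases directly try to avoid.
    """
    return s in decompress(phrases)


example = ["a", "b", (0, 4), "c"]  # 4 phrases describing a 7-character text
```

With this encoding, example expands to "abababc", so occurs_naive("babc", example) is true and occurs_naive("bb", example) is false. The baseline's running time grows with N, whereas the bounds quoted in the abstract depend on n, log(N/n), and m.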
Pawel Gawrychowski
Added 13 May 2011
Updated 13 May 2011
Type Journal
Year 2011
Where CORR
Authors Pawel Gawrychowski