Sciweavers

CPM
2005
Springer

Succinct Suffix Arrays Based on Run-Length Encoding

A succinct full-text self-index is a data structure built on a text T = t1 t2 ... tn, which takes little space (ideally close to that of the compressed text), permits efficient search for the occurrences of a pattern P = p1 p2 ... pm in T, and is able to reproduce any text substring, so the self-index replaces the text. Several remarkable self-indexes have been developed in recent years. Many of those take space proportional to nH0 or nHk bits, where Hk is the kth-order empirical entropy of T. The time to count how many times P occurs in T ranges from O(m) to O(m log n). In this paper we present a new self-index, called RLFM index for “run-length FM-index”, that counts the occurrences of P in T in O(m) time when the alphabet size is σ = O(polylog(n)). The RLFM index requires nHk log σ + O(n) bits of space, for any
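
To make the counting query concrete, here is a minimal Python sketch of FM-index-style backward search over the Burrows-Wheeler transform (BWT) of T, along with the run-length view of the BWT that gives the RLFM index its name. It is a toy illustration under stated assumptions, not the paper's data structure: the suffix array is built naively, the occurrence counts are computed by linear scans rather than succinct rank structures, so counting is nowhere near O(m), and all function names and the sentinel character are hypothetical choices made for this example.

```python
def suffix_array(text):
    # Naive construction by sorting suffixes; acceptable only for toy inputs.
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text):
    # Assumes text ends with a unique sentinel smaller than every other symbol.
    # For suffix position 0, text[-1] is the sentinel, which is what we want.
    return "".join(text[i - 1] for i in suffix_array(text))


def run_length_encode(s):
    # Collapse s into (symbol, run length) pairs. The BWT of repetitive text
    # tends to have few runs, which is what a run-length index exploits.
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1][1] += 1
        else:
            runs.append([ch, 1])
    return [(c, k) for c, k in runs]


def count_occurrences(text, pattern, sentinel="\0"):
    # Assumes the sentinel does not occur in text and pattern is non-empty.
    t = text + sentinel
    last = bwt(t)

    # first_pos[c] = number of symbols in t strictly smaller than c.
    freq = {}
    for ch in t:
        freq[ch] = freq.get(ch, 0) + 1
    first_pos, total = {}, 0
    for ch in sorted(freq):
        first_pos[ch] = total
        total += freq[ch]

    def occ(c, i):
        # Occurrences of c in last[:i]; a linear scan stands in for the
        # constant-time rank queries a real index would provide.
        return last[:i].count(c)

    # Backward search: process the pattern right to left, narrowing the
    # half-open suffix array interval [sp, ep) of suffixes it prefixes.
    sp, ep = 0, len(last)
    for c in reversed(pattern):
        if c not in first_pos:
            return 0
        sp = first_pos[c] + occ(c, sp)
        ep = first_pos[c] + occ(c, ep)
        if sp >= ep:
            return 0
    return ep - sp
```

For instance, `count_occurrences("mississippi", "ssi")` counts both occurrences of "ssi", and `run_length_encode(bwt("mississippi$"))` shows the 12-symbol BWT collapsing into 9 runs. The loop performs one step per pattern symbol, so the overall O(m) bound depends entirely on answering each `occ` query in constant time.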
Veli Mäkinen, Gonzalo Navarro
Added 26 Jun 2010
Updated 26 Jun 2010
Type Conference
Year 2005
Where CPM
Authors Veli Mäkinen, Gonzalo Navarro