We present a new index for approximate string matching. The index collects text q-samples, that is, disjoint text substrings of length q, at fixed intervals and stores their posi...
Gonzalo Navarro, Erkki Sutinen, Jani Tanninen, Jor...
The field of compressed data structures seeks to achieve fast search time, but using a compressed representation, ideally requiring less space than that occupied by the original i...
We address the problems of pattern matching and approximate pattern matching in the sketching model. We show that it is impossible to compress the text into a small sketc...
Ziv Bar-Yossef, T. S. Jayram, Robert Krauthgamer, ...
The Burrows-Wheeler Transform (BWT) is the basis for many of the most effective compression and self-indexing methods used today. A key to the versatility of the BWT is the abili...
Matthias Petri, Gonzalo Navarro, J. Shane Culpeppe...
We present a lossless compression algorithm, GenCompress, for genetic sequences, based on searching for approximate repeats. Our algorithm achieves the best compression ratios for...