Sciweavers

CPM
1998
Springer

A Fast Bit-Vector Algorithm for Approximate String Matching Based on Dynamic Programming

The approximate string matching problem is to find all locations at which a query of length m matches a substring of a text of length n with k-or-fewer differences. Simple and practical bit-vector algorithms have been designed for this problem, most notably the one used in agrep. These algorithms compute a bit representation of the current state-set of the k-difference automaton for the query, and asymptotically run in either O(nmk/w) or O(nm log σ/w) time, where w is the word size of the machine (e.g., 32 or 64 in practice) and σ is the size of the pattern alphabet. Here we present an algorithm of comparable simplicity that requires only O(nm/w) time by virtue of computing a bit representation of the relocatable dynamic programming matrix for the problem. Thus, the algorithm's performance is independent of k, and it is found to be more efficient than the previous results for many choices of k and small m. Moreover, because the algorithm is not dependent on k, it can be used to rapid...
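To make the idea of a bit representation of the dynamic programming matrix concrete, here is a minimal Python sketch for the single-word case (m no larger than the word size). It is not taken from the abstract: the update rules follow the formulation commonly given in later descriptions of this algorithm, and the names `myers_search` and `dp_search` are illustrative assumptions. Each text character is processed with a constant number of word operations on bit vectors that encode the +1 and -1 differences between adjacent cells of a matrix column. Because Python integers are unbounded, an explicit mask emulates an m-bit machine word. The plain column-by-column dynamic program is included only as a cross-check.

```python
def myers_search(pattern, text, k):
    """Return (end_index, distance) for every text position where the
    pattern matches a substring ending there with at most k differences.
    Sketch of the bit-vector scheme for a pattern that fits in one word."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    mask = (1 << m) - 1          # emulates an m-bit machine word
    high = 1 << (m - 1)          # bit for the last pattern row
    peq = {}                     # peq[c]: bit i set iff pattern[i] == c
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)

    pv, mv = mask, 0             # vertical deltas: +1 everywhere in column 0
    score = m                    # value in the bottom cell of column 0
    ends = []
    for j, c in enumerate(text):
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask    # rows whose horizontal delta is +1
        mh = pv & xh                     # rows whose horizontal delta is -1
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        ph = (ph << 1) & mask    # shift in 0: the top row is all zeros when searching
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        if score <= k:
            ends.append((j, score))
    return ends


def dp_search(pattern, text, k):
    """Reference O(nm) dynamic program, used only to cross-check the sketch."""
    m = len(pattern)
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text):
        diag = col[0]            # top row stays 0: a match may start anywhere
        for i in range(1, m + 1):
            old = col[i]
            col[i] = min(diag + (pattern[i - 1] != c), old + 1, col[i - 1] + 1)
            diag = old
        if col[m] <= k:
            ends.append((j, col[m]))
    return ends
```

For example, `myers_search("abc", "abc", 1)` reports matches ending at index 1 with one difference and at index 2 with none. The loop body never refers to k, which is why the running time of this scheme is independent of the difference threshold. Patterns longer than one word would need the vectors split into blocks with carries passed between them, which this sketch does not attempt.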
Gene Myers
Added 05 Aug 2010
Updated 05 Aug 2010
Type Conference
Year 1998
Where CPM
Authors Gene Myers