The closest substring problem, in which a short string is sought that minimizes the number of mismatches between it and each of a given set of strings, is a minimization problem with a polynomial-time approximation scheme [6]. In this paper, both this problem and its maximization complement, in which the number of matches is maximized instead, are examined, and bounds on their hardness of approximation are proved. Related problems that differ only in their objective functions, seeking to maximize either the number of strings covered by the substring or the length of the substring, are also examined, and bounds on their approximability are proved. For the last of these, length maximization, the approximation bound of 2 is shown to be tight by presenting a 2-approximation algorithm.
Patricia A. Evans, Andrew D. Smith
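To make the minimization objective in the abstract concrete, the following is a minimal brute-force sketch, not an algorithm from the paper. It assumes the usual formulation: mismatches are counted as Hamming distance, each input string contributes its best-aligned window of the target length, and the worst such count over all input strings is minimized. The function names and the explicit `alphabet` parameter are illustrative assumptions, and the search is exponential in the target length.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_window_distance(center, s):
    """Fewest mismatches between `center` and any same-length window of `s`.

    Assumes len(s) >= len(center); otherwise there is no window to compare.
    """
    width = len(center)
    return min(hamming(center, s[i:i + width]) for i in range(len(s) - width + 1))


def closest_substring_bruteforce(strings, length, alphabet):
    """Try every candidate of the given length over `alphabet`.

    Returns (center, radius), where radius is the largest mismatch count
    between the center and the best window of any input string.
    Assumed objective: minimize that radius. Ties keep the first candidate.
    """
    best = None
    for candidate in product(alphabet, repeat=length):
        center = "".join(candidate)
        radius = max(best_window_distance(center, s) for s in strings)
        if best is None or radius < best[1]:
            best = (center, radius)
    return best


if __name__ == "__main__":
    center, radius = closest_substring_bruteforce(["aaa", "bbb"], 2, "ab")
    # Under the same assumption, the maximization complement counts matches,
    # i.e. length minus mismatches, for the same center.
    print(center, radius, 2 - radius)
```

Under this reading, the maximization complement has the same optimal solutions but a different objective value (matches rather than mismatches), which is why the approximability of the two can differ.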