Abstract. Long DNA sequences must be cut by restriction enzymes into small fragments whose lengths and/or nucleotide sequences can be analyzed by currently available technology. Cutting multiple copies of the same long DNA sequence with different restriction enzymes yields many overlapping fragments, and the overlaps allow the fragments to be assembled in the order in which they appear on the original DNA sequence. This basic idea admits several NP-complete abstractions of the genome map assembly problem. However, it is not obvious which variation is computationally the best in practice. By extensive computer experiments, we show that in the average case the running time of a constraint automata solution of the big-bag matching abstraction increases linearly, while the running time of a greedy search solution of the shortest common superstring in an overlap multigraph abstraction increases exponentially with the size of real genome input data. Hence the first abstraction is much more efficient computationally for ve...
Viswanathan Ramanathan, Peter Z. Revesz