Two different approaches to determining the sequence of the human genome are currently being pursued: the “clone-by-clone” approach, employed by the publicly funded Human Genome Project, and the “whole genome shotgun” approach, favored by researchers at Celera Genomics. An interim strategy employed at Celera, called compartmentalized assembly, makes use of preliminary data produced by both approaches. This paper introduces the Bactig Ordering Problem, a key problem that arises in this context, and presents an efficient heuristic, the greedy path-merging algorithm, that performs well on real data.
Daniel H. Huson, Knut Reinert, Eugene W. Myers