Many applications in Comparative Genomics lend themselves to implementations that take advantage of common high-performance features in modern microprocessors. However, the common suggestion that a dataparallel, multithreaded, or high-throughput implementation is possible often ignores the complexity of actually creating such software. In this paper, we present a dataparallel algorithm for a classic comparative genomics algorithm, the dot plot, along with a multiprocessor extension. For large genomic comparisons, these new algorithms achieve speedups of up to 14.4x over the sequential version. This speedup introduces the opportunity of performing full pairwise comparisons on entire genomes on a much larger scale than previously possible. We also present the experimental, model-driven approach used to develop the algorithm that allowed us to carefully study and evaluate implementation options and fully understand the parameters effecting its performance. KEYWORDS dot plot, data-paralle...
Christopher Mueller, Mehmet M. Dalkilic, Andrew Lu