Many biologically motivated problems are expressed as dynamic programming recurrences and are difficult to parallelize because of the intrinsic data dependencies in their algorithms. As a result, their solutions have been sped up using task-level parallelism only. Emerging platforms such as GPUs are appealing parallel architectures for high performance; at the same time, they are a motivation to rethink the algorithms associated with these problems in order to extract finer-grained parallelism such as data parallelism. In this paper, we consider the hmmersearch program as a representative of these problems and redesign its computational algorithm to extract data parallelism for more efficient execution on emerging platforms, despite its data dependencies. Our approach outperforms other existing methods when searching a very large database of unsorted sequences on GPUs.
Narayan Ganesan, Roger D. Chamberlain, Jeremy Buhl