Systolic implementations of dynamic programming solutions that use a similarity matrix can achieve appreciable performance with both coarse- and fine-grain parallelization. A limitation of systolic array design is that score routing between array elements, array I/O bandwidth, and score memory capacity all depend on the length of the sequence that can be processed. A novel approach, differential scoring, is presented that exploits adjacency to decouple the complexity of score routing and systolic array bandwidth from sequence length. Instead, these design parameters become a function of algorithm sensitivity. As a consequence, the Simile implementation of differential scoring for sequence alignment reduces score routing, I/O bandwidth, and score storage by 82% for sequences of length 10^6 and significantly improves gate count, clock rate, and power utilization per systolic processing element.
Antonio E. de la Serna
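
The adjacency property that differential scoring relies on can be illustrated in software. The following is a minimal sketch, not the Simile design: it assumes a global (Needleman-Wunsch style) alignment with a linear gap penalty and hypothetical scoring values (match = 2, mismatch = -1, gap = -2), none of which come from the abstract. Under these assumptions the difference between horizontally or vertically adjacent cells of the similarity matrix stays within a fixed range set by the scoring parameters, whereas absolute scores grow with sequence length. The function names are illustrative only.

```python
def nw_matrix(a, b, match=2, mismatch=-1, gap=-2):
    """Fill a global-alignment similarity matrix (linear gap penalty).

    The scoring values are assumptions chosen for illustration.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        h[i][0] = i * gap
    for j in range(1, cols):
        h[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            h[i][j] = max(h[i - 1][j - 1] + sub,  # substitution
                          h[i - 1][j] + gap,      # gap in b
                          h[i][j - 1] + gap)      # gap in a
    return h


def adjacent_differences(h):
    """Return all horizontal and vertical differences between neighboring cells."""
    horiz = [row[j] - row[j - 1] for row in h for j in range(1, len(row))]
    vert = [h[i][j] - h[i - 1][j]
            for i in range(1, len(h)) for j in range(len(h[0]))]
    return horiz + vert


def bits_for_range(lo, hi):
    """Bits needed to encode every integer in the closed range [lo, hi]."""
    return max(1, (hi - lo).bit_length())


if __name__ == "__main__":
    a = "ACGT" * 50
    b = "ACGA" * 50
    h = nw_matrix(a, b)
    scores = [v for row in h for v in row]
    diffs = adjacent_differences(h)
    print("absolute score bits:", bits_for_range(min(scores), max(scores)))
    print("difference bits:    ", bits_for_range(min(diffs), max(diffs)))
```

In this sketch the width needed for a stored or routed difference is set by the scoring parameters alone, while the width needed for an absolute score increases as the sequences get longer. This is the kind of decoupling from sequence length that the abstract describes, shown here only for the assumed scoring scheme.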