This paper is concerned with the efficient execution of multiple sequence alignment methods in a multiple client environment. Multiple sequence alignment (MSA) is a computationally expensive method that is commonly used in computational and molecular biology. Large databases of protein and gene sequences are available to the scientific community. Oftentimes, these databases are accessed by multiple users to execute MSA queries. In such situations, the data server has to handle multiple concurrent queries. We look at the effect of data caching on the performance of the data server. We describe an approach for caching intermediate results for reuse in subsequent or concurrent queries. We focus on progressive alignment-based strategies, in particular the CLUSTAL W algorithm. Our results for 350 sets of sequences show that an average speedup of up to 2.5 is obtained by caching intermediate results. Our results also show that the cache-enabled CLUSTAL W program scales well on an SMP machine.
Ümit V. Çatalyürek, Eric Stahlber