Compact Encoding Strategies for DNA Sequence Similarity Search

Determining whether two DNA sequences are similar is an essential component of DNA sequence analysis. Dynamic programming is the algorithm of choice if computational time is not the most important consideration. Heuristic search tools, such as BLAST, are computationally more efficient, but they may miss some of the sequence similarities (Altschul et al., 1990). These tools often use common k-tuples (words) between the two sequences to determine anchor points for the alignment, and spend most of their computational time extending the alignment beyond these anchor points. We discuss and provide a DNA sequence similarity search implementation (called SENSEI) that improves upon the performance of BLASTN by almost an order of magnitude for comparable sensitivity. This improvement is a result of using compactly encoded scoring tables for k-tuples, encoding bases with a single bit, filtering the sequence to remove the simple sequence repeats using XNUN, and masking the known species-specific...
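The k-tuple anchoring step described in the abstract can be illustrated with a minimal Python sketch. This is not the SENSEI implementation: the function names `ktuple_index`, `anchor_points`, and `one_bit_encode` are hypothetical, and the abstract does not say which single-bit encoding is used, so the purine/pyrimidine split below is only an assumption chosen for illustration.

```python
from collections import defaultdict


def ktuple_index(seq, k):
    """Map each k-tuple (word) in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def anchor_points(query, subject, k):
    """Return sorted (query_pos, subject_pos) pairs where both sequences share a k-tuple."""
    index = ktuple_index(subject, k)
    anchors = []
    for i in range(len(query) - k + 1):
        # .get avoids creating empty entries in the defaultdict on a miss
        for j in index.get(query[i:i + k], []):
            anchors.append((i, j))
    return sorted(anchors)


def one_bit_encode(seq):
    """ASSUMED scheme, not confirmed by the source:
    purines (A, G) -> 0, pyrimidines (C, T) -> 1, packed into one integer."""
    bits = 0
    for base in seq:
        bits = (bits << 1) | (1 if base in "CT" else 0)
    return bits


if __name__ == "__main__":
    print(anchor_points("ACGTAC", "TTACGT", 4))
    print(one_bit_encode("ACGT"), one_bit_encode("GTAC"))
```

Under this assumed encoding, distinct words such as `ACGT` and `GTAC` map to the same bit pattern. A one-bit code is therefore lossy, which presumably trades some specificity at the word-matching stage for smaller lookup tables; any anchors found this way would still need to be confirmed during alignment extension.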

David J. States, Pankaj Agarwal

Computational Biology | DNA Sequence | DNA Sequence Analysis | ISMB 1996 | Query Sequence |

Post Info

Added	02 Nov 2010
Updated	02 Nov 2010
Type	Conference
Year	1996
Where	ISMB
Authors	David J. States, Pankaj Agarwal
