Index-Based Approach to Similarity Search in Protein and Nucleotide Databases



When searching databases of nucleotide or protein sequences, finding a local alignment of two sequences is one of the main tasks. Since the sizes of available databases grow constantly, the efficiency of retrieval methods becomes a critical issue. Sequence retrieval relies on finding the sequences in the database that align best with the query sequence. An optimal alignment of two sequences can be found in quadratic time using dynamic programming, but this is infeasible when dealing with large databases. Existing solutions use fast heuristic methods (such as BLAST and FASTA), which produce only an uncontrolled approximation of the best alignment and do not even provide any information about the alignment approximation error. In this paper we propose an approach of exact and approximate indexing using several metric access methods (MAMs) in combination with the TriGen algorithm, in order to reduce the number of alignments (distance computations) needed. The experimental results have sh...
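To illustrate why pairwise alignment is quadratic, the following is a minimal sketch of local alignment scoring by dynamic programming in the Smith-Waterman style. The function name `local_alignment_score` and the scoring parameters (`match`, `mismatch`, `gap`) are illustrative assumptions and are not taken from the paper, which may use different scoring schemes or distance measures.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b.

    Smith-Waterman-style dynamic programming with a linear gap penalty.
    The scoring values are assumptions chosen for illustration only.
    Time is O(len(a) * len(b)); memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(local_alignment_score("TTACGTT", "GGACGGG"))
```

Every cell of the table is visited once, so comparing a query against each of N database sequences costs N such quadratic computations. That per-query cost is what an index aims to reduce by avoiding most of the alignments.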

David Hoksza, Tomás Skopal


Alignment Approximation Error | Database | DATESO 2007 | Fast Heuristic Methods | Sequence Retrieval


Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2007
Where	DATESO
Authors	David Hoksza, Tomás Skopal
