Piers: An Efficient Model for Similarity Search in DNA Sequence Databases

Growing interest in genomic research has resulted in the creation of huge biological sequence databases. In this paper, we present a hash-based pier model for efficient homology search in large DNA sequence databases. In our model, only certain segments in the databases, called 'piers', need to be accessed during searches, as opposed to other approaches, which require a full scan of the biological sequence database. To further improve search efficiency, the piers are stored in a specially designed hash table, which helps to avoid expensive alignment operations. The hash table is small enough to reside in main memory, hence avoiding I/O in the search steps. We show theoretically and empirically that the proposed approach can efficiently detect biological sequences that are similar to a query sequence with very high sensitivity.
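The abstract does not spell out how piers are chosen or how the hash table is laid out, so the following is only a minimal sketch of the general idea under loud assumptions: piers are taken to be fixed-length segments sampled at a regular stride, the hash table is an ordinary in-memory dictionary, and matching is exact. The names `build_pier_table`, `candidate_hits`, `pier_len`, and `stride` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def build_pier_table(sequences, pier_len=4, stride=8):
    """Index only the sampled segments ("piers"), not every position.

    Assumption: a pier is the substring of length `pier_len` starting at
    every `stride`-th base. The paper's actual selection rule may differ.
    """
    table = defaultdict(list)
    for seq_id, seq in sequences.items():
        for pos in range(0, len(seq) - pier_len + 1, stride):
            table[seq[pos:pos + pier_len]].append((seq_id, pos))
    return table


def candidate_hits(table, query, pier_len=4):
    """Look up every query window in the pier table.

    Returns (seq_id, db_pos, query_pos) triples. Only these candidates
    would be passed on to a more expensive alignment step, so the
    database itself is never scanned in full.
    """
    hits = []
    for q_pos in range(len(query) - pier_len + 1):
        window = query[q_pos:q_pos + pier_len]
        for seq_id, db_pos in table.get(window, ()):
            hits.append((seq_id, db_pos, q_pos))
    return hits
```

Because every offset of the query is probed while only a sparse subset of database positions is indexed, the table stays small; in this simplified version a shared region is found only if it covers a whole pier exactly, which a real homology search would need to relax.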

Xia Cao, Shuai Cheng Li, Beng Chin Ooi, Anthony K. H. Tung

Biological Sequence Databases | Database | DNA Sequence Databases | Query Sequence | SIGMOD 2004 |

» The ed-tree: An Index for Large DNA Sequence Databases

» Designing seeds for similarity search in genomic DNA

» Prefix Tree Indexing for Similarity Search and Similarity Joins on Genomic Data

» Effective Indexing and Filtering for Similarity Search in Large Biosequence Databases

» Towards Effective Indexing for Very Large Video Sequence Database

» Pattern-based similarity search for microarray data

» CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence al...

» Macromolecular sequence analysis using multiwindow Gabor representations

Post Info

Added	08 Dec 2009
Updated	08 Dec 2009
Type	Conference
Year	2004
Where	SIGMOD
Authors	Xia Cao, Shuai Cheng Li, Beng Chin Ooi, Anthony K. H. Tung
