DASFAA
2005
IEEE

Indexing DNA Sequences Using q-Grams

In recent years, interest in similarity search over large collections of biological sequences has grown. This paper contributes a method for efficiently indexing DNA sequences based on q-grams, which facilitates similarity search in a DNA database and avoids a linear scan of the entire database. A two-level index, consisting of a hash table and c-trees, is proposed based on the q-grams of the DNA sequences. These data structures allow quick detection of sequences within a certain distance of the query sequence. Experimental results show that our method detects regions of similarity in a DNA sequence database efficiently and with high sensitivity.
Xia Cao, Shuai Cheng Li, Anthony K. H. Tung
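
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of a q-gram hash index used as a filter, not the authors' method. It covers only a hash-table level and does not attempt to model the c-trees. The function names (`qgrams`, `build_index`, `candidates`), the threshold parameter `min_shared`, and the toy sequences are all assumptions made for illustration.

```python
from collections import defaultdict

def qgrams(seq, q):
    """Yield (offset, q-gram) for every overlapping window of length q."""
    for i in range(len(seq) - q + 1):
        yield i, seq[i:i + q]

def build_index(sequences, q):
    """Hash table mapping each q-gram to the (sequence id, offset) pairs where it occurs."""
    index = defaultdict(list)
    for sid, seq in sequences.items():
        for pos, gram in qgrams(seq, q):
            index[gram].append((sid, pos))
    return index

def candidates(index, query, q, min_shared):
    """Return ids of sequences with at least min_shared matching q-gram
    occurrence pairs (query occurrence, database occurrence).

    This is a coarse filter (hypothetical threshold): survivors would still
    need to be verified with a real alignment or edit-distance computation.
    """
    hits = defaultdict(int)
    for _, gram in qgrams(query, q):
        # .get avoids creating empty entries in the defaultdict on lookup
        for sid, _ in index.get(gram, ()):
            hits[sid] += 1
    return {sid for sid, n in hits.items() if n >= min_shared}

# Toy database, invented for this example only
db = {"s1": "ACGTACGTGA", "s2": "TTTTGGGGCC", "s3": "ACGTTCGTGA"}
index = build_index(db, q=3)
result = candidates(index, "ACGTACGT", q=3, min_shared=3)
print(sorted(result))  # ['s1', 's3']
```

Only the sequences that share enough q-grams with the query are touched, which is how an index of this kind can avoid scanning every sequence in the database.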
Type Conference
Year 2005
Where DASFAA
Authors Xia Cao, Shuai Cheng Li, Anthony K. H. Tung