CIKM
2008
Springer

A new method for indexing genomes using on-disk suffix trees

We propose a new method to build persistent suffix trees for indexing genomic data. Our algorithm, DiGeST (Disk-Based Genomic Suffix Tree), improves significantly over previous work by reducing random access to the input string and by performing only two passes over disk data. DiGeST is based on the two-phase multi-way merge sort paradigm and uses a concise binary representation of the DNA alphabet. Furthermore, our method scales to larger genomic data than has been managed before.

Categories and Subject Descriptors: H.2.4 [Information Systems]: Database Management - Systems; J.3 [Computer Applications]: Life and Medical Sciences

General Terms: Algorithms, Design, Performance

Keywords: suffix tree, disk structures, DNA indexing
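
The two ideas named in the abstract, a compact binary encoding of the DNA alphabet and a two-phase multi-way merge sort, can be illustrated with a small in-memory sketch. This is not the DiGeST implementation: the 2-bit mapping, the function names (`pack`, `sorted_runs`, `merge_runs`), and the `run_size` parameter are assumptions made for illustration. The sketch keeps everything in memory and yields only a sorted order of suffix start positions, whereas the paper works on disk and builds a suffix tree.

```python
import heapq

# Assumed 2-bit code per nucleotide (hypothetical mapping, not taken from the paper).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(dna: str) -> int:
    """Pack a DNA string into an integer, using 2 bits per base."""
    value = 0
    for base in dna:
        value = (value << 2) | CODE[base]
    return value


def sorted_runs(text: str, run_size: int):
    """Phase 1: split suffix start positions into runs and sort each run."""
    positions = range(len(text))
    for start in range(0, len(text), run_size):
        chunk = positions[start:start + run_size]
        # Each run is sorted independently by the suffix it points to.
        yield sorted(chunk, key=lambda i: text[i:])


def merge_runs(text: str, runs) -> list:
    """Phase 2: k-way merge of the sorted runs into one sorted suffix order."""
    return list(heapq.merge(*runs, key=lambda i: text[i:]))


if __name__ == "__main__":
    genome = "GATTACA"
    order = merge_runs(genome, sorted_runs(genome, run_size=3))
    print(order)
    print(pack("ACGT"))
```

In a disk-based setting, each sorted run would presumably be written out in the first pass and streamed back during the merge in the second pass; here the runs are ordinary Python lists, which keeps the example short at the cost of realism.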
Marina Barsky, Ulrike Stege, Alex Thomo, Chris Upton
Added 12 Oct 2010
Updated 12 Oct 2010
Type Conference
Year 2008
Where CIKM
Authors Marina Barsky, Ulrike Stege, Alex Thomo, Chris Upton