A new method for indexing genomes using on-disk suffix trees

We propose a new method for building persistent suffix trees to index genomic data. Our algorithm, DiGeST (Disk-Based Genomic Suffix Tree), improves significantly over previous work by reducing random access to the input string and performing only two passes over the disk data. DiGeST is based on the two-phase multi-way merge sort paradigm and uses a concise binary representation of the DNA alphabet. Furthermore, our method scales to larger genomic data than previously handled.

Categories and Subject Descriptors: H.2.4 [Information Systems]: Database Management-Systems; J.3 [Computer Applications]: Life and Medical Sciences

General Terms: Algorithms, Design, Performance

Keywords: suffix tree, disk structures, DNA indexing
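The two ideas named in the abstract, a compact binary encoding of the DNA alphabet and a two-phase multi-way merge sort, can be illustrated with a small Python sketch. This is not the DiGeST implementation: it works entirely in memory and produces a sorted suffix order rather than a suffix tree. The 2-bit mapping and the names `pack`, `sorted_partition`, `merge_partitions`, and `suffix_order` are assumptions made for illustration.

```python
import heapq

# Assumed 2-bit codes for the DNA alphabet, chosen so that integer order
# matches lexicographic order A < C < G < T.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq):
    """Pack a DNA string into an int, 2 bits per base, first base most significant.

    The length must be stored separately, because leading 'A' bases encode as zeros.
    """
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def sorted_partition(text, start, end):
    """Phase 1: sort the suffixes that start in [start, end)."""
    return sorted(range(start, end), key=lambda i: text[i:])


def merge_partitions(text, partitions):
    """Phase 2: k-way merge of the sorted partitions into one suffix order."""
    return list(heapq.merge(*partitions, key=lambda i: text[i:]))


def suffix_order(text, partition_size):
    """Sort all suffix start positions of text using the two phases above."""
    partitions = [
        sorted_partition(text, start, min(start + partition_size, len(text)))
        for start in range(0, len(text), partition_size)
    ]
    return merge_partitions(text, partitions)
```

In a disk-based setting, each sorted partition would be written out as a run and the merge would stream the runs back, so the data is read sequentially instead of through random access. Here `suffix_order("GATTACA", 3)` sorts three small partitions and merges them into a single ordering of suffix start positions.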

Marina Barsky, Ulrike Stege, Alex Thomo, Chris Upton

CIKM 2008 | Disk-Based Genomic Suffix | Genomic Data | Information Management | Suffix Tree |

Post Info

Added	12 Oct 2010
Updated	12 Oct 2010
Type	Conference
Year	2008
Where	CIKM
Authors	Marina Barsky, Ulrike Stege, Alex Thomo, Chris Upton
