Sciweavers

ICDE
2007
IEEE

CPS-tree: A Compact Partitioned Suffix Tree for Disk-based Indexing on Large Genome Sequences

The suffix tree is an important data structure for indexing a long sequence (such as a genome sequence) or a concatenation of sequences. It has many practical applications, especially in bioinformatics. A suffix tree supports efficient pattern search in time independent of the sequence length. However, the performance of disk-based suffix trees is a concern: poor locality of access leads to a high number of disk I/Os, which slows them down significantly. The focus of this paper is to design an IO-efficient and Compact Partitioned Suffix tree representation (CPS-tree) on disk. We show that representing a suffix tree as a CPS-tree has several advantages. First, our representation allows us to visit any node in the suffix tree by accessing at most log n pages of the tree, where n is the length of the sequence. Second, our storage scheme improves the access pattern and reduces the number of page faults, resulting in efficient search and tree traversal operations. Thir...
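To illustrate the claim that pattern search time does not depend on the sequence length, here is a minimal Python sketch. It is not the CPS-tree and not the authors' method: it builds a naive, uncompressed, in-memory suffix trie in quadratic time, and the names `build_suffix_trie` and `contains` are hypothetical. The only point it shows is that a lookup walks one edge per pattern character, regardless of how long the indexed sequence is.

```python
def build_suffix_trie(text):
    """Build a naive suffix trie as nested dicts (O(n^2) time and space).

    Illustration only: a real suffix tree compresses single-child paths,
    and the CPS-tree additionally partitions the tree into disk pages.
    """
    root = {}
    for start in range(len(text)):
        node = root
        # Insert the suffix text[start:] one character at a time.
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Return True if pattern occurs in the indexed text.

    The loop runs once per pattern character, so the cost depends on
    len(pattern), not on the length of the indexed sequence.
    """
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


genome = "GATTACAGATTACA"  # toy sequence, assumed for the example
trie = build_suffix_trie(genome)
print(contains(trie, "TTAC"))  # True
print(contains(trie, "TTAG"))  # False
```

In this in-memory sketch every step is a dictionary lookup. On disk, each step could touch a different page, which is the locality problem the abstract describes.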
Swee-Seong Wong, Wing-Kin Sung, Limsoon Wong
Added 16 Aug 2010
Updated 16 Aug 2010
Type Conference
Year 2007
Where ICDE
Authors Swee-Seong Wong, Wing-Kin Sung, Limsoon Wong