Persistent Indexing Technology for Large Sequences

This work has two aspects. The first is a novel persistent index structure for genomic data, a prototype of which has been completed. The second, using this index as an example, is a generic index development framework, which is under construction. We propose a variation of the suffix tree, the Top Compressed Suffix Tree, which has been designed to allow the on-disk construction of indexes over multi-gigabyte sequences. This form of the suffix tree extends the work of Hunt et al. [1] in two ways: it improves the performance of the partitioned construction algorithm when the size of the sequence being indexed is comparable to that of the available main memory, and it provides a compact representation of the index on secondary memory. This work forms part of the GIDOF project, which aims to provide a Generic Index Development and Operation Framework. GIDOF addresses the management of performance-critical parameters, automatic parameter exploration and tuning, and the p...
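The general idea behind partitioned construction can be sketched in a few lines. The following is a minimal illustration only, not the Top Compressed Suffix Tree or the algorithm of Hunt et al. [1]: it assumes that partitions are defined by a fixed-length prefix of each suffix, and it stores each partition as a sorted list of suffix start positions rather than as a tree. The function names, the `prefix_len` parameter, and the DNA alphabet are all assumptions made for the example.

```python
from itertools import product


def build_partition(seq, prefix):
    """Return start positions of suffixes of seq beginning with prefix,
    ordered lexicographically by suffix."""
    k = len(prefix)
    positions = [i for i in range(len(seq) - k + 1) if seq.startswith(prefix, i)]
    # Slicing copies each suffix; acceptable for a sketch, too costly at scale.
    positions.sort(key=lambda i: seq[i:])
    return positions


def build_partitioned_index(seq, alphabet="ACGT", prefix_len=2):
    """Build one partition per prefix, one pass over seq per partition.

    Only a single partition is under construction at a time, so a real
    system could write each finished partition to disk before starting
    the next. Suffixes shorter than prefix_len are not indexed.
    """
    index = {}
    for letters in product(alphabet, repeat=prefix_len):
        prefix = "".join(letters)
        positions = build_partition(seq, prefix)
        if positions:
            index[prefix] = positions  # hypothetical point to flush to disk
    return index


def find(seq, index, pattern, prefix_len=2):
    """Return sorted start positions of pattern, consulting one partition."""
    if len(pattern) < prefix_len:
        raise ValueError("pattern must be at least prefix_len characters long")
    candidates = index.get(pattern[:prefix_len], [])
    return sorted(i for i in candidates if seq.startswith(pattern, i))


seq = "ACGTACGA"
index = build_partitioned_index(seq)
matches = find(seq, index, "ACG")
```

A query touches only the partition that matches its leading characters, which is what makes it practical to keep the partitions on secondary memory. The choice of `prefix_len` trades the number of passes over the sequence against the size of each partition held in main memory.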

Robert Japp

BNCOD 2003 | BNCOD 2007 | Generic Index Development | Index Development Framework | Suffix Tree

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2003
Where	BNCOD
Authors	Robert Japp
