Burst tries: a fast, efficient data structure for string keys

Many applications depend on efficient management of large sets of distinct strings in memory. For example, during index construction for text databases a record is held for each distinct word in the text, containing the word itself and information such as counters. We propose a new data structure, the burst trie, that has significant advantages over existing options for such applications: it requires no more memory than a binary tree; it is as fast as a trie; and, while not as fast as a hash table, a burst trie maintains the strings in sorted or near-sorted order. In this paper we describe burst tries and explore the parameters that govern their performance. We experimentally determine good choices of parameters, and compare burst tries to other structures used for the same task, with a variety of data sets. These experiments show that the burst trie is particularly effective for the skewed frequency distributions common in text collections, and dramatically outperforms all other data...
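
The abstract does not spell out the mechanics, so what follows is only a minimal Python sketch of the general idea: an access trie whose leaves are small containers, where a container that grows past a capacity limit is "burst" into a new trie node with smaller child containers. The class names, the `BURST_THRESHOLD` value, the dict-based containers, and the simple size-based burst rule are all assumptions made for illustration; they are not the authors' implementation or their tuned parameters.

```python
BURST_THRESHOLD = 4  # hypothetical container capacity, chosen only for the demo


class Container:
    """Leaf bucket (assumed design): maps remaining suffix -> counter."""

    def __init__(self):
        self.records = {}


class TrieNode:
    """Internal node of the access trie: one child slot per leading character."""

    def __init__(self):
        self.children = {}   # char -> TrieNode or Container
        self.end_count = 0   # counter for the string that ends exactly here


class BurstTrie:
    def __init__(self, threshold=BURST_THRESHOLD):
        self.root = TrieNode()
        self.threshold = threshold

    def insert(self, word):
        node, i = self.root, 0
        while i < len(word):
            ch = word[i]
            child = node.children.get(ch)
            if child is None:
                child = node.children[ch] = Container()
            if isinstance(child, TrieNode):
                node, i = child, i + 1
                continue
            # Reached a container: store the rest of the string with a counter.
            suffix = word[i + 1:]
            child.records[suffix] = child.records.get(suffix, 0) + 1
            if len(child.records) > self.threshold:
                node.children[ch] = self._burst(child)
            return
        node.end_count += 1

    def _burst(self, container):
        """Replace a full container with a trie node plus smaller containers.

        A child container may itself be over capacity afterwards; in this
        sketch it is simply burst on the next insertion that reaches it.
        """
        new_node = TrieNode()
        for suffix, count in container.records.items():
            if suffix == "":
                new_node.end_count = count
                continue
            bucket = new_node.children.setdefault(suffix[0], Container())
            bucket.records[suffix[1:]] = count
        return new_node

    def count(self, word):
        node = self.root
        for i, ch in enumerate(word):
            child = node.children.get(ch)
            if child is None:
                return 0
            if isinstance(child, Container):
                return child.records.get(word[i + 1:], 0)
            node = child
        return node.end_count

    def items(self):
        """Yield (string, counter) pairs in sorted order."""
        yield from self._walk(self.root, "")

    def _walk(self, node, prefix):
        if node.end_count:
            yield prefix, node.end_count
        for ch in sorted(node.children):
            child = node.children[ch]
            if isinstance(child, TrieNode):
                yield from self._walk(child, prefix + ch)
            else:
                for suffix in sorted(child.records):
                    yield prefix + ch + suffix, child.records[suffix]


words = "the cat then them there these the to tea ten the".split()
trie = BurstTrie()
for w in words:
    trie.insert(w)
```

In this sketch, strings sharing a common prefix collect in one container until it overflows, so only the crowded regions of the key space get deeper trie nodes. Sorted traversal is possible because child characters and container suffixes are visited in order.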

Steffen Heinz, Justin Zobel, Hugh E. Williams

Binary Trees | Burst Trie | Data Structures | TOIS 2002

Added	23 Dec 2010
Updated	23 Dec 2010
Type	Journal
Year	2002
Where	TOIS
Authors	Steffen Heinz, Justin Zobel, Hugh E. Williams
