
CIKM
2010
Springer

Index structures for efficiently searching natural language text

Many existing text indexes work at document granularity and are not effective at answering queries whose desired answer is only a term or a phrase. In this paper, we study index structures capable of answering this class of queries, referred to here as wild card queries, and analyze their performance. Our experimental results on a large class of queries from different sources (including query logs and parse trees) and with various datasets reveal some of the performance barriers of these indexes. We then present the Word Permuterm Index (WPI), an adaptation of the permuterm index for natural language text applications, and show that this index supports a wide range of wild card queries, is quick to construct, and is highly scalable. Our experimental results comparing WPI to alternative methods on a wide range of wild card queries show a few orders of magnitude performance improvements for WPI while the memory usage is kept the...
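To make the idea of a word-level permuterm lookup concrete, here is a minimal sketch in Python. It is not the authors' WPI construction, which the abstract does not describe; it only illustrates the classic permuterm trick applied to word tokens instead of characters. The names `build_index`, `search`, and the `$` end marker are hypothetical, the index is a plain sorted list of rotations (no compression), and a query must contain exactly one `*` and match a whole sentence.

```python
from bisect import bisect_left

END = "$"  # hypothetical end-of-sentence marker (assumption, not from the paper)


def build_index(sentences):
    """Return a sorted list of (rotation, sentence_id) pairs over word tokens."""
    entries = []
    for sid, sentence in enumerate(sentences):
        tokens = sentence.lower().split() + [END]
        # Store every rotation of the token sequence, as in a permuterm index.
        for i in range(len(tokens)):
            entries.append((tuple(tokens[i:] + tokens[:i]), sid))
    entries.sort()
    return entries


def search(index, query):
    """Answer a query with exactly one '*', e.g. '* is the capital of canada'.

    Returns (sentence_id, matched_phrase) pairs; '*' must match at least one word.
    """
    tokens = query.lower().split()
    star = tokens.index("*")
    before, after = tokens[:star], tokens[star + 1:]
    # Rotate the query so the wildcard falls at the end: after + END + before
    # must then be a prefix of some stored rotation.
    prefix = tuple(after + [END] + before)
    results = []
    pos = bisect_left(index, (prefix,))
    while pos < len(index) and index[pos][0][:len(prefix)] == prefix:
        rotation, sid = index[pos]
        matched = rotation[len(prefix):]
        if matched:  # skip rotations where the wildcard would match nothing
            results.append((sid, " ".join(matched)))
        pos += 1
    return results


sentences = [
    "Ottawa is the capital of Canada",
    "Paris is the capital of France",
]
index = build_index(sentences)
print(search(index, "* is the capital of canada"))
print(search(index, "paris is the capital of *"))
```

The lookup returns the words bound to the wildcard rather than whole documents, which is the kind of term-or-phrase answer the abstract refers to. Storing every rotation explicitly grows quadratically with sentence length, so this sketch says nothing about the memory behavior reported for WPI.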
Pirooz Chubak, Davood Rafiei
Added 10 Feb 2011
Updated 10 Feb 2011
Type Conference
Year 2010
Where CIKM
Authors Pirooz Chubak, Davood Rafiei