We consider methods for compressing parse trees, especially techniques based on statistical modeling. We regard a sequence of productions corresponding to a suffix of the path fr...
This paper describes a speed-up of string pattern matching by rearranging states in the Aho-Corasick pattern matching machine, which is a kind of finite automaton. We realized speed-up of st...
T. Nishimura, Shuichi Fukamachi, Takeshi Shinohara
Word-based Huffman coding has widespread use in information retrieval systems. Besides its compressing power, it also enables the implementation of both indexing and searching sch...
This paper examines a conflation method based on the N-grams approach and evaluates its performance relative to the results achieved by other techniques such as Porter a...
We address the problem of musical sequence comparison for melodic similarity. Starting with a very simple similarity measure, we improve it step-by-step to finally obtain an acce...
T. Kadota, Masahiro Hirao, Akira Ishino, Masayuki ...
The Compact Directed Acyclic Word Graph (CDAWG) is a space-efficient data structure that supports indices of a string. The Symmetric Compact Directed Acyclic Word Graph (SCDAWG) for a st...
We constructively prove the exact distribution of deletion sizes for unavoidable strings, under the reductive decidability method of Zimin and Bean et al. Bounds such as these on ...
Most queries to text search engines are ranked or Boolean. Phrase querying is a powerful technique for refining searches, but is expensive to implement on conventional indexes. I...
In this paper, we study query processing in a distributed text database. The novelty is a real distributed architecture implementation that offers concurrent query service. The di...
Claudine Santos Badue, Ricardo A. Baeza-Yates, Ber...