Abstract. In this article, we propose the use of suffix arrays to efficiently implement n-gram language models with practically unlimited size n. This approach, which is used with ...
A new and conceptually simple data structure, called a suffix array, for on-line string searches is introduced in this paper. Constructing and querying suffix arrays is reduced to...
In this paper we present in detail a new, efficient linear time and space suffix array construction algorithm (SACA), called the D-CriticalSubstring algorithm. The algorithm is built...
Given string T = T[1, ..., n], the suffix sorting problem is to lexicographically sort the suffixes T[i, ..., n] for all i. This problem is central to the construction of suf...
Record linkage is an important data integration task that has many practical uses for matching, merging and duplicate removal in large and diverse databases. However, a quadratic ...
Timothy de Vries, Hui Ke, Sanjay Chawla, Peter Chr...