Abstract. A set of sequences S is pairwise bounded if the Hamming distance between any pair of sequences in S is at most 2d. The Consensus Sequence problem aims to discern between ...
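The pairwise-bounded condition defined above can be checked directly. The following is a minimal sketch, assuming equal-length sequences given as Python strings; the names `hamming` and `is_pairwise_bounded` are illustrative and do not come from the paper.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def is_pairwise_bounded(seqs, d):
    """True if every pair of sequences in seqs is within Hamming distance 2d."""
    return all(hamming(a, b) <= 2 * d for a, b in combinations(seqs, 2))
```

This check compares every pair, so it takes time quadratic in the number of sequences; it illustrates the definition only, not the Consensus Sequence problem itself.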
We propose a method to evaluate queries using a last-resort semantic cache in a distributed Web search engine. The cache stores a group of frequent queries and for each of these qu...
Abstract. This paper presents a user study that evaluated the effectiveness of an aggregated search interface in the context of non-navigational search tasks. An experimental syst...
A new trend in the field of pattern matching is to design indexing data structures which take space very close to that required by the indexed text (in entropy-compressed form) an...
Wing-Kai Hon, Rahul Shah, Sharma V. Thankachan, Je...
Abstract. Term-partitioned indexes are generally inefficient for the evaluation of conjunctive queries, as they require the communication of long posting lists. On the other hand, ...
Given a pattern $p$ over an alphabet $\Sigma_p$ and a text $t$ over an alphabet $\Sigma_t$, we consider the problem of determining a mapping $f$ from $\Sigma_p$ to $\Sigma_t^+$ such that $t = f(p_1) f(p_2) \cdots f(p_m)$. ...
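To make the mapping problem concrete, here is a brute-force backtracking sketch. It assumes the pattern and text are Python strings and that f is not required to be injective (the truncated abstract does not say either way). The function name `find_mapping` is hypothetical, and the search is exponential in the worst case, so it illustrates the problem statement rather than the paper's method.

```python
def find_mapping(p, t):
    """Return a dict f with t == ''.join(f[c] for c in p), or None if none exists.

    Every pattern symbol maps to a non-empty substring of t.
    """
    def extend(i, pos, f):
        # All pattern symbols consumed: succeed only if the text is consumed too.
        if i == len(p):
            return dict(f) if pos == len(t) else None
        c = p[i]
        if c in f:
            # Symbol already mapped: its image must occur at the current position.
            image = f[c]
            if t.startswith(image, pos):
                return extend(i + 1, pos + len(image), f)
            return None
        # Symbol not yet mapped: try every non-empty substring starting at pos.
        for end in range(pos + 1, len(t) + 1):
            f[c] = t[pos:end]
            result = extend(i + 1, end, f)
            if result is not None:
                return result
            del f[c]
        return None

    return extend(0, 0, {})
```

For example, the pattern "aba" matches the text "xyzxy" under the mapping a to "xy" and b to "z", while the pattern "aa" has no valid mapping onto "xyz".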
A bitext, or bilingual parallel corpus, consists of two texts, each one in a different language, that are mutual translations. Bitexts are very useful in linguistic engineering bec...
Collaborative filtering (CF) shares information between users to provide each with recommendations. Previous work suggests using sketching techniques to handle massive data sets i...
The consensus string problem is to find a representative string (consensus) of a given set S of strings. In this paper we deal with the consensus string problems optimizing both d...
Amihood Amir, Gad M. Landau, Joong Chae Na, Heejin...