Readers interested in the context of an event covered in the news, such as the dismissal of a lawsuit, can benefit from easily finding out about the overall news situation, the lega...
This paper describes a research effort to improve the use of the cosine similarity information retrieval technique to detect unknown, known, or variants of known rogue software by...
We propose a method to evaluate queries using a last-resort semantic cache in a distributed Web search engine. The cache stores a group of frequent queries and for each of these qu...
This paper presents a user study that evaluated the effectiveness of an aggregated search interface in the context of non-navigational search tasks. An experimental syst...
A new trend in the field of pattern matching is to design indexing data structures which take space very close to that required by the indexed text (in entropy-compressed form) an...
Wing-Kai Hon, Rahul Shah, Sharma V. Thankachan, Je...
Term-partitioned indexes are generally inefficient for the evaluation of conjunctive queries, as they require the communication of long posting lists. On the other hand, ...
Given a pattern $p$ over an alphabet $\Sigma_p$ and a text $t$ over an alphabet $\Sigma_t$, we consider the problem of determining a mapping $f$ from $\Sigma_p$ to $\Sigma_t^{+}$ such that $t = f(p_1) f(p_2) \cdots f(p_m)$....
A bitext, or bilingual parallel corpus, consists of two texts, each one in a different language, that are mutual translations. Bitexts are very useful in linguistic engineering bec...
Collaborative filtering (CF) shares information between users to provide each with recommendations. Previous work suggests using sketching techniques to handle massive data sets i...
The consensus string problem is to find a representative string (a consensus) of a given set S of strings. In this paper we deal with the consensus string problems optimizing both d...
Amihood Amir, Gad M. Landau, Joong Chae Na, Heejin...