Background: Enhancements in sequencing technology have recently yielded assemblies of large genomes including rat, mouse, human, fruit fly, and zebrafish. The availability of larg...
Eric C. Rouchka, Abdelnaby Khalyfa, Nigel G. F. Co...
We present a method for efficient access to parts of remote files. The efficiency is achieved by using a compact, file-format-independent pattern description that allows to re...
This paper presents a MapReduce algorithm for computing pairwise document similarity in large document collections. MapReduce is an attractive framework because it allows us to de...
The suffix tree is an important data structure for indexing a long sequence (such as a genome sequence) or a concatenation of sequences. It finds many applications in practice, especiall...
Counting in general, and estimating the cardinality of (multi-)sets in particular, is highly desirable for a large variety of applications, representing a foundational block for ...
Nikos Ntarmos, Peter Triantafillou, Gerhard Weikum