Books and magazines often contain pages with audacious mixtures of color images and text. Our problem consists in coding the background colors of such documents without wa...
Spam filtering poses a special problem in text categorization, whose defining characteristic is that filters face an active adversary that constantly attempts to evade fi...
Andrej Bratko, Gordon V. Cormack, Bogdan Filipic, ...
Abstract. We study parallel and distributed compressed indexes. Compressed indexes are a new and functional way to index text strings. They exploit the compressibility of the text,...
Abstract. Processing compressed strings without decompression is often essential when dealing with massive data sets. We consider local subsequence recognition problems on strings ...
This paper improves the Tagged Suboptimal Codes (TSC) compression scheme in several ways. We show how to process the TSC as a universal code. We introduce the TSCk as a family of ...