With the increasing amount and complexity of data being collected, there is an urgent need to create automated techniques for mining the data. In particular, data being generated ...
We present a fast compression and decompression technique for natural language texts. The novelty is that the exact search can be done on the compressed text directly, using any k...
Edleno Silva de Moura, Gonzalo Navarro, Nivio Zivi...
The nature of data in enterprises and on the Internet is changing. Data used to be stored in a database first and queried later. Today timely processing of new data, represented ...
Scalable similarity search is the core of many large scale learning or data mining applications. Recently, many research results demonstrate that one promising approach is creatin...
We present a novel sequential clustering algorithm which is motivated by the Information Bottleneck (IB) method. In contrast to the agglomerative IB algorithm, the new sequential ...