Detection of near-duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
Recent work in supervised learning of term-based retrieval models has shown that significantly improved accuracy can often be achieved via better model estimation [2, 10, 11, 17]. In ...
Content-based image retrieval (CBIR) is currently limited because of the lack of representational power of the low-level image features, which fail to properly represent the actual...
“Hash then encrypt” is an approach to message authentication in which the message is first hashed down using an ε-universal hash function, and then the resulting k-bi...
Benchmarking as a method of assessing software performance is known to suffer from random fluctuations that distort the observed performance. In this paper, we focus on the fluctua...