The retrieval of similar documents from large scale datasets has been the one of the main concerns in knowledge management environments, such as plagiarism detection, news impact a...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
n explore and understand abstract information spaces as if they were real geographic spaces. According to the distance-similarity metaphor1 one of the most popular spatial metaphor...
Sara Irina Fabrikant, Daniel R. Montello, David M....
We address the problem of minimizing the communication involved in the exchange of similar documents. We consider two users, A and B, who hold documents x and y respectively. Neit...
A promising way to accelerate similarity search is semantic hashing which designs compact binary codes for a large number of documents so that semantically similar documents are ma...
A vast amount of documents in the Web have duplicates, which is a challenge for developing efficient methods that would compute clusters of similar documents. In this paper we use ...