The retrieval of similar documents from large-scale datasets has been one of the main concerns in knowledge management environments, such as plagiarism detection, news impact a...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
The retrieval of similar documents on the Web from a given document differs in many aspects from information retrieval based on queries generated by regular search engine use...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
Web searching techniques have been investigated and implemented in many aspects. In particular, for personalization, a more important issue is how to manipulate the results r...
Chonggun Kim, JaeYoun Jung, Hyeon-Cheol Zin, Jason...
The explosive growth of the World Wide Web, and the resulting information overload, has led to a mini-explosion in World Wide Web search engines. This mini-explosion, in turn, led...
We consider the coverage testing problem, where we are given a document and a corpus with a limited query interface and asked to determine whether the corpus contains a near-duplicate of th...
Ali Dasdan, Paolo D'Alberto, Santanu Kolay, Chris ...