We describe a system for rapidly determining document similarity among a set of documents obtained from an information retrieval (IR) system. We obtain a ranked list of the most i...
We present an algorithm that takes an unannotated corpus as its input, and returns a ranked list of probable morphologically related pairs as its output. The algorithm tries to di...
MedSearch is a complete retrieval system for Medline, the premier bibliographic database of the U.S. National Library of Medicine (NLM). MedSearch implements SSRM, a novel informa...
During this talk, I will introduce a novel family of contextual measures of similarity between distributions: the similarity between two distributions q and p is measured in the co...
Florent Perronnin (Xerox Research Centre Europe), ...
This report explains our plagiarism detection method using a fuzzy semantic-based string similarity approach. The algorithm was developed through four main stages. First is pre-proce...