Sciweavers

HICSS
2002
IEEE

A Novel Method for Detecting Similar Documents

We describe a system for rapidly determining document similarity among a set of documents obtained from an information retrieval (IR) system. We obtain a ranked list of the most important terms in each document using a rapid phrase recognizer system. We store these terms in a database and compute document similarity using a simple database query. If the number of terms not shared by both documents is below a predetermined threshold relative to the total number of terms in the document, the two documents are judged to be very similar.
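The threshold test described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `doc_terms` table, the function names, the use of SQLite, the default threshold of 0.1, and the choice to check the unshared fraction in both directions are all assumptions made for illustration. The abstract does not specify the schema, the query, or the threshold value, and the phrase recognizer that produces the terms is not modeled here.

```python
import sqlite3


def build_db(doc_terms):
    """Store (doc_id, term) rows in an in-memory SQLite database.

    doc_terms maps a document id to its list of important terms
    (assumed to come from some upstream term extractor).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doc_terms (doc_id TEXT, term TEXT, "
        "PRIMARY KEY (doc_id, term))"
    )
    rows = [
        (doc_id, term)
        for doc_id, terms in doc_terms.items()
        for term in set(terms)  # de-duplicate terms within a document
    ]
    conn.executemany("INSERT INTO doc_terms VALUES (?, ?)", rows)
    return conn


def unshared_fraction(conn, doc_a, doc_b):
    """Fraction of doc_a's terms that do not appear in doc_b."""
    total = conn.execute(
        "SELECT COUNT(*) FROM doc_terms WHERE doc_id = ?", (doc_a,)
    ).fetchone()[0]
    if total == 0:
        return 1.0  # no terms stored: treat as no evidence of similarity
    missing = conn.execute(
        "SELECT COUNT(*) FROM doc_terms "
        "WHERE doc_id = ? AND term NOT IN "
        "(SELECT term FROM doc_terms WHERE doc_id = ?)",
        (doc_a, doc_b),
    ).fetchone()[0]
    return missing / total


def very_similar(conn, doc_a, doc_b, threshold=0.1):
    """True if the unshared fraction is below threshold in both directions.

    The threshold value and the two-way check are illustrative assumptions.
    """
    worst = max(
        unshared_fraction(conn, doc_a, doc_b),
        unshared_fraction(conn, doc_b, doc_a),
    )
    return worst < threshold
```

Calling `very_similar(conn, "a", "b", threshold=0.2)` on a database built with `build_db` compares the two stored term lists. Keeping the comparison inside SQL mirrors the abstract's point that similarity reduces to a simple database query once the terms are stored.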
James W. Cooper, Anni Coden, Eric W. Brown
Added 14 Jul 2010
Updated 14 Jul 2010
Type Conference
Year 2002
Where HICSS