
DI
2006

Identifying almost identical files using context triggered piecewise hashing

Fuzzy hashing allows the discovery of potentially incriminating documents that traditional hashing methods may miss. Like a fuzzy logic search, a fuzzy hash looks for documents that are similar but not exactly the same, called homologous files. Homologous files share identical strings of binary data but are not exact duplicates; an example is two otherwise identical word processor documents, one of which has a new paragraph added in the middle. To locate homologous files, each file must be hashed with a traditional hash in segments, so that the strings of identical data can be identified.
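The idea of hashing a file in segments can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the window size, the trigger rule, the choice of MD5 and the names `split_on_context`, `piecewise_hashes` and `shared_fraction` are all assumptions made for illustration. Segment boundaries are chosen from the content itself (a rolling sum over the last few bytes), so an insertion in the middle of a file disturbs only the segments around it.

```python
import hashlib

WINDOW = 7        # assumed size of the rolling window, in bytes
TRIGGER_MOD = 64  # assumed trigger: cut when window_sum % TRIGGER_MOD == TRIGGER_MOD - 1


def split_on_context(data: bytes) -> list[bytes]:
    """Cut data into segments wherever the rolling sum hits the trigger value."""
    segments = []
    start = 0
    window_sum = 0
    for i, byte in enumerate(data):
        # Maintain the sum of the most recent WINDOW bytes.
        window_sum += byte
        if i >= WINDOW:
            window_sum -= data[i - WINDOW]
        # Only test the trigger once the window is full.
        if i + 1 >= WINDOW and window_sum % TRIGGER_MOD == TRIGGER_MOD - 1:
            segments.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        segments.append(data[start:])  # trailing bytes after the last cut
    return segments


def piecewise_hashes(data: bytes) -> list[str]:
    """Hash each segment with a traditional hash (MD5 here, as an assumption)."""
    return [hashlib.md5(seg).hexdigest() for seg in split_on_context(data)]


def shared_fraction(a: bytes, b: bytes) -> float:
    """Fraction of segment hashes the two inputs have in common (illustrative score)."""
    ha, hb = set(piecewise_hashes(a)), set(piecewise_hashes(b))
    if not ha or not hb:
        return 0.0
    return len(ha & hb) / max(len(ha), len(hb))
```

Calling `shared_fraction(original, modified)` on two versions of a document returns a value near 1.0 when most segments are unchanged, whereas a single traditional hash of each whole file would only report that the files differ.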
Jesse D. Kornblum
Added 11 Dec 2010
Updated 11 Dec 2010
Type Journal
Year 2006
Where DI
Authors Jesse D. Kornblum