Content and expression-based copy recognition for intellectual property protection

Protection of copyrights and revenues of content owners in the digital world has been gaining importance in recent years. This paper presents a way of fingerprinting text documents that can be used to identify content and expression similarities in documents, as a way of facilitating the tracking of digital copies of works and ensuring proper compensation to content owners. The fingerprints we collected consist of surface, syntactic, and semantic features of documents. Because they reflect mostly how things are said, we call these features stylistic fingerprints. However, how things are said is not independent of what is said; therefore, these features have predictive power with respect to both content and expression. We tested the ability of these stylistic fingerprints to identify content and expression similarities between documents using a corpus of translated novels. On this corpus, these fingerprints identified the source of a given book chapter (content) successfully 90% of the t...

Özlem Uzuner, Randall Davis

Content Owners | DRM 2003 | Expression Similarities | Stylistic Fingerprints

Added	06 Jul 2010
Updated	06 Jul 2010
Type	Conference
Year	2003
Where	DRM
Authors	Özlem Uzuner, Randall Davis
