Sciweavers

WWW
2003
ACM
15 years 7 days ago
Detecting Near-replicas on the Web by Content and Hyperlink Analysis
The presence of replicas or near-replicas of documents is very common on the Web. Documents may be replicated completely or partially for different reasons (versions, mirrors, etc...
Ernesto Di Iorio, Michelangelo Diligenti, Marco Go...