Sciweavers

WWW
2006
ACM

Detecting nepotistic links by language model disagreement

In this short note we demonstrate the applicability of hyperlink downweighting by means of language model disagreement. The method filters out hyperlinks that have no relevance to the target page, without the need for whitelists, blacklists, or human interaction. We fight various forms of nepotism, such as common maintainers, ads, link exchanges, and misused affiliate programs. Our method is tested on a 31-million-page crawl of the .de domain with a manually classified 1000-page random sample.

Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval], I.7.5 [Document Capture]: Document analysis

General Terms: Algorithms, Measurement, Experimentation
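The abstract does not give the scoring formula, so the following is only a minimal sketch of what downweighting by language model disagreement could look like. It assumes smoothed unigram models, Kullback-Leibler divergence as the disagreement measure, and an exponential mapping from divergence to link weight. The function names, the smoothing constant `alpha`, and the `exp(-d)` weighting are all hypothetical choices for illustration, not the authors' method.

```python
import math
from collections import Counter


def unigram_model(text, vocab, alpha=1.0):
    """Additively smoothed unigram distribution over a fixed vocabulary.

    alpha is a hypothetical smoothing constant, not taken from the paper.
    """
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) in nats; p and q must share the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def link_weight(source_context, target_text):
    """Map disagreement between the two models to a weight in (0, 1].

    Assumption: the weight decays as exp(-KL). A hard threshold would be
    an equally plausible illustration.
    """
    vocab = set(source_context.lower().split()) | set(target_text.lower().split())
    p = unigram_model(source_context, vocab)
    q = unigram_model(target_text, vocab)
    return math.exp(-kl_divergence(p, q))


# Toy example: the text around a link versus two candidate target pages.
context = "cheap hotel booking berlin"
related = link_weight(context, "hotel booking in berlin cheap rooms")
unrelated = link_weight(context, "online casino poker bonus")
print(related > unrelated)
```

In this toy setup the link whose target shares vocabulary with the surrounding text keeps a higher weight than the link to an unrelated page, which is the kind of signal a weighted link analysis could then consume.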
Added 22 Nov 2009
Updated 22 Nov 2009
Type Conference
Year 2006
Where WWW
Authors András A. Benczúr, István Bíró, Károly Csalogány, Máté Uher