We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose three measures for quantifying the alienness (...
The paper describes some innovations related to the ongoing work on the GSA prototype, an integrated information retrieval agent. In order to improve the original system effective...
Giovambattista Ianni, Francesco Ricca, Francesco C...
Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic meas...
Ana Gabriela Maguitman, Filippo Menczer, Heather R...
Today many interactions are carried out online through Web sites and e-services, and service providers often require private and/or sensitive information. A growing concern r...
Claudio Agostino Ardagna, Marco Cremonini, Ernesto...
Web spamming has emerged to exploit the economic advantage of high search rankings, and it threatens the accuracy and fairness of those rankings. Understanding spamming techniq...