With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
Email spam is a much studied topic, but even though current email spam detecting software has been gaining a competitive edge against text based email spam, new advances in spam g...
A new trend in email spam is the emergence of image spam. Although current anti-spam technologies are quite successful in filtering text-based spam emails, the new image spams ar...
Zhe Wang, William K. Josephson, Qin Lv, Moses Char...
Nowadays, the issue of near-duplicate video matching has been extensively studied. However, transformation, which is one of the major causes of near-duplicates, has been little di...