Accurately identifying spam campaigns launched by a large number of bots in a botnet enables accurate generation of spam campaign signatures and is therefore critical to defeating spamming botnets. The straightforward approach of clustering all spam containing the same label, such as a URL, into a campaign can be easily defeated by techniques such as simple obfuscation of URLs. In this paper, we perform a comprehensive study of content-agnostic characteristics of spam campaigns, such as duration and the source-network distribution of spammers, to ascertain whether and how they can assist simple label-based clustering methods in identifying campaigns and generating campaign signatures. In particular, from a five-month trace collected by a relay sinkhole, we manually identified and then analyzed seven URL-based botnet spam campaigns consisting of 52 million spam messages sent over 2.09 million SMTP connections originating from over 150,000 non-proxy spamming hosts and destined for ab...
Abhinav Pathak, Feng Qian, Y. Charlie Hu, Zhuoqing