Email is an increasingly important and ubiquitous means of communication, both facilitating contact between individuals and enabling gains in the productivity of organizations. However, the relentless rise of automated, unauthorized email, a.k.a. spam, is eroding much of the attractiveness of email communication. Most of the attention dedicated to spam detection has focused on the content of the emails or on the addresses or domains associated with spam senders. Although methods based on these easily changeable identifiers work reasonably well, they miss the fundamental nature of spam as an opportunistic relationship, very different from the normal mutual relations between senders and recipients of legitimate email. Here we present a comprehensive graph-theoretical analysis of email traffic that captures these properties quantitatively. We identify several simple metrics that serve both to distinguish between spam and legitimate email and to provide a statistical basis f...
Luíz Henrique Gomes, Rodrigo B. Almeida, Lu