Email is a key communication tool for collaborative workgroups. In this paper, we investigate how team leadership roles can be inferred from a collection of email messages exchang...
In this paper, we propose a new asymmetric boosting method, Boosting with Different Costs. Traditional boosting methods assume the same cost for misclassified instances from di...
Email protocols were designed to be flexible and forgiving, at a time when Internet usage was cooperative. As a side effect, they were not designed to ...
In contextual computing, where cues beyond direct user input are used to trigger computation, one of the most daunting challenges is inferring what the user is doing. For the doma...
Victoria Bellotti, Jim Thornton, Alvin Chin, Diane...
To evade blacklisting, the vast majority of spam email is sent from exploited MTAs (i.e., botnets) and with forged “From” addresses. In response, the anti-spam community has d...
Chris Fleizach, Geoffrey M. Voelker, Stefan Savage
Email has become an integral and sometimes overwhelming part of users’ personal and professional lives. In this paper, we measure the flow and frequency of user email toward th...
Lisa Johansen, Michael Rowell, Kevin R. B. Butler,...
Near-duplicate detection is not only an important pre- and post-processing task in Information Retrieval but also an effective spam-detection technique. Among different approache...
Many of the first successful applications of statistical learning to anti-spam filtering were personalized classifiers that were trained on an individual user’s spam and ham ...
Web spam research has been hampered by a lack of statistically significant collections. In this paper, we perform the first large-scale characterization of web spam using conten...
We show how a game-theoretic model of spam e-mailing, which we had introduced in previous work, can be extended to include the possibility of employing Human Interactive Proofs (h...
Dimitrios K. Vassilakis, Ion Androutsopoulos, Evan...