Email has become an integral and sometimes overwhelming part of users’ personal and professional lives. In this paper, we measure the flow and frequency of user email toward the identification of communities of interest (COIs): groups of users that share a common bond. If detectable, such associations will be useful in automating email management, e.g., topical classification, flagging of important missives, and spam mitigation. An analysis of a large corpus of university email is used to drive the generation and validation of algorithms for automatically determining COIs. Using the algorithms, we examine the effects of the structure and transience of COIs, and we validate the algorithms against user-labeled data. Our analysis shows that the proposed algorithms correctly identify email as being sent from the human-identified COI with high accuracy. The structure and characteristics of COIs are explored analytically and broader conclusions about email use are posited.
Lisa Johansen, Michael Rowell, Kevin R. B. Butler,