Analyzing author and topic relations in an email corpus is an important issue in both social network analysis and text mining. The AuthorTopic model is a statistical model that id...
We introduce a novel approach to incremental e-mail categorization based on identifying and exploiting "clumps" of messages that are classified similarly. Clumping reflec...
Handling missing data is a critical step in ensuring good results in data mining. Like most data mining algorithms, existing privacy-preserving data mining algorithms assume data ...
In many software development projects, people tend to repeat the same mistakes due to a lack of shared knowledge from past experiences. Generally, it is very difficult to manually find ...
Abstract. Spam is a serious problem that affects email users (e.g. phishing attacks, viruses and time spent reading unwanted messages). We propose a novel spam email filtering appr...