The proliferation of malware has presented a serious threat to the security of computer systems. Traditional signature-based antivirus systems fail to detect polymorphic and new, ...
We consider the problem of detecting anomalies in high arity categorical datasets. In most applications, anomalies are defined as data points that are 'abnormal'. Quite ...
We address the task of learning rankings of documents from search engine logs of user behavior. Previous work on this problem has relied on passively collected clickthrough data. ...
How can we find communities in dynamic networks of social interactions, such as who calls whom, who emails whom, or who sells to whom? How can we spot discontinuity timepoints in ...
A wide variety of machine learning problems can be described as minimizing a regularized risk functional, with different algorithms using different notions of risk and different r...
Choon Hui Teo, Alex J. Smola, S. V. N. Vishwanatha...
Previous work on text mining has almost exclusively focused on a single stream. However, we often have available multiple text streams indexed by the same set of time points (call...
Xuanhui Wang, ChengXiang Zhai, Xiao Hu, Richard Sp...
This work studies the problem of distributed classification in peer-to-peer (P2P) networks. While there has been a significant amount of work in distributed classification, most o...