We present Nodeinfo, an unsupervised algorithm for anomaly detection in system logs. We demonstrate Nodeinfo’s effectiveness on data from four of the world’s most powerful sup...
In this paper, we propose a new variant of Latent Dirichlet Allocation(LDA): Collective LDA (C-LDA), for multiple corpora modeling. C-LDA combines multiple corpora during learning...
A common task of recommender systems is to improve customer experience through personalized recommendations based on prior implicit feedback. These systems passively track differe...
This paper presents a data oriented approach to modeling the complex computing systems, in which an ensemble of correlation models are discovered to represent the system status. I...
Zero-norm, defined as the number of non-zero elements in a vector, is an ideal quantity for feature selection. However, minimization of zero-norm is generally regarded as a combi...
We study the problem of clustering uncertain objects whose locations are described by probability density functions (pdf). We show that the UK-means algorithm, which generalises t...
Ben Kao, Sau Dan Lee, David W. Cheung, Wai-Shing H...
Since Jim Gray introduced the concept of ”data cube” in 1997, data cube, associated with online analytical processing (OLAP), has become a driving engine in data warehouse ind...
In recent years, co-clustering has emerged as a powerful data mining tool that can analyze dyadic data connecting two entities. However, almost all existing co-clustering techniqu...