Recent years have witnessed a surge of interest in semi-supervised clustering methods, which aim to cluster a data set under the guidance of some supervisory information. U...
Network security has been a serious concern for many years. For example, firewalls often record thousands of exploit attempts on a daily basis. Network administrators could benefi...
Jian Zhang 0004, Phillip A. Porras, Johannes Ullri...
The Universum data, defined as a collection of "non-examples" that do not belong to any class of interest, have been shown to encode some prior knowledge by representing ...
Dan Zhang, Jingdong Wang, Fei Wang, Changshui Zhan...
Sample selection bias is a common problem in many real-world applications, where training data are obtained under realistic constraints that make them follow a different distribut...
We derive a number of well known deterministic latent variable models such as PCA, ICA, EPCA, NMF and PLSA as variational EM approximations with point posteriors. We show that the...
Max Welling, Chaitanya Chemudugunta, Nathan Sutter
In this paper we explore private computation built on vector addition and its applications in privacy-preserving data mining. Vector addition is a surprisingly general tool for imp...
In this paper we propose and test the use of hierarchical clustering for feature selection. The clustering method is Ward's with a distance measure based on Goodman-Kruskal ta...
Transductive inference on graphs, such as label propagation algorithms, is receiving a lot of attention. In this paper, we address a label propagation problem on multiple networks a...