In this paper, we study probabilistic modeling of heterogeneously attributed multi-dimensional arrays. The model can manage the heterogeneity by employing an individual exponential...
An idealized clustering algorithm seeks to learn a cluster-adjacency matrix such that, if two data points belong to the same cluster, the corresponding entry would be 1; otherwise ...
The problem of finding outliers in data has broad applications in areas as diverse as data cleaning, fraud detection, network monitoring, invasive species monitoring, etc. While th...
Vit Niennattrakul, Eamonn J. Keogh, Chotirat Ann R...
In this paper, we study a new research problem of causal discovery from streaming features. A unique characteristic of streaming features is that not all features can be available ...
Ensemble clustering has emerged as an important elaboration of the classical clustering problems. Ensemble clustering refers to the situation in which a number of different (input)...
The prevailing approach to evaluating classifiers in the machine learning community involves comparing the performance of several algorithms over a series of usually unrelated data...
A city offers thousands of social events a day, and it is difficult for dwellers to make choices. The combination of mobile phones and recommender systems can change the way one de...
Daniele Quercia, Neal Lathia, Francesco Calabrese,...
Background knowledge is an important factor in privacy preserving data publishing. Probabilistic distribution-based background knowledge is a powerful kind of background knowledge w...
Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Ke Wang, Y...
Citizen scientists, who are volunteers from the community that participate as field assistants in scientific studies [3], enable research to be performed at much larger spatial and...