Co-authorship networks, an important type of social networks, have been studied extensively from various angles such as degree distribution analysis, social community extraction a...
We consider the task of performing anomaly detection in highly noisy multivariate data. In many applications involving real-valued time-series data, such as physical sensor data a...
A fundamental task of data analysis is comprehending what distinguishes clusters found within the data. We present the problem of mining distinguishing sets which seeks to find s...
In this paper, we present a framework for categorical data analysis which allows such data sets to be explored using a rich set of techniques that are only applicable to continuou...
Change-point detection is the problem of discovering time points at which properties of time-series data change. This covers a broad range of real-world problems and has been acti...
Data stream analysis frequently relies on identifying correlations and posing conditional queries on the data after it has been seen. Correlated aggregates form an important examp...
Cluster ensembles provide a framework for combining multiple base clusterings of a dataset to generate a stable and robust consensus clustering. There are important variants of th...
Correlation mining has been widely studied due to its ability for discovering the underlying occurrence dependency between objects. However, correlation mining in graph databases ...
Many real-world applications call for learning predictive relationships from multi-modal data. In particular, in multi-media and web applications, given a dataset of images and th...
Given the pairwise affinity relations associated with a set of data items, the goal of a clustering algorithm is to automatically partition the data into a small number of homogen...