Although social and information networks have become ubiquitous, the challenge of collecting complete network data persists. Often the collected network data is incomp...
In this paper, we present a fast and accurate procedure, called clustered low-rank matrix approximation, for massive graphs. The procedure involves a fast clustering of the graph and...
When multiple data sources are available for clustering, an a priori data integration process is usually required. This process may be costly and may not lead to good clusterings,...
Elisa Boari de Lima, Raquel Cardoso de Melo Minard...
This paper describes one of the first attempts to model the temporal structure of massive data streams in real time using data stream clustering. Recently, many data stream clust...
The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and one or more depe...
Multi-instance (MI) learning is a variant of supervised learning where labeled examples consist of bags (i.e., multisets) of feature vectors instead of just a single feature vecto...
In many real-world prediction problems, the output is a structured object such as a sequence, a tree, or a graph. Such problems range from natural language processing to computationa...
Shirish Krishnaj Shevade, Balamurugan P., S. Sunda...
A binary matrix is fully nested if its columns form a chain of subsets; that is, any two columns are ordered by the subset relation, where we view each column as a subset of the r...
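The definition of a fully nested matrix given in the abstract above can be checked directly. The following is a minimal sketch, not the method of the paper: it assumes each column is read as the set of row indices where the column holds a 1, and the helper names `column_supports` and `is_fully_nested` are hypothetical. Sorting the column sets by size and testing consecutive pairs is enough, because the subset relation is transitive.

```python
from typing import FrozenSet, List, Sequence


def column_supports(matrix: Sequence[Sequence[int]]) -> List[FrozenSet[int]]:
    """Return, for each column, the set of row indices holding a nonzero entry."""
    if not matrix:
        return []
    n_cols = len(matrix[0])
    return [
        frozenset(i for i, row in enumerate(matrix) if row[j])
        for j in range(n_cols)
    ]


def is_fully_nested(matrix: Sequence[Sequence[int]]) -> bool:
    """Check whether the columns form a chain under the subset relation."""
    # Sort by cardinality: in a chain, a smaller set must be contained in a larger one.
    supports = sorted(column_supports(matrix), key=len)
    # Consecutive containment implies containment for every pair (transitivity).
    return all(a <= b for a, b in zip(supports, supports[1:]))


nested = [
    [1, 1, 1],
    [0, 1, 1],
    [0, 0, 1],
]
not_nested = [
    [1, 0],
    [0, 1],
]
print(is_fully_nested(nested))
print(is_fully_nested(not_nested))
```

In the first matrix the columns correspond to the row sets {0}, {0, 1}, and {0, 1, 2}, which form a chain. In the second, the columns {0} and {1} are incomparable, so the matrix is not fully nested.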
Identifying the extremal (minimal and maximal) sets from a collection of sets is an important subproblem in the areas of data mining and satisfiability checking. For example, ext...
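To make the notion of extremal sets concrete, the sketch below uses a naive quadratic baseline rather than whatever algorithm the paper proposes. It assumes a set is minimal when no other set in the collection is a proper subset of it, and maximal when no other set is a proper superset. The function name `extremal_sets` is hypothetical, and duplicate sets are collapsed before comparison.

```python
from typing import FrozenSet, Hashable, Iterable, Set, Tuple

SetFamily = Set[FrozenSet[Hashable]]


def extremal_sets(collection: Iterable[Iterable[Hashable]]) -> Tuple[SetFamily, SetFamily]:
    """Return (minimal, maximal) sets of a collection by pairwise comparison."""
    # Deduplicate so that identical sets do not mask each other.
    unique = {frozenset(s) for s in collection}
    # Minimal: no other set in the collection is a proper subset.
    minimal = {s for s in unique if not any(t < s for t in unique)}
    # Maximal: no other set in the collection is a proper superset.
    maximal = {s for s in unique if not any(s < t for t in unique)}
    return minimal, maximal


family = [{1}, {1, 2}, {2, 3}, {1, 2, 3}]
minimal, maximal = extremal_sets(family)
print(sorted(sorted(s) for s in minimal))
print(sorted(sorted(s) for s in maximal))
```

Here {1} and {2, 3} are minimal, and {1, 2, 3} is the only maximal set. The pairwise comparison costs time quadratic in the number of sets, which is why faster methods matter for large collections.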
The biclustering, co-clustering, or subspace clustering problem involves simultaneously grouping the rows and columns of a data matrix to uncover biclusters or sub-matrices of the...