SDM
2011
SIAM
The Network Completion Problem: Inferring Missing Nodes and Edges in Networks
While social and information networks have become ubiquitous, the challenge of collecting complete network data still persists. Many times the collected network data is incomp...
Myunghwan Kim, Jure Leskovec
SDM
2011
SIAM
Clustered low rank approximation of graphs in information science applications
In this paper we present a fast and accurate procedure called clustered low rank matrix approximation for massive graphs. The procedure involves a fast clustering of the graph and...
Berkant Savas, Inderjit S. Dhillon
SDM
2011
SIAM
Data Integration via Constrained Clustering: An Application to Enzyme Clustering
When multiple data sources are available for clustering, an a priori data integration process is usually required. This process may be costly and may not lead to good clusterings,...
Elisa Boari de Lima, Raquel Cardoso de Melo Minard...
SDM
2011
SIAM
Temporal Structure Learning for Clustering Massive Data Streams in Real-Time
This paper describes one of the first attempts to model the temporal structure of massive data streams in real-time using data stream clustering. Recently, many data stream clust...
Michael Hahsler, Margaret H. Dunham
SDM
2011
SIAM
Distributed Monitoring of the R² Statistic for Linear Regression
The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and one or more depe...
Kanishka Bhaduri, Kamalika Das, Chris Giannella
SDM
2011
SIAM
Multi-Instance Mixture Models
Multi-instance (MI) learning is a variant of supervised learning where labeled examples consist of bags (i.e. multi-sets) of feature vectors instead of just a single feature vecto...
James R. Foulds, Padhraic Smyth
SDM
2011
SIAM
A Sequential Dual Method for Structural SVMs
In many real-world prediction problems the output is a structured object such as a sequence, a tree, or a graph. Such problems range from natural language processing to computationa...
Shirish Krishnaj Shevade, Balamurugan P., S. Sunda...
SDM
2011
SIAM
Segmented nestedness in binary data
A binary matrix is fully nested if its columns form a chain of subsets; that is, any two columns are ordered by the subset relation, where we view each column as a subset of the r...
Esa Junttila, Petteri Kaski
SDM
2011
SIAM
Fast Algorithms for Finding Extremal Sets
Identifying the extremal (minimal and maximal) sets from a collection of sets is an important subproblem in the areas of data mining and satisfiability checking. For example, ext...
Roberto J. Bayardo, Biswanath Panda
SDM
2011
SIAM
Exemplar-based Robust Coherent Biclustering
The biclustering, co-clustering, or subspace clustering problem involves simultaneously grouping the rows and columns of a data matrix to uncover biclusters or sub-matrices of the...
Kewei Tu, Xixiu Ouyang, Dingyi Han, Vasant Honavar