Sciweavers

463 search results - page 6 / 93
» Accuracy Estimation With Clustered Dataset
ICDE
2008
IEEE
A General Framework for Fast Co-clustering on Large Datasets Using Matrix Decomposition
Simultaneously clustering the columns and rows (co-clustering) of a large data matrix is an important problem with wide applications, such as document mining, microarray analys...
Feng Pan, Xiang Zhang, Wei Wang 0010
SIGMOD
2008
ACM
CRD: fast co-clustering on large datasets utilizing sampling-based matrix decomposition
The problem of simultaneously clustering columns and rows (co-clustering) arises in important applications, such as text data mining, microarray analysis, and recommendation system...
Feng Pan, Xiang Zhang, Wei Wang 0010
KDD
2009
ACM
Efficiently learning the accuracy of labeling sources for selective sampling
Many scalable data mining tasks rely on active learning to provide the most useful accurately labeled instances. However, what if there are multiple labeling sources (`oracles...
Pinar Donmez, Jaime G. Carbonell, Jeff Schneider
IJHPCA
2007
Scaling Properties of Common Statistical Operators for Gridded Datasets
An accurate cost-model that accounts for dataset size and structure can help optimize geoscience data analysis. We develop and apply a computational model to estimate data analysi...
Charles S. Zender, Harry Mangalam
ALGORITHMICA
2006
CONQUEST: A Coarse-Grained Algorithm for Constructing Summaries of Distributed Discrete Datasets
In this paper we present a coarse-grained parallel algorithm, CONQUEST, for constructing bounded-error summaries of high-dimensional binary-attributed data in a distribute...
Jie Chi, Mehmet Koyutürk, Ananth Grama