Sciweavers

103 search results - page 9 / 21
» Comparing Massive High-Dimensional Data Sets
DKE
2007
13 years 7 months ago
Regression analysis for massive datasets
In this paper, a two-stage block hypothesis testing procedure, following the idea of Fan, Lin and Cheng (2004), is proposed for regression analysis of massive data. Variable selection criteria ...
Tsai-Hung Fan, Dennis K. J. Lin, Kuang-Fu Cheng
CORR
2008
Springer
13 years 7 months ago
Crowdsourcing, Attention and Productivity
We show through an analysis of a massive data set from YouTube that productivity in crowdsourcing exhibits a strong positive dependence on attention, measured by the...
Bernardo A. Huberman, Daniel M. Romero, Fang Wu
KDD
2002
ACM
14 years 7 months ago
Frequent term-based text clustering
Text clustering methods can be used to structure large sets of text or hypertext documents. The well-known methods of text clustering, however, do not really address the special p...
Florian Beil, Martin Ester, Xiaowei Xu
DASFAA
2004
IEEE
13 years 11 months ago
Efficient Declustering of Non-uniform Multidimensional Data Using Shifted Hilbert Curves
Data declustering speeds up large data set retrieval by partitioning the data across multiple disks or sites and performing retrievals in parallel. Performance is determi...
Hak-Cheol Kim, Mario A. Lopez, Scott T. Leutenegge...
SSDBM
2006
IEEE
14 years 1 month ago
Mining Hierarchies of Correlation Clusters
The detection of correlations between different features in high-dimensional data sets is a very important data mining task. These correlations can be arbitrarily complex: One or...
Elke Achtert, Christian Böhm, Peer Kröge...