Sciweavers

» Handling Data Skew in MapReduce
CORR
2010
Springer
13 years 8 months ago
A New Framework for Join Product Skew
Different types of data skewness can result in load imbalance in the context of parallel joins under the shared nothing architecture. We study one important type of skewness, join ...
Foto N. Afrati, Victor Kyritsis, Paraskevas V. Lek...
DMIN
2007
13 years 10 months ago
Cost-Sensitive Learning vs. Sampling: Which is Best for Handling Unbalanced Classes with Unequal Error Costs?
The classifier built from a data set with a highly skewed class distribution generally predicts the more frequently occurring classes much more often than the infrequently occurr...
Gary M. Weiss, Kate McCarthy, Bibi Zabar
DICTA
2009
13 years 9 months ago
Multivariate Skew t Mixture Models: Applications to Fluorescence-Activated Cell Sorting Data
In many applied problems in the context of pattern recognition, the data often involve highly asymmetric observations. Normal mixture models tend to overfit when additional compone...
Kui Wang, Shu-Kay Ng, Geoffrey J. McLachlan
SDM
2007
SIAM
13 years 10 months ago
A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions
In recent years, there have been some interesting studies on predictive modeling in data streams. However, most such studies assume relatively balanced and stable data streams but...
Jing Gao, Wei Fan, Jiawei Han, Philip S. Yu
ICDE
2005
IEEE
14 years 2 months ago
The Versioning System Balancing Data Amount and Access Frequency on Distributed Storage System
In this paper, a method of handling both access frequency skew and data amount skew on a distributed parallel storage system under version management system is discussed. We assum...
Mana Nakano, Dai Kobayashi, Akitsugu Watanabe, Tos...