Sciweavers

51 search results - page 7 / 11
» Handling Data Skew in MapReduce
PVLDB
2008
124 views
13 years 9 months ago
Scheduling shared scans of large data files
We study how best to schedule scans of large data files, in the presence of many simultaneous requests to a common set of files. The objective is to maximize the overall rate of p...
Parag Agrawal, Daniel Kifer, Christopher Olston
PVLDB
2008
83 views
13 years 9 months ago
Clustera: an integrated computation and data management system
This paper introduces Clustera, an integrated computation and data management system. In contrast to traditional cluster-management systems that target specific types of workloads,...
David J. DeWitt, Erik Paulson, Eric Robinson, Jeff...
ICDAR
1995
IEEE
14 years 1 month ago
Representation and classification of complex-shaped printed regions using white tiles
There is an increasingly pressing need to develop document analysis methods that are able to cope with images of documents containing printed regions of complex shapes. Contrary t...
Apostolos Antonacopoulos, R. T. Ritchings
ICDM
2009
IEEE
200 views, Data Mining
13 years 7 months ago
Improving SVM Classification on Imbalanced Data Sets in Distance Spaces
Imbalanced data sets present a particular challenge to the data mining community. Often, it is the rare event that is of interest and the cost of misclassifying the rare ...
Suzan Koknar-Tezel, Longin Jan Latecki
ICRA
2010
IEEE
158 views, Robotics
13 years 8 months ago
Coping with imbalanced training data for improved terrain prediction in autonomous outdoor robot navigation
Autonomous robot navigation in unstructured outdoor environments is a challenging and largely unsolved area of active research. The navigation task requires identifying...
Michael J. Procopio, Jane Mulligan, Gregory Z. Gru...