Sciweavers

281 search results - page 4 / 57
» Implementations of Randomized Sorting on Large Parallel Mach...
SIGMOD
2007
ACM
14 years 7 months ago
Map-reduce-merge: simplified relational data processing on large clusters
Map-Reduce is a programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. Through ...
Hung-chih Yang, Ali Dasdan, Ruey-Lung Hsiao, Dougl...
IPPS
2007
IEEE
14 years 1 month ago
Optimizing Sorting with Machine Learning Algorithms
The growing complexity of modern processors has made the development of highly efficient code increasingly difficult. Manually developing highly efficient code is usually expen...
Xiaoming Li, María Jesús Garzará...
CORR
2010
Springer
13 years 7 months ago
Parallel Sorted Neighborhood Blocking with MapReduce
Cloud infrastructures enable the efficient parallel execution of data-intensive tasks such as entity resolution on large datasets. We investigate challenges and possible solution...
Lars Kolb, Andreas Thor, Erhard Rahm
PODS
2005
ACM
14 years 7 months ago
Lower bounds for sorting with few random accesses to external memory
We consider a scenario where we want to query a large dataset that is stored in external memory and does not fit into main memory. The most constrained resources in such a situati...
Martin Grohe, Nicole Schweikardt
DOLAP
2007
ACM
13 years 11 months ago
Optimal chunking of large multidimensional arrays for data warehousing
ss domain. Using this more abstract approach means that more data sources of varying types can be incorporated with less effort, and such heterogeneous data sources might be very r...
Ekow J. Otoo, Doron Rotem, Sridhar Seshadri