Sciweavers

PVLDB
2011
13 years 7 months ago
Column-Oriented Storage Techniques for MapReduce
Users of MapReduce often run into performance problems when they scale up their workloads. Many of the problems they encounter can be overcome by applying techniques learned from ...
Avrilia Floratou, Jignesh M. Patel, Eugene J. Shek...
IPPS
2010
IEEE
13 years 10 months ago
BlobSeer: Bringing high throughput under heavy concurrency to Hadoop Map-Reduce applications
Hadoop is a software framework supporting the Map/Reduce programming model. It relies on the Hadoop Distributed File System (HDFS) as its primary storage system. The efficiency of ...
Bogdan Nicolae, Diana Moise, Gabriel Antoniu, Luc ...
CLOUDCOM
2010
Springer
13 years 10 months ago
Attaching Cloud Storage to a Campus Grid Using Parrot, Chirp, and Hadoop
The Hadoop filesystem is a large-scale distributed filesystem used to manage and quickly process extremely large data sets. We want to utilize Hadoop to assist with data-intensive ...
Patrick Donnelly, Peter Bui, Douglas Thain
SIGOPS
2010
13 years 11 months ago
Decoupling storage and computation in Hadoop with SuperDataNodes
The rise of ad-hoc data-intensive computing has led to the development of data-parallel programming systems such as Map/Reduce and Hadoop, which achieve scalability by tightly cou...
George Porter
PVLDB
2010
13 years 11 months ago
The Performance of MapReduce: An In-depth Study
MapReduce has been widely used for large-scale data analysis in the Cloud. The system is well recognized for its elastic scalability and fine-grained fault tolerance although its...
Dawei Jiang, Beng Chin Ooi, Lei Shi, Sai Wu
PVLDB
2010
13 years 11 months ago
Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing)
MapReduce is a computing paradigm that has gained a lot of attention in recent years from industry and research. Unlike parallel DBMSs, MapReduce allows non-expert users to run co...
Jens Dittrich, Jorge-Arnulfo Quiané-Ruiz, A...

SIGCSE
2008
ACM
14 years 12 days ago
Cluster computing for web-scale data processing
In this paper we present the design of a modern course in cluster computing and large-scale data processing. The defining differences between this and previously published designs...
Aaron Kimball, Sierra Michels-Slettvet, Christophe...
HPDC
2010
IEEE
14 years 1 months ago
Improving the Hadoop map/reduce framework to support concurrent appends through the BlobSeer BLOB management system
Hadoop is a reference software framework supporting the Map/Reduce programming model. It relies on the Hadoop Distributed File System (HDFS) as its primary storage system. Althoug...
Diana Moise, Gabriel Antoniu, Luc Bougé
CLOUD
2010
ACM
14 years 5 months ago
Making cloud intermediate data fault-tolerant
Parallel dataflow programs generate enormous amounts of distributed data that are short-lived, yet are critical for completion of the job and for good run-time performance. We ca...
Steven Y. Ko, Imranul Hoque, Brian Cho, Indranil G...