Sciweavers

GRID
2006
Springer
13 years 7 months ago
Operating System Support for Space Allocation in Grid Storage Systems
Abstract: Shared temporary storage space is often the constraining resource for clusters that serve as execution nodes in wide-area distributed systems. At least one large nationa...
Douglas Thain
ICDCS
2012
IEEE
11 years 10 months ago
PREPARE: Predictive Performance Anomaly Prevention for Virtualized Cloud Systems
Abstract: Virtualized cloud systems are prone to performance anomalies due to various reasons such as resource contentions, software bugs, and hardware failures. In this paper, we...
Yongmin Tan, Hiep Nguyen, Zhiming Shen, Xiaohui Gu...
IPPS
2010
IEEE
13 years 5 months ago
Supporting fault tolerance in a data-intensive computing middleware
Over the last 2-3 years, the importance of data-intensive computing has increasingly been recognized, closely coupled with the emergence and popularity of map-reduce for developin...
Tekin Bicer, Wei Jiang, Gagan Agrawal
ICS
2007
Tsinghua U.
14 years 1 month ago
Proactive fault tolerance for HPC with Xen virtualization
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming commonplace. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...
CLOUD
2010
ACM
14 years 21 days ago
A self-organized, fault-tolerant and scalable replication scheme for cloud storage
Failures of any type are common in current datacenters, partly due to the higher scales of the data stored. As data scales up, its availability becomes more complex, while differe...
Nicolas Bonvin, Thanasis G. Papaioannou, Karl Aber...