Sciweavers

1113 search results - page 13 / 223
» Performance under Failures of DAG-based Parallel Computing
ICPP
2007
IEEE
14 years 1 month ago
Fault-Driven Re-Scheduling For Improving System-level Fault Resilience
The productivity of HPC systems is determined not only by their performance, but also by their reliability. The conventional method to limit the impact of failures is checkpointing...
Yawei Li, Prashasta Gujrati, Zhiling Lan, Xian-He ...
DSN
2008
IEEE
14 years 1 month ago
Byzantine replication under attack
Existing Byzantine-resilient replication protocols satisfy two standard correctness criteria, safety and liveness, in the presence of Byzantine faults. In practice, however, fault...
Yair Amir, Brian A. Coan, Jonathan Kirsch, John La...
JSSPP
2001
Springer
13 years 11 months ago
Coscheduling under Memory Constraints in a NOW Environment
Networks of Workstations (NOW) have become important and cost-effective parallel platforms for scientific computations. In practice, a NOW system is heterogeneous and non-dedicat...
Francesc Giné, Francesc Solsona, Porfidio H...
IPPS
2005
IEEE
14 years 29 days ago
Improvement of Power-Performance Efficiency for High-End Computing
Left unchecked, the fundamental drive to increase peak performance using tens of thousands of power hungry components will lead to intolerable operating costs and failure rates. R...
Rong Ge, Xizhou Feng, Kirk W. Cameron
IPPS
2010
IEEE
13 years 4 months ago
Scalable parallel I/O alternatives for massively parallel partitioned solver systems
With the development of high-performance computing, I/O issues have become the bottleneck for many massively parallel applications. This paper investigates scalable paral...
Jing Fu, Ning Liu, Onkar Sahni, Kenneth E. Jansen,...