Sciweavers

482 search results - page 60 / 97
» A large-scale study of failures in high-performance computin...
HPDC
2009
IEEE
15 years 10 months ago
Interconnect agnostic checkpoint/restart in Open MPI
Long running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passi...
Joshua Hursey, Timothy Mattox, Andrew Lumsdaine
SAC
2009
ACM
15 years 8 months ago
Latency-aware leader election
Experimental studies have shown that electing a leader based on measurements of the underlying communication network can be beneficial. We use this approach to study the problem ...
Nuno Santos, Martin Hutle, André Schiper
CISIM
2008
IEEE
15 years 10 months ago
Scheduling in Multiprocessor System Using Genetic Algorithms
Multiprocessors have emerged as a powerful computing means for running real-time applications, especially where a uniprocessor system would not be sufficient enough to execute all...
Keshav P. Dahal, M. Alamgir Hossain, Benny Varghes...
CCGRID
2003
IEEE
15 years 9 months ago
Performability Evaluation of Networked Storage Systems Using N-SPEK
This paper introduces a new benchmark tool for evaluating performance and availability (performability) of networked storage systems, specifically storage area network (SAN) that...
Ming Zhang, Qing Yang, Xubin He
SBACPAD
2008
IEEE
15 years 10 months ago
Measuring Operating System Overhead on CMT Processors
Numerous studies have shown that Operating System (OS) noise is one of the reasons for significant performance degradation in clustered architectures. Although many studies exami...
Petar Radojkovic, Vladimir Cakarevic, Javier Verd&...