Sciweavers

» Balanced High Availability in Layered Distributed Computing ...
HPDC
2009
IEEE
14 years 2 months ago
Interconnect agnostic checkpoint/restart in Open MPI
Long-running High Performance Computing (HPC) applications at scale must be able to tolerate inevitable faults if they are to harness current and future HPC systems. Message Passi...
Joshua Hursey, Timothy Mattox, Andrew Lumsdaine
CCGRID
2003
IEEE
14 years 1 month ago
Performability Evaluation of Networked Storage Systems Using N-SPEK
This paper introduces a new benchmark tool for evaluating performance and availability (performability) of networked storage systems, specifically storage area network (SAN) that...
Ming Zhang, Qing Yang, Xubin He
ICDCS
2009
IEEE
14 years 5 months ago
Centaur: A Hybrid Approach for Reliable Policy-Based Routing
In this paper, we consider the design of a policy-based routing system and the role that link state might play. Looking at the problem from a link-state perspective, we propose Ce...
Xin Zhang, Adrian Perrig, Hui Zhang
EUROPAR
2009
Springer
14 years 13 days ago
Capturing and Visualizing Event Flow Graphs of MPI Applications
A high-level understanding of how an application executes and which performance characteristics it exhibits is essential in many areas of high performance computing, such as applic...
Karl Fürlinger, David Skinner
ICDCS
2008
IEEE
14 years 2 months ago
DCAR: Distributed Coding-Aware Routing in Wireless Networks
Recently, there has been a growing interest in using network coding to improve the performance of wireless networks, for example, authors of [1] proposed the practical wireless ...
Jilin Le, John C. S. Lui, Dah-Ming Chiu