Sciweavers

58 search results - page 7 / 12
» A global operating system for HPC clusters
CLUSTER
2006
IEEE
13 years 10 months ago
Improving Communication Performance on InfiniBand by Using Efficient Data Placement Strategies
Despite using high-speed network interconnection systems like InfiniBand, the communication overhead for parallel applications is still high. In this paper we show how such costs...
Robert Rex, Frank Mietke, Wolfgang Rehm, Christoph...
IPPS
2010
IEEE
13 years 4 months ago
Designing high-performance and resilient message passing on InfiniBand
Clusters featuring the InfiniBand interconnect are continuing to scale. As an example, the "Ranger" system at the Texas Advanced Computing Center (TACC) include...
Matthew J. Koop, Pavel Shamis, Ishai Rabinovitz, D...
IPPS
1998
IEEE
13 years 11 months ago
Migration and Rollback Transparency for Arbitrary Distributed Applications in Workstation Clusters
Programmers and users of compute intensive scientific applications often do not want to (or even cannot) code load balancing and fault tolerance into their programs. The PBEAM syst...
Stefan Petri, Matthias Bolz, Horst Langendörf...
RTS
2006
13 years 6 months ago
Combination of clock-state and clock-rate correction in fault-tolerant distributed systems
This paper proposes the integration of internal and external clock synchronization by a combination of a fault-tolerant distributed algorithm for clock state correction with a cent...
Hermann Kopetz, Astrit Ademaj, Alexander Hanzlik
COMSWARE
2007
IEEE
14 years 1 month ago
Software Architecture for Dynamic Thermal Management in Datacenters
Minimizing the energy cost and improving thermal performance of power-limited datacenters, deploying large computing clusters, are the key issues towards optimizing the...
Tridib Mukherjee, Qinghui Tang, Corbett Ziesman, S...