Sciweavers

» High accuracy failure injection in parallel and distributed ...
HPDC
1998
IEEE
13 years 12 months ago
A Fault Detection Service for Wide Area Distributed Computations
The potential for faults in distributed computing systems is a significant complicating factor for application developers. While a variety of techniques exist for detecting and co...
Paul Stelling, Ian T. Foster, Carl Kesselman, Crai...
ICDCS
2009
IEEE
14 years 4 months ago
FLASH: Fine-Grained Localization in Wireless Sensor Networks Using Acoustic Sound Transmissions and High Precision Clock Synchro
Sensor localization in wireless sensor networks is an important component of many applications. Previous work has demonstrated how localization can be achieved using various metho...
Evangelos Mangas, Angelos Bilas
ICS
2000
Tsinghua U.
13 years 11 months ago
Improving parallel system performance by changing the arrangement of the network links
The Midimew network is an excellent contender for implementing the communication subsystem of a high performance computer. This network is an optimal 2D topology in the sense ther...
Valentin Puente, Cruz Izu, José A. Gregorio...
ICDCS
2007
IEEE
14 years 2 months ago
Fault Tolerance in Multiprocessor Systems Via Application Cloning
Record and Replay (RR) is a software based state replication solution designed to support recording and subsequent replay of the execution of unmodified applications running on mu...
Philippe Bergheaud, Dinesh Subhraveti, Marc Vertes
ICS
1999
Tsinghua U.
13 years 12 months ago
Application scaling under shared virtual memory on a cluster of SMPs
In this paper we examine how application performance scales on a state-of-the-art shared virtual memory (SVM) system on a cluster with 64 processors, comprising 4-way SMPs connect...
Dongming Jiang, Brian O'Kelley, Xiang Yu, Sanjeev ...