Sciweavers

334 search results - page 9 / 67
» Fundamentals of Fault-Tolerant Distributed Computing in Asyn...
ET
2007
13 years 7 months ago
Towards Nanoelectronics Processor Architectures
In this paper, we focus on reliability, one of the most fundamental and important challenges, in the nanoelectronics environment. For a processor architecture based on the unreliab...
Wenjing Rao, Alex Orailoglu, Ramesh Karri
ICPADS
1998
IEEE
13 years 11 months ago
Fault Tolerant All-to-All Broadcast in General Interconnection Networks
With respect to scalability and arbitrary topologies of the underlying networks in multiprogramming and multithread environment, fault tolerance in acknowledged ATAB and concurren...
Yuzhong Sun, Paul Y. S. Cheung, Xiaola Lin, Keqin ...
USENIX
1996
13 years 8 months ago
Transparent Fault Tolerance for Parallel Applications on Networks of Workstations
This paper describes a new method for providing transparent fault tolerance for parallel applications on a network of workstations. We have designed our method in the context of sh...
Daniel J. Scales, Monica S. Lam
ISPA
2007
Springer
14 years 1 month ago
Binomial Graph: A Scalable and Fault-Tolerant Logical Network Topology
The number of processors embedded in high performance computing platforms is growing daily to solve larger and more complex problems. The logical network topologies must also suppo...
Thara Angskun, George Bosilca, Jack Dongarra
TPDS
2008
13 years 7 months ago
Algorithm-Based Fault Tolerance for Fail-Stop Failures
Fail-stop failures in distributed environments are often tolerated by checkpointing or message logging. In this paper, we show that fail-stop process failures in ScaLAPACK matrix ...
Zizhong Chen, Jack Dongarra