IPPS
1997
IEEE

A Reliable Hardware Barrier Synchronization Scheme

Barrier synchronization is a crucial operation for parallel systems. Many schemes have been proposed in the literature to achieve fast barrier synchronization through software, hardware, or a combination of these mechanisms. However, few of these schemes emphasize fault-tolerant barrier operations. In this paper, we describe inexpensive support that can be added to network switches for achieving reliable hardware-based barrier synchronization while recovering from lost or corrupted messages. Necessary modifications to the switch architecture and the associated fault-tolerant message-passing protocols are presented. The protocols are optimized for the no-fault case while providing means to detect the failure of any step of the operation and to recover from it. The proposed scheme shows significant potential for use in parallel systems, especially the emerging systems based on networks of workstations.
Added 06 Aug 2010
Updated 06 Aug 2010
Type Conference
Year 1997
Where IPPS
Authors Rajeev Sivaram, Craig B. Stunkel, Dhabaleswar K. Panda