Fault Detection Using Hints from the Socket Layer

Download homepages.di.fc.ul.pt

This paper describes a fault detection mechanism that uses the error codes returned by stream sockets to locate process failures. Because these errors are generated automatically when a process communicates with a failed process, the mechanism incurs no overhead during failure-free execution. For some types of faults, however, detection is only possible if the surviving processes use certain communication operations. To assess the coverage and latency of the proposed mechanism, faults were injected during the execution of parallel applications. The results show that in most cases faults could be detected using only the errors from the socket layer. Depending on the type of injected fault, detection latency ranged from a few milliseconds to less than 9 minutes.
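
The idea of treating socket-layer errors as failure hints can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names `send_with_hint` and `recv_with_hint` and the set `FAILURE_ERRNOS` are hypothetical, and the choice of errno values is an assumption about which errors plausibly indicate a failed peer.

```python
import errno
import socket

# Errno values treated here as hints that the remote process (or its host)
# has failed. This selection is an assumption made for the sketch, not a
# list taken from the paper.
FAILURE_ERRNOS = {
    errno.ECONNRESET,
    errno.EPIPE,
    errno.ETIMEDOUT,
    errno.ECONNREFUSED,
    errno.EHOSTUNREACH,
}


def send_with_hint(sock, data):
    """Send data; return None on success, or the errno name as a failure hint."""
    try:
        sock.sendall(data)
        return None
    except OSError as exc:
        if exc.errno in FAILURE_ERRNOS:
            return errno.errorcode[exc.errno]
        raise  # unrelated errors are not interpreted as peer failures


def recv_with_hint(sock, nbytes):
    """Receive up to nbytes (assumed > 0); return (data, hint).

    An empty read on a stream socket means the peer closed the connection,
    which is reported as the hint "EOF".
    """
    try:
        data = sock.recv(nbytes)
    except OSError as exc:
        if exc.errno in FAILURE_ERRNOS:
            return b"", errno.errorcode[exc.errno]
        raise
    return data, ("EOF" if data == b"" else None)
```

A surviving process would call these wrappers in place of the raw socket operations and treat any non-`None` hint as a suspicion that the peer has failed. No extra messages are sent, so nothing is added to failure-free runs, but a hint can only appear when the survivor actually performs a send or receive on the affected connection.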

Nuno Neves, W. Kent Fuchs

Certain Communication Operations | Detection | Fault Detection Mechanism | Operating Systems | SRDS 1997

Post Info

Added	26 Aug 2010
Updated	26 Aug 2010
Type	Conference
Year	1997
Where	SRDS
Authors	Nuno Neves, W. Kent Fuchs

Sciweavers
