Black-Box Problem Diagnosis in Parallel File Systems

We focus on automatically diagnosing performance problems in parallel file systems by identifying, gathering, and analyzing OS-level, black-box performance metrics on every node in the cluster. Our peer-comparison diagnosis approach compares the statistical attributes of these metrics across I/O servers to identify the faulty node. We develop a root-cause analysis procedure that further analyzes the affected metrics to pinpoint the faulty resource (storage or network), and show that this approach applies across stripe-based parallel file systems. We demonstrate our approach on realistic storage and network problems injected while running three different file-system benchmarks (dd, IOzone, and PostMark) in both PVFS and Lustre clusters.
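The peer-comparison idea can be illustrated with a minimal sketch. The abstract does not say which statistics the authors compare, so the version below is an assumption, not the paper's algorithm: it summarizes each I/O server by the median of one metric and flags the server whose summary lies far from the peer median, measured in median absolute deviations. The server names, the metric (a disk wait time in milliseconds), the function name, and the threshold values are all hypothetical.

```python
from statistics import median


def find_anomalous_server(samples, threshold=3.0, rel_floor=0.05):
    """Return the server that deviates most from its peers, or None.

    samples:   dict of server name -> list of metric readings (hypothetical layout)
    threshold: deviation, in units of the robust spread, that counts as faulty (assumed)
    rel_floor: minimum spread as a fraction of the peer baseline (assumed)
    """
    # Summarize each server by the median of its own readings.
    summaries = {name: median(values) for name, values in samples.items()}

    # Peer baseline: the median of the per-server summaries.
    baseline = median(summaries.values())

    # Robust spread: median absolute deviation (MAD) of the summaries,
    # floored so that near-identical peers do not yield a zero spread.
    mad = median(abs(s - baseline) for s in summaries.values())
    scale = max(mad, rel_floor * abs(baseline), 1e-12)

    # Score each server by its distance from the baseline.
    scores = {name: abs(s - baseline) / scale for name, s in summaries.items()}
    worst = max(scores, key=scores.get)
    return worst if scores[worst] > threshold else None


# Hypothetical disk wait times (ms) sampled on four I/O servers.
readings = {
    "ios0": [5, 6, 5, 7],
    "ios1": [6, 5, 6, 6],
    "ios2": [5, 5, 6, 7],
    "ios3": [40, 55, 48, 60],
}
print(find_anomalous_server(readings))
```

The sketch relies on the premise the abstract states for stripe-based file systems: I/O servers see similar workloads, so a server whose metrics differ sharply from its peers is the likely culprit. Deciding whether the fault lies in storage or network would require inspecting which metrics are affected, which this sketch does not attempt.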

Michael P. Kasick, Jiaqi Tan, Rajeev Gandhi, Priya Narasimhan

Black-box Performance Metrics | FAST 2010 | Operating System | Parallel File Systems | Peer-comparison Diagnosis Approach

» Seeing Through Black Boxes Tracking Transactions through Queues under Monitoring Resource...

» Equivalent Semantic Translation from Parallel DEVS Models to Time Automata

» Gumshoe Diagnosing Performance Problems in Replicated FileSystems

» Panda A System for Provenance and Data

» Integrating COTS Software Components into Dependable Software Architectures

» StrCombo combination of string recognizers

» Preventive Multimaster Replication in a Cluster of Autonomous Databases

» Clusterfile A Flexible Physical Layout Parallel File System

Post Info

Added	02 Oct 2010
Updated	02 Oct 2010
Type	Conference
Year	2010
Where	FAST
Authors	Michael P. Kasick, Jiaqi Tan, Rajeev Gandhi, Priya Narasimhan
