In this paper, we describe disparity, a tool that does parallel, scalable anomaly detection for clusters. Disparity uses basic statistical methods and scalable reduction operation...
Power minimization is a serious issue in wireless sensor networks to extend the lifetime and minimize costs. However, in order to gain an accurate understanding of issues regardin...
—Developing fault management mechanisms is a difficult task because of the unpredictable nature of failures. In this paper, we present a fault simulation framework for Blue Gene...
Narayan Desai, Ewing L. Lusk, Daniel Buettner, And...
Event traces are helpful in understanding the performance behavior of message-passing applications since they allow in-depth analyses of communication and synchronization patterns...
Daniel Becker, John C. Linford, Rolf Rabenseifner,...
Chip multiprocessors (CMP) are widely used for high performance computing. Further, these CMPs are being configured in a hierarchical manner to compose a node in a cluster system....
Xingfu Wu, Valerie E. Taylor, Charles W. Lively, S...
The access patterns performed by disk-intensive applications vary widely, from simple contiguous reads or writes through an entire file to completely unpredictable random access....
Dennis Dalessandro, Ananth Devulapalli, Pete Wycko...