The demand for an efficient fault tolerance system has led to the development of complex monitoring infrastructure, which in turn has created an overwhelming task of data and even...
—Developing fault management mechanisms is a difficult task because of the unpredictable nature of failures. In this paper, we present a fault simulation framework for Blue Gene...
Narayan Desai, Ewing L. Lusk, Daniel Buettner, And...
This paper describes a comprehensive prototype of large-scale fault adaptive embedded software developed for the proposed Fermilab BTeV high energy physics experiment. Lightweight...
Derek Messie, Mina Jung, Jae C. Oh, Shweta Shetty,...
High frequency digital LSIs usually consist of many subcircuits coupled with multi-conductor interconnects embedded in the substrate. They sometimes cause serious problems of the ...
Real-time access to accurate and reliable timing information is necessary to profile scientific applications, and crucial as simulations become increasingly complex, adaptive, an...
Dylan Stark, Gabrielle Allen, Tom Goodale, Thomas ...