Software helps people fulfill their goals, but development tools lack understanding of those goals. But if development tools did understand how software artifacts relate to higher...
Successful software systems continuously change their requirements and thus code. When this happens, some existing tests get broken because they no longer reflect the intended be...
Brett Daniel, Danny Dig, Tihomir Gvero, Vilas Jaga...
Errors in dynamic random access memory (DRAM) are a common form of hardware failure in modern compute clusters. Failures are costly both in terms of hardware replacement costs and...
In this paper, we study distributed failure diagnosis under k-bounded communication delay, where each local site transmits its observations to other sites immediately after each o...
Today’s largest High Performance Computing (HPC) systems exceed one Petaflops (1015 floating point operations per second) and exascale systems are projected within seven years...
James Elliott, Kishor Kharbas, David Fiala, Frank ...