We investigate operating system noise, which we identify as one of the main reasons for a lack of synchronicity in parallel applications. Using a microbenchmark, we measure the no...
Peter H. Beckman, Kamil Iskra, Kazutomo Yoshii, Su...
The use of several distinct recovery procedures is one of the techniques that can be used to ensure high availability and fault-tolerance of computer systems. This method has been...
Sergiy A. Vilkomir, David Lorge Parnas, Veena B. M...
Key issues to address in autonomic job recovery for cluster computing are recognizing job failure; understanding the failure sufficiently to know if and how to restart the job; an...
Charles Earl, Emilio Remolina, Jim Ong, John Brown...
The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem and interconnec...
Subhash Saini, Robert Ciotti, Brian T. N. Gunney, ...
Commodity symmetric multiprocessors (SMPs), though originally intended for transaction processing, because of their availability, are now used for numerical analysis applications ...