Although unstructured mesh algorithms are a popular means of solving problems across a broad range of disciplines—from texture mapping to computational fluid dynamics—they ar...
Brian S. White, Sally A. McKee, Bronis R. de Supin...
In an out-of-order issue processor, instructions are dynamically reordered and issued to function units in their dataready order rather than their original program order to achiev...
As parallel jobs get bigger in size and finer in granularity, “system noise” is increasingly becoming a problem. In fact, fine-grained jobs on clusters with thousands of SMP...
Dan Tsafrir, Yoav Etsion, Dror G. Feitelson, Scott...
Excessive power consumption is becoming a major barrier to extracting the maximum performance from high-performance parallel systems. Therefore, techniques oriented towards reduci...
Previous studies have shown that array regrouping and structure splitting significantly improve data locality. The most effective technique relies on profiling every access to eve...
Several message passing-based parallel solvers have been developed for general (non-symmetric) sparse LU factorization with partial pivoting. Due to the fine-grain synchronizatio...
Chip Multiprocessors (CMPs) are flexible, high-frequency platforms on which to support Thread-Level Speculation (TLS). However, for TLS to deliver on its promise, CMPs must explo...
Jose Renau, James Tuck, Wei Liu, Luis Ceze, Karin ...