Software or hardware data cache prefetching is an efficient way to hide cache miss latency. However effectiveness of the issued prefetches have to be monitored in order to maximi...
Overlapping communication with computation is an important optimization on current cluster architectures; its importance is likely to increase as the doubling of processing power ...
Wei-Yu Chen, Dan Bonachea, Costin Iancu, Katherine...
Strassen’s matrix multiplication (MM) has benefits with respect to any (highly tuned) implementations of MM because Strassen’s reduces the total number of operations. Strasse...
Sensitivity Analysis (SA) is a novel compiler technique that complements, and integrates with, static automatic parallelization analysis for the cases when relevant program behavi...
Silvius Rus, Maikel Pennings, Lawrence Rauchwerger
Increased complexity of memory systems to ameliorate the gap between the speed of processors and memory has made it increasingly harder for compilers to optimize an arbitrary code...