— Large parallel computers require techniques to tolerate the potentially large latencies of accessing remote data. Multithreadingis onesuch technique. We extend previous studies...
Henk L. Muller, Paul W. A. Stallard, David H. D. W...
Explicitly Parallel Instruction Computing (EPIC) architectures require the compiler to express program instruction level parallelism directly to the hardware. EPIC techniques whic...
David I. August, Daniel A. Connors, Scott A. Mahlk...
Instruction set extensions (ISEs) can accelerate embedded processor performance. Many algorithms for ISE generation have shown good potential; some of them have recently been expa...
Theo Kluter, Philip Brisk, Paolo Ienne, Edoardo Ch...
The pervasive use of pointers with complicated patterns in C programs often constrains compiler alias analysis to yield conservative register allocation and promotion. Speculative...
We show empirically that some of the issues that affected the design of linear algebra libraries for distributed memory architectures will also likely affect such libraries for s...
Bryan Marker, Field G. Van Zee, Kazushige Goto, Gr...